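I think SD card support on Android is a real minefield! Why do I say that? Because Google's own Pixel has no SD card slot. We previously hit a problem where space was not released after deleting files on an SD card using sdcardfs (the Google version). I checked Xiaomi and OnePlus phones and they all had it; Huawei phones did not, because Huawei has its own in-house sdcardfs. Google only fixed it in the framework layer in 9.0. Now, besides using an SD card as portable storage, 9.0 also lets you set it up as Phone Storage. Because the data partition is force-encrypted by default in 9.0, this phone storage is bound to this particular phone, which is effectively like making the eMMC bigger. Put plainly, with a 9.0 system you can buy a phone that ships with 32G, add a 128G+ SD card, and then compare value for money :] But Google only fixed this feature in November this year, and the Qualcomm platform of course has not picked up the fix.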

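OK, on to the actual problem: after tapping Phone Storage the card has to be formatted. The format runs partway through its progress, and then the SD card is not visible in Settings. Did the format fail? Check logcat.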

11-03 15:48:22.557   485   488 D vold    : Disk at 179:64 changed
11-03 15:48:22.558   485   488 V vold    : /system/bin/sgdisk
11-03 15:48:22.558   485   488 V vold    :     --android-dump
11-03 15:48:22.558   485   488 V vold    :     /dev/block/vold/disk:179,64
11-03 15:48:22.615   485   488 V vold    : DISK gpt B435862C-D079-4626-9E4D-73F40042A710
11-03 15:48:22.615   485   488 V vold    :     
11-03 15:48:22.615   485   488 V vold    : PART 1 19A710A2-B3CA-11E4-B026-10604B889DCF 4E9FA227-F812-4540-A451-80C2E6FD63B9 android_meta
11-03 15:48:22.615   485   488 V vold    :     
11-03 15:48:22.615   485   488 V vold    : PART 2 193D1EA4-B3CA-11E4-B075-10604B889DCF BA78EBA0-F6AA-47FE-85BD-506BC96A25E7 android_expand
11-03 15:48:22.615   485   488 V vold    :     
11-03 15:48:22.617   485   488 D vold    : Found key for GUID ba78eba0f6aa47fe85bd506bc96a25e7
11-03 15:48:22.617   485   488 D vold    : Device just partitioned; silently formatting
11-03 15:48:22.619   485   488 E Cryptfs : Cannot remove dm-crypt device
11-03 15:48:22.628   485   488 I Cryptfs : Extra parameters for dm_crypt: 
11-03 15:48:22.629   485   488 I Cryptfs : target_type = crypt 
11-03 15:48:22.629   485   488 I Cryptfs : real_blk_name = /dev/block/vold/private:179,66, extra_params = 
11-03 15:48:22.637   485   488 D vold    : Resolved auto to f2fs
11-03 15:48:22.637   485   488 V vold    : /system/bin/make_f2fs
11-03 15:48:22.637   485   488 V vold    :     -f    
11-03 15:48:22.637   485   488 V vold    :     -d1   
11-03 15:48:22.637   485   488 V vold    :     -O    
11-03 15:48:22.637   485   488 V vold    :     quota 
11-03 15:48:22.637   485   488 V vold    :     -O    
11-03 15:48:22.637   485   488 V vold    :     verity
11-03 15:48:22.638   485   488 V vold    :     /dev/block/dm-3
11-03 15:48:22.644  2964  6447 I ActivityManager: Killing 6074:flipboard.boxer.app/u0a114 (adj 906): empty #17
11-03 15:48:22.657   485   488 E vold    : private:179,66 failed to format: No such device or address //tj: here

The last line shows the format really did fail. OK, check the code. Under vold both PublicVolume.cpp and PrivateVolume.cpp have a doFormat; the one actually taken is PrivateVolume:

status_t PrivateVolume::doFormat(const std::string& fsType) {
    std::string resolvedFsType = fsType;
    if (fsType == "auto") {
        // For now, assume that all MMC devices are flash-based SD cards, and
        // give everyone else ext4 because sysfs rotational isn't reliable.
        if ((major(mRawDevice) == kMajorBlockMmc) && f2fs::IsSupported()) {
            resolvedFsType = "f2fs";
        } else {
            resolvedFsType = "ext4";
        }   
        LOG(DEBUG) << "Resolved auto to " << resolvedFsType;
    }   

    if (resolvedFsType == "ext4") {
        // TODO: change reported mountpoint once we have better selinux support
        if (ext4::Format(mDmDevPath, 0, "/data")) {
            PLOG(ERROR) << getId() << " failed to format";
            return -EIO;
        }   
    } else if (resolvedFsType == "f2fs") {
        if (f2fs::Format(mDmDevPath)) {
            PLOG(ERROR) << getId() << " failed to format"; //tj: here
            return -EIO;
        }   
    } else {
        LOG(ERROR) << getId() << " unsupported filesystem " << fsType;
        return -EINVAL;
    }   

    return OK; 
}

So it really does use f2fs. Let's trace where the call comes from:

void Disk::createPrivateVolume(dev_t device, const std::string& partGuid) {
    std::string normalizedGuid;
    if (NormalizeHex(partGuid, normalizedGuid)) {
        LOG(WARNING) << "Invalid GUID " << partGuid;
        return;
    }   

    std::string keyRaw;
    if (!ReadFileToString(BuildKeyPath(normalizedGuid), &keyRaw)) {
        PLOG(ERROR) << "Failed to load key for GUID " << normalizedGuid;
        return;
    }   

    LOG(DEBUG) << "Found key for GUID " << normalizedGuid;

    auto vol = std::shared_ptr<VolumeBase>(new PrivateVolume(device, keyRaw));
    if (mJustPartitioned) {
        LOG(DEBUG) << "Device just partitioned; silently formatting";
        vol->setSilent(true);
        vol->create();
        vol->format("auto"); //tj: here
        vol->destroy();
        vol->setSilent(false);
    }   

    mVolumes.push_back(vol);
    vol->setDiskId(getId());
    vol->setPartGuid(partGuid);
    vol->create();
}

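OK, one level further up is the volume manager, which this post does not care about. The format failed with "No such device", so let's look at vol->create(), which ends up in PrivateVolume::doCreate():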

status_t PrivateVolume::doCreate() {
    if (CreateDeviceNode(mRawDevPath, mRawDevice)) {
        return -EIO;
    }   
    if (mKeyRaw.size() != cryptfs_get_keysize()) {
      PLOG(ERROR) << getId() << " Raw keysize " << mKeyRaw.size() <<
          " does not match crypt keysize " << cryptfs_get_keysize();
      return -EIO;
    }   

    // Recover from stale vold by tearing down any old mappings
    cryptfs_revert_ext_volume(getId().c_str());

    // TODO: figure out better SELinux labels for private volumes

    unsigned char* key = (unsigned char*) mKeyRaw.data();
    char crypto_blkdev[MAXPATHLEN];
    int res = cryptfs_setup_ext_volume(getId().c_str(), mRawDevPath.c_str(),
            key, crypto_blkdev);
    mDmDevPath = crypto_blkdev;
    if (res != 0) {
        PLOG(ERROR) << getId() << " failed to setup cryptfs";
        return -EIO;
    }   

    return OK; 
}

vold first tears down any existing mapping for this external device (private:179,66):

/*
 * Called by vold when it's asked to unmount an encrypted external
 * storage volume.
 */
int cryptfs_revert_ext_volume(const char* label) {
    return delete_crypto_blk_dev((char*) label);
}

delete_crypto_blk_dev was analyzed in an earlier post; Google forgot to add an error log there. With one added, we see:

E Cryptfs : Cannot remove dm-crypt device private:179,66: No such device or address

The device does not exist, because nothing has created it yet.
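"No such device or address" is the text strerror() gives for ENXIO on Linux, so this particular failure is harmless here. As a rough sketch of how a caller could tell that case apart from a real failure such as EBUSY (ClassifyRevert and RevertResult are names I made up for illustration, they are not part of vold):

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical classification of a dm-crypt teardown result.
// rc  : return value of the teardown call (0 on success, -1 on failure)
// err : errno captured right after the call
enum class RevertResult { kOk, kNothingToRemove, kFailed };

RevertResult ClassifyRevert(int rc, int err) {
    if (rc == 0) return RevertResult::kOk;
    // ENXIO is what Linux reports as "No such device or address":
    // there was no stale mapping, so there is nothing to clean up.
    if (err == ENXIO) return RevertResult::kNothingToRemove;
    // Anything else (for example EBUSY) means an old mapping may still be alive.
    return RevertResult::kFailed;
}
```

With something like this, only kFailed would need to abort the create path; the ENXIO seen in the log would pass through quietly.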

doCreate does not check whether cryptfs_revert_ext_volume succeeded. If the device were busy, wouldn't that cause trouble? Reading on:

/*
 * Called by vold when it's asked to mount an encrypted external
 * storage volume. The incoming partition has no crypto header/footer,
 * as any metadata is been stored in a separate, small partition.  We
 * assume it must be using our same crypt type and keysize.
 *
 * out_crypto_blkdev must be MAXPATHLEN.
 */
int cryptfs_setup_ext_volume(const char* label, const char* real_blkdev,
        const unsigned char* key, char* out_crypto_blkdev) {
    ...
    return create_crypto_blk_dev(&ext_crypt_ftr, key, real_blkdev, out_crypto_blkdev, label, flags);
}
static int create_crypto_blk_dev(struct crypt_mnt_ftr* crypt_ftr, const unsigned char* master_key,
                                 const char* real_blk_name, char* crypto_blk_name, const char* name,
                                 uint32_t flags) {
    ...
    ioctl_init(io, DM_CRYPT_BUF_SIZE, name, 0);
    err = ioctl(fd, DM_DEV_CREATE, io);
    if (err) {
        SLOGE("Cannot create dm-crypt device %s: %s\n", name, strerror(errno));
        goto errout;
    }

    /* Get the device status, in particular, the name of it's device file */
    ioctl_init(io, DM_CRYPT_BUF_SIZE, name, 0);
    if (ioctl(fd, DM_DEV_STATUS, io)) {
        SLOGE("Cannot retrieve dm-crypt device status\n");
        goto errout;
    }
    minor = (io->dev & 0xff) | ((io->dev >> 12) & 0xfff00);
    snprintf(crypto_blk_name, MAXPATHLEN, "/dev/block/dm-%u", minor);

    ...
    load_count = load_crypto_mapping_table(crypt_ftr, master_key, real_blk_name, name, fd,
                                           extra_params);
    ...

    /* Resume this device to activate it */
    ioctl_init(io, DM_CRYPT_BUF_SIZE, name, 0);

    if (ioctl(fd, DM_DEV_SUSPEND, io)) {
        SLOGE("Cannot resume the dm-crypt device\n");
        goto errout;
    }

    /* We made it here with no errors.  Woot! */
    retval = 0;

errout:
  close(fd);   /* If fd is <0 from a failed open call, it's safe to just ignore the close error */

  return retval;
}
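The minor computation above unpacks the kernel's packed device number: the low 8 bits of the minor sit in bits 0-7, the major in bits 8-19, and the remaining minor bits start at bit 20. A small sketch to convince myself the mask and shift are right (EncodeDev, DecodeMajor and DecodeMinor are illustrative helpers, and major 253 is just an example value, since the device-mapper major is assigned dynamically):

```cpp
#include <cassert>
#include <cstdint>

// Pack major/minor in the layout the expression in create_crypto_blk_dev unpacks.
constexpr uint64_t EncodeDev(uint32_t major, uint32_t minor) {
    return (minor & 0xffu) | (static_cast<uint64_t>(major) << 8) |
           (static_cast<uint64_t>(minor & ~0xffu) << 12);
}

// Same expression as in the excerpt: low byte plus the high minor bits.
constexpr uint32_t DecodeMinor(uint64_t dev) {
    return static_cast<uint32_t>((dev & 0xff) | ((dev >> 12) & 0xfff00));
}

// The major occupies the 12 bits in between.
constexpr uint32_t DecodeMajor(uint64_t dev) {
    return static_cast<uint32_t>((dev >> 8) & 0xfff);
}
```

For a minor of 3 this is what turns into the /dev/block/dm-3 path that shows up in the make_f2fs arguments in the first log.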

Strange: no ioctl error shows up, so why is the dm device missing right after create? Could it be a delay problem again? Look at the log:

11-03 15:48:22.629   485   488 I Cryptfs : real_blk_name = /dev/block/vold/private:179,66, extra_params = 
11-03 15:48:22.657   485   488 E vold    : private:179,66 failed to format: No such device or address //tj: here

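A gap of roughly 28 ms is not enough? That is possible. I checked with Qualcomm and they did not know either. Browsing googlesource, I found a commit that looks a lot like this: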

Wait for dm device to be ready before format

It can sometimes take a moment for the dm-device to appear after
creation, causing operations on it such as formatting to fail.
Ensure the device exists before create_crypto_blk_dev returns.

Test: adb sm set-virtual-disk true and format as adoptable.
Bug: 117586466
Change-Id: Id8f571b551f50fc759e78d917e4ac3080e926722
Merged-In: Id8f571b551f50fc759e78d917e4ac3080e926722
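The commit message boils down to "wait until the node exists before anyone touches it". A minimal sketch of that idea, assuming a simple poll with a timeout (WaitForPath is a hypothetical helper written for this post, not the code from the commit):

```cpp
#include <cassert>
#include <chrono>
#include <unistd.h>

// Hypothetical helper: poll until `path` exists or `timeout` elapses.
// Returns true if the path showed up in time, false otherwise.
bool WaitForPath(const char* path, std::chrono::milliseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (true) {
        if (access(path, F_OK) == 0) return true;
        if (std::chrono::steady_clock::now() >= deadline) return false;
        usleep(1000);  // back off for 1 ms before checking again
    }
}
```

Called on the /dev/block/dm-N path right after the device is resumed, a check like this would cover the short window seen in the log, at the cost of a bounded delay when the node never appears.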

OK, after merging it the SD card still does not show up, but format now succeeds:

I make_f2fs: Info: format successful

Told you it is a minefield. Reading further in the log, these errors show up:

11-26 04:13:35.941   510  1686 E cutils  : Failed to chown(/mnt/expand/0f64cdaf-640b-4a01-9432-03a80985bcdf, 0, 0): I/O error
11-26 04:13:35.926     0     0 E goodix_fp soc: fpsensor: Selected 'fpsensor_reset_active'
11-26 04:13:35.931     0     0 W gf      : gf_ioctl, exit
11-26 04:13:35.948     0     0 E         : Quota error (device dm-1): qtree_read_dquot: Can't read quota structure for id 0
11-26 04:13:35.941   510  1686 E vold    : private:179,66 failed to create mount point /mnt/expand/0f64cdaf-640b-4a01-9432-03a80985bcdf: I/O error

OK, so it looks like mount failed. What is this? I/O error? A filesystem problem? First look at the vold code around chown:

status_t PrivateVolume::doMount() {
    if (readMetadata()) {
        LOG(ERROR) << getId() << " failed to read metadata";
        return -EIO;
    }   

    mPath = StringPrintf("/mnt/expand/%s", mFsUuid.c_str());
    setPath(mPath);

    if (PrepareDir(mPath, 0700, AID_ROOT, AID_ROOT)) {
        PLOG(ERROR) << getId() << " failed to create mount point " << mPath; //tj: here
        return -EIO;
    }
    ...
}
status_t PrepareDir(const std::string& path, mode_t mode, uid_t uid, gid_t gid) {
    std::lock_guard<std::mutex> lock(kSecurityLock);
    const char* cpath = path.c_str();

    char* secontext = nullptr;
    if (sehandle) {
        if (!selabel_lookup(sehandle, &secontext, cpath, S_IFDIR)) {
            setfscreatecon(secontext);
        }   
    }   

    int res = fs_prepare_dir(cpath, mode, uid, gid); //tj: here

    if (secontext) {
        setfscreatecon(nullptr);
        freecon(secontext);
    }   

    if (res == 0) {
        return OK; 
    } else {
        return -errno;
    }   
}
int fs_prepare_dir(const char* path, mode_t mode, uid_t uid, gid_t gid) {
    return fs_prepare_path_impl(path, mode, uid, gid, /*allow_fixup*/ 1, /*prepare_as_dir*/ 1); 
}
static int fs_prepare_path_impl(const char* path, mode_t mode, uid_t uid, gid_t gid,
        int allow_fixup, int prepare_as_dir) {
    ...
fixup:
    if (TEMP_FAILURE_RETRY(chmod(path, mode)) == -1) {
        ALOGE("Failed to chmod(%s, %d): %s", path, mode, strerror(errno));
        return -1; 
    }   
    if (TEMP_FAILURE_RETRY(chown(path, uid, gid)) == -1) {
        ALOGE("Failed to chown(%s, %d, %d): %s", path, uid, gid, strerror(errno)); //tj: here
        return -1; 
    }   

    return 0;
}

The chown finally lands in system/core/libcutils/fs.cpp, which clearly points at the filesystem. Running chown by hand worked fine, but chgrp failed like this:

chgrp: 'media/' to '(null):root': I/O error by cmd
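Since chown worked by hand and chgrp did not, it helps to try the owner change and the group change separately instead of in one chown(path, uid, gid) call. A rough probe, assuming it is run against a path on the mounted volume (ProbeOwnershipOps and ProbeResult are made-up names for this sketch):

```cpp
#include <cassert>
#include <cerrno>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// errno of each step, or 0 if that step succeeded.
struct ProbeResult {
    int chmod_errno;
    int chown_uid_errno;
    int chown_gid_errno;
};

// Hypothetical diagnostic: run the same operations as the fixup path,
// but change owner and group in separate calls to see which one fails.
ProbeResult ProbeOwnershipOps(const char* path, mode_t mode, uid_t uid, gid_t gid) {
    ProbeResult r{0, 0, 0};
    if (chmod(path, mode) == -1) r.chmod_errno = errno;
    // Passing -1 leaves that id unchanged.
    if (chown(path, uid, static_cast<gid_t>(-1)) == -1) r.chown_uid_errno = errno;
    if (chown(path, static_cast<uid_t>(-1), gid) == -1) r.chown_gid_errno = errno;
    return r;
}
```

If only chown_gid_errno comes back as EIO, the failure is tied to the group change, which would match what the manual chown/chgrp test showed.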

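It looks as if the device cannot be found. vold's private volume also supports ext4, so let's use ext4; sure enough, the problem goes away. f2fs on the MSM 4.9 kernel still needs investigating.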
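One way to picture the ext4 workaround is as an extra switch in the "auto" resolution from doFormat. This is only a standalone sketch: ResolveFsType and the prefer_ext4 flag are hypothetical and do not exist in vold, and 179 is the MMC block major that also appears in the log as 179:64.

```cpp
#include <cassert>
#include <string>

// Major number of MMC block devices (the card shows up as 179:64 in the log).
constexpr unsigned int kMajorBlockMmc = 179;

// Hypothetical stand-in for the "auto" branch of PrivateVolume::doFormat().
// prefer_ext4 is an illustrative knob, not an existing vold option.
std::string ResolveFsType(const std::string& requested, unsigned int dev_major,
                          bool f2fs_supported, bool prefer_ext4) {
    if (requested != "auto") return requested;  // explicit choice wins
    if (prefer_ext4) return "ext4";             // workaround: skip f2fs entirely
    if (dev_major == kMajorBlockMmc && f2fs_supported) return "f2fs";
    return "ext4";
}
```

With prefer_ext4 set, an SD card on an f2fs-capable kernel resolves to ext4 instead of f2fs, which is the behavior I ended up with.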

btw: Settings has trouble recognizing the size of cards of 64G and above; cards below 64G are fine. Handed that one off to the app team :]