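To me, SD card support on Android is one big trap. Why? For a start, Google's own Pixel phones don't even have an SD card slot. We already hit a bug where space was not released after deleting files on an SD card mounted through sdcardfs (Google's version). The Xiaomi and OnePlus phones I checked had the same problem; Huawei phones did not, because Huawei has its own in-house sdcardfs. Google only fixed it in the framework layer in 9.0.

On 9.0 an SD card can be set up not only as portable storage but also as Phone Storage (adoptable storage, a feature that dates back to Android 6.0). Because the data partition is force-encrypted by default on 9.0, a Phone Storage card is bound to that one phone, and it effectively makes the eMMC bigger. Put bluntly, you can buy a 9.0 phone that ships with 32G, add a 128G+ SD card, and win on price/performance :] But Google only fixed this feature in November this year, and the Qualcomm platform of course has not picked up the fix yet.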

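OK, on to the actual problem: after tapping Phone Storage the card has to be formatted. The format runs partway, and then the SD card is no longer visible in Settings. Did the format fail? Check logcat.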

11-03 15:48:22.557   485   488 D vold    : Disk at 179:64 changed
11-03 15:48:22.558 485 488 V vold : /system/bin/sgdisk
11-03 15:48:22.558 485 488 V vold : --android-dump
11-03 15:48:22.558 485 488 V vold : /dev/block/vold/disk:179,64
11-03 15:48:22.615 485 488 V vold : DISK gpt B435862C-D079-4626-9E4D-73F40042A710
11-03 15:48:22.615 485 488 V vold :
11-03 15:48:22.615 485 488 V vold : PART 1 19A710A2-B3CA-11E4-B026-10604B889DCF 4E9FA227-F812-4540-A451-80C2E6FD63B9 android_meta
11-03 15:48:22.615 485 488 V vold :
11-03 15:48:22.615 485 488 V vold : PART 2 193D1EA4-B3CA-11E4-B075-10604B889DCF BA78EBA0-F6AA-47FE-85BD-506BC96A25E7 android_expand
11-03 15:48:22.615 485 488 V vold :
11-03 15:48:22.617 485 488 D vold : Found key for GUID ba78eba0f6aa47fe85bd506bc96a25e7
11-03 15:48:22.617 485 488 D vold : Device just partitioned; silently formatting
11-03 15:48:22.619 485 488 E Cryptfs : Cannot remove dm-crypt device
11-03 15:48:22.628 485 488 I Cryptfs : Extra parameters for dm_crypt:
11-03 15:48:22.629 485 488 I Cryptfs : target_type = crypt
11-03 15:48:22.629 485 488 I Cryptfs : real_blk_name = /dev/block/vold/private:179,66, extra_params =
11-03 15:48:22.637 485 488 D vold : Resolved auto to f2fs
11-03 15:48:22.637 485 488 V vold : /system/bin/make_f2fs
11-03 15:48:22.637 485 488 V vold : -f
11-03 15:48:22.637 485 488 V vold : -d1
11-03 15:48:22.637 485 488 V vold : -O
11-03 15:48:22.637 485 488 V vold : quota
11-03 15:48:22.637 485 488 V vold : -O
11-03 15:48:22.637 485 488 V vold : verity
11-03 15:48:22.638 485 488 V vold : /dev/block/dm-3
11-03 15:48:22.644 2964 6447 I ActivityManager: Killing 6074:flipboard.boxer.app/u0a114 (adj 906): empty #17
11-03 15:48:22.657 485 488 E vold : private:179,66 failed to format: No such device or address //tj: here

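The last line confirms that the format really did fail. OK, check the code: under vold both PublicVolume.cpp and PrivateVolume.cpp implement doFormat, and the path actually taken here is PrivateVolume: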

status_t PrivateVolume::doFormat(const std::string& fsType) {
    std::string resolvedFsType = fsType;
    if (fsType == "auto") {
        // For now, assume that all MMC devices are flash-based SD cards, and
        // give everyone else ext4 because sysfs rotational isn't reliable.
        if ((major(mRawDevice) == kMajorBlockMmc) && f2fs::IsSupported()) {
            resolvedFsType = "f2fs";
        } else {
            resolvedFsType = "ext4";
        }
        LOG(DEBUG) << "Resolved auto to " << resolvedFsType;
    }

    if (resolvedFsType == "ext4") {
        // TODO: change reported mountpoint once we have better selinux support
        if (ext4::Format(mDmDevPath, 0, "/data")) {
            PLOG(ERROR) << getId() << " failed to format";
            return -EIO;
        }
    } else if (resolvedFsType == "f2fs") {
        if (f2fs::Format(mDmDevPath)) {
            PLOG(ERROR) << getId() << " failed to format"; //tj: here
            return -EIO;
        }
    } else {
        LOG(ERROR) << getId() << " unsupported filesystem " << fsType;
        return -EINVAL;
    }

    return OK;
}
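
The "auto" branch above is small enough to pull out and poke at on a desktop. The following is only a sketch under assumptions: `ResolveFsType` and `kAssumedMajorBlockMmc` are names made up for illustration, the value 179 is assumed from the `179:64` disk in the log, and the `f2fs_supported` flag stands in for `f2fs::IsSupported()`.

```cpp
#include <cassert>
#include <string>
#include <sys/sysmacros.h>  // major(), makedev() on Linux
#include <sys/types.h>

// Assumption: 179 is the MMC block major, as suggested by "Disk at 179:64" in the log.
constexpr unsigned int kAssumedMajorBlockMmc = 179;

// Hypothetical stand-alone version of the "auto" resolution in doFormat().
// f2fs_supported plays the role of f2fs::IsSupported().
std::string ResolveFsType(const std::string& requested, dev_t raw_device, bool f2fs_supported) {
    assert(!requested.empty());
    if (requested != "auto") {
        return requested;  // an explicit choice is passed through untouched
    }
    if (major(raw_device) == kAssumedMajorBlockMmc && f2fs_supported) {
        return "f2fs";  // MMC device and the kernel offers f2fs
    }
    return "ext4";  // everything else falls back to ext4
}
```

With these assumptions, the partition from the log (179,66) resolves to f2fs whenever f2fs is reported as supported, which matches the "Resolved auto to f2fs" line.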

So it really does pick f2fs. Let's trace where the call comes from:

void Disk::createPrivateVolume(dev_t device, const std::string& partGuid) {
    std::string normalizedGuid;
    if (NormalizeHex(partGuid, normalizedGuid)) {
        LOG(WARNING) << "Invalid GUID " << partGuid;
        return;
    }

    std::string keyRaw;
    if (!ReadFileToString(BuildKeyPath(normalizedGuid), &keyRaw)) {
        PLOG(ERROR) << "Failed to load key for GUID " << normalizedGuid;
        return;
    }

    LOG(DEBUG) << "Found key for GUID " << normalizedGuid;

    auto vol = std::shared_ptr<VolumeBase>(new PrivateVolume(device, keyRaw));
    if (mJustPartitioned) {
        LOG(DEBUG) << "Device just partitioned; silently formatting";
        vol->setSilent(true);
        vol->create();
        vol->format("auto"); //tj: here
        vol->destroy();
        vol->setSilent(false);
    }

    mVolumes.push_back(vol);
    vol->setDiskId(getId());
    vol->setPartGuid(partGuid);
    vol->create();
}
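
The log shows the partition GUID `BA78EBA0-F6AA-47FE-85BD-506BC96A25E7` turning into the key lookup name `ba78eba0f6aa47fe85bd506bc96a25e7`. Here is a minimal sketch that reproduces just that visible effect. `NormalizeGuidSketch` is a hypothetical helper, not vold's `NormalizeHex`, whose internals are not shown in this post. Note also that it returns true on success, while the call above treats a non-zero result as failure.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: drops dashes, lower-cases hex digits, rejects anything else.
// Returns true only if the result is 32 hex digits (a 128-bit GUID).
bool NormalizeGuidSketch(const std::string& in, std::string* out) {
    assert(out != nullptr);
    out->clear();
    for (unsigned char c : in) {
        if (c == '-') {
            continue;  // separators carry no information
        }
        if (!std::isxdigit(c)) {
            return false;  // not a hex digit, so not a GUID
        }
        out->push_back(static_cast<char>(std::tolower(c)));
    }
    return out->size() == 32;
}
```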

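OK, one level further up is the volume manager, which this post does not cover. The format failed with "No such device", so look at what vol->create() does, which ends up in doCreate():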

status_t PrivateVolume::doCreate() {
    if (CreateDeviceNode(mRawDevPath, mRawDevice)) {
        return -EIO;
    }
    if (mKeyRaw.size() != cryptfs_get_keysize()) {
        PLOG(ERROR) << getId() << " Raw keysize " << mKeyRaw.size() <<
            " does not match crypt keysize " << cryptfs_get_keysize();
        return -EIO;
    }

    // Recover from stale vold by tearing down any old mappings
    cryptfs_revert_ext_volume(getId().c_str());

    // TODO: figure out better SELinux labels for private volumes

    unsigned char* key = (unsigned char*) mKeyRaw.data();
    char crypto_blkdev[MAXPATHLEN];
    int res = cryptfs_setup_ext_volume(getId().c_str(), mRawDevPath.c_str(),
            key, crypto_blkdev);
    mDmDevPath = crypto_blkdev;
    if (res != 0) {
        PLOG(ERROR) << getId() << " failed to setup cryptfs";
        return -EIO;
    }

    return OK;
}

vold first tears down any existing mapping for this external device (private:179,66):

/*
 * Called by vold when it's asked to unmount an encrypted external
 * storage volume.
 */
int cryptfs_revert_ext_volume(const char* label) {
    return delete_crypto_blk_dev((char*) label);
}

delete_crypto_blk_dev was covered in an earlier post. Google forgot to log the error details there; with them added we see:

E Cryptfs : Cannot remove dm-crypt device private:179,66: No such device or address

The device does not exist because nothing has created it yet.
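
"No such device or address" is the Linux error string for ENXIO, so this particular failure just means there was nothing to tear down. A caller that wanted to tell that harmless case apart from a mapping that is still in use could classify the errno roughly like this. This is a sketch only: `TeardownResult` and `ClassifyTeardown` are hypothetical names, and vold does not do this.

```cpp
#include <cassert>
#include <cerrno>

enum class TeardownResult { kRemoved, kNothingToRemove, kStillInUse, kOtherError };

// rc:  return value of a (hypothetical) teardown call, 0 meaning success
// err: errno captured immediately after that call
TeardownResult ClassifyTeardown(int rc, int err) {
    if (rc == 0) {
        return TeardownResult::kRemoved;
    }
    assert(err != 0);  // a failed call is expected to have set errno
    switch (err) {
        case ENXIO:
            return TeardownResult::kNothingToRemove;  // no such mapping: fine before first create
        case EBUSY:
            return TeardownResult::kStillInUse;       // mapping still open or mounted somewhere
        default:
            return TeardownResult::kOtherError;
    }
}
```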

doCreate does not check whether cryptfs_revert_ext_volume succeeded. If the device were busy, wouldn't that cause trouble? Read on:

/*
 * Called by vold when it's asked to mount an encrypted external
 * storage volume. The incoming partition has no crypto header/footer,
 * as any metadata is been stored in a separate, small partition. We
 * assume it must be using our same crypt type and keysize.
 *
 * out_crypto_blkdev must be MAXPATHLEN.
 */
int cryptfs_setup_ext_volume(const char* label, const char* real_blkdev,
        const unsigned char* key, char* out_crypto_blkdev) {
    ...
    return create_crypto_blk_dev(&ext_crypt_ftr, key, real_blkdev, out_crypto_blkdev, label, flags);
}

static int create_crypto_blk_dev(struct crypt_mnt_ftr* crypt_ftr, const unsigned char* master_key,
        const char* real_blk_name, char* crypto_blk_name, const char* name,
        uint32_t flags) {
    ...
    ioctl_init(io, DM_CRYPT_BUF_SIZE, name, 0);
    err = ioctl(fd, DM_DEV_CREATE, io);
    if (err) {
        SLOGE("Cannot create dm-crypt device %s: %s\n", name, strerror(errno));
        goto errout;
    }

    /* Get the device status, in particular, the name of it's device file */
    ioctl_init(io, DM_CRYPT_BUF_SIZE, name, 0);
    if (ioctl(fd, DM_DEV_STATUS, io)) {
        SLOGE("Cannot retrieve dm-crypt device status\n");
        goto errout;
    }
    minor = (io->dev & 0xff) | ((io->dev >> 12) & 0xfff00);
    snprintf(crypto_blk_name, MAXPATHLEN, "/dev/block/dm-%u", minor);

    ...
    load_count = load_crypto_mapping_table(crypt_ftr, master_key, real_blk_name, name, fd,
            extra_params);
    ...

    /* Resume this device to activate it */
    ioctl_init(io, DM_CRYPT_BUF_SIZE, name, 0);

    if (ioctl(fd, DM_DEV_SUSPEND, io)) {
        SLOGE("Cannot resume the dm-crypt device\n");
        goto errout;
    }

    /* We made it here with no errors. Woot! */
    retval = 0;

errout:
    close(fd); /* If fd is <0 from a failed open call, it's safe to just ignore the close error */

    return retval;
}
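
The one line in there that is not self-explanatory is the minor-number extraction that produces `/dev/block/dm-3`. It decodes the kernel's 32-bit device encoding, where the low 8 bits of the minor sit in bits 0..7, the major in bits 8..19, and the rest of the minor in bits 20..31. Below is a small sketch that makes the round trip visible. The helper names are made up for illustration, and the major 253 used in the usage note is just an example value.

```cpp
#include <cassert>
#include <cstdint>

// Pack major/minor in the layout the expression in create_crypto_blk_dev() decodes.
inline uint32_t EncodeDev(uint32_t major_no, uint32_t minor_no) {
    assert(major_no < (1u << 12));  // 12 bits of major
    assert(minor_no < (1u << 20));  // 20 bits of minor
    return (minor_no & 0xff) | (major_no << 8) | ((minor_no & ~0xffu) << 12);
}

// Same expression as in the source: low 8 bits plus the high bits shifted back down.
inline uint32_t DecodeMinor(uint32_t dev) {
    return (dev & 0xff) | ((dev >> 12) & 0xfff00);
}

inline uint32_t DecodeMajor(uint32_t dev) {
    return (dev & 0xfff00) >> 8;
}
```

For instance, `DecodeMinor(EncodeDev(253, 3))` gives back 3, which is the number that ends up in the `dm-3` path from the log.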

Strange: none of the ioctls reports an error, so why is the dm device missing right after create? Could it be a delay problem again? Look at the log:

11-03 15:48:22.629   485   488 I Cryptfs : real_blk_name = /dev/block/vold/private:179,66, extra_params = 
11-03 15:48:22.657 485 488 E vold : private:179,66 failed to format: No such device or address //tj: here

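The two lines are less than 30 ms apart. Is that not enough time? Quite possibly. I checked with Qualcomm and they did not know either. Then I browsed googlesource and found a commit that looks very much like this problem: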

Wait for dm device to be ready before format

It can sometimes take a moment for the dm-device to appear after
creation, causing operations on it such as formatting to fail.
Ensure the device exists before create_crypto_blk_dev returns.

Test: adb sm set-virtual-disk true and format as adoptable.
Bug: 117586466
Change-Id: Id8f571b551f50fc759e78d917e4ac3080e926722
Merged-In: Id8f571b551f50fc759e78d917e4ac3080e926722
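
I have not pasted the patch itself here, but the idea in the commit message (do not return until the device node exists) can be sketched as a bounded poll. Everything below is an assumption for illustration: `WaitForPathSketch`, the 10 ms poll interval, and the use of `access()` are my own choices, not the code from the commit.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <thread>
#include <unistd.h>  // access()

// Hypothetical helper: poll until `path` exists or `timeout` has elapsed.
// Returns true if the path showed up in time, false otherwise.
bool WaitForPathSketch(const std::string& path, std::chrono::milliseconds timeout,
                       std::chrono::milliseconds poll = std::chrono::milliseconds(10)) {
    assert(poll.count() > 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (access(path.c_str(), F_OK) == 0) {
            return true;  // node is there, safe to hand it to the formatter
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // give up and let the caller report the failure
        }
        std::this_thread::sleep_for(poll);
    }
}
```

In the flow above, such a wait would sit after the resume ioctl and before create_crypto_blk_dev returns, with crypto_blk_name as the path.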

OK, with that change merged the SD card still does not show up, but the format now succeeds:

I make_f2fs: Info: format successful

See what I mean by a trap? Reading further through the log turns up these errors:

11-26 04:13:35.941   510  1686 E cutils  : Failed to chown(/mnt/expand/0f64cdaf-640b-4a01-9432-03a80985bcdf, 0, 0): I/O error
11-26 04:13:35.926 0 0 E goodix_fp soc: fpsensor: Selected 'fpsensor_reset_active'
11-26 04:13:35.931 0 0 W gf : gf_ioctl, exit
11-26 04:13:35.948 0 0 E : Quota error (device dm-1): qtree_read_dquot: Can't read quota structure for id 0
11-26 04:13:35.941 510 1686 E vold : private:179,66 failed to create mount point /mnt/expand/0f64cdaf-640b-4a01-9432-03a80985bcdf: I/O error

OK, so it looks like the mount failed. What is this, an I/O error? A filesystem problem? First look at the chown-related code in vold:

status_t PrivateVolume::doMount() {
    if (readMetadata()) {
        LOG(ERROR) << getId() << " failed to read metadata";
        return -EIO;
    }

    mPath = StringPrintf("/mnt/expand/%s", mFsUuid.c_str());
    setPath(mPath);

    if (PrepareDir(mPath, 0700, AID_ROOT, AID_ROOT)) {
        PLOG(ERROR) << getId() << " failed to create mount point " << mPath; //tj: here
        return -EIO;
    }
    ...
}

status_t PrepareDir(const std::string& path, mode_t mode, uid_t uid, gid_t gid) {
    std::lock_guard<std::mutex> lock(kSecurityLock);
    const char* cpath = path.c_str();

    char* secontext = nullptr;
    if (sehandle) {
        if (!selabel_lookup(sehandle, &secontext, cpath, S_IFDIR)) {
            setfscreatecon(secontext);
        }
    }

    int res = fs_prepare_dir(cpath, mode, uid, gid); //tj: here

    if (secontext) {
        setfscreatecon(nullptr);
        freecon(secontext);
    }

    if (res == 0) {
        return OK;
    } else {
        return -errno;
    }
}

int fs_prepare_dir(const char* path, mode_t mode, uid_t uid, gid_t gid) {
    return fs_prepare_path_impl(path, mode, uid, gid, /*allow_fixup*/ 1, /*prepare_as_dir*/ 1);
}

static int fs_prepare_path_impl(const char* path, mode_t mode, uid_t uid, gid_t gid,
        int allow_fixup, int prepare_as_dir) {
    ...
fixup:
    if (TEMP_FAILURE_RETRY(chmod(path, mode)) == -1) {
        ALOGE("Failed to chmod(%s, %d): %s", path, mode, strerror(errno));
        return -1;
    }
    if (TEMP_FAILURE_RETRY(chown(path, uid, gid)) == -1) {
        ALOGE("Failed to chown(%s, %d, %d): %s", path, uid, gid, strerror(errno)); //tj: here
        return -1;
    }

    return 0;
}
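
When narrowing down a failure like this it helps to run the two fixup calls separately and keep the errno of whichever one fails. The probe below is a minimal sketch for that purpose. `FixupResult` and `ProbeFixup` are hypothetical names, and unlike the libcutils code it does not retry on EINTR.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/stat.h>   // chmod(), mkdir()
#include <sys/types.h>
#include <unistd.h>     // chown(), getuid(), getgid(), rmdir()

struct FixupResult {
    std::string failed_step;  // empty when both calls succeeded
    int saved_errno;          // errno of the failing call, 0 on success
};

// Hypothetical probe: repeat the chmod/chown pair from the fixup path and
// report which call failed and with which errno.
FixupResult ProbeFixup(const std::string& path, mode_t mode, uid_t uid, gid_t gid) {
    assert(!path.empty());
    if (chmod(path.c_str(), mode) == -1) {
        return FixupResult{"chmod", errno};
    }
    if (chown(path.c_str(), uid, gid) == -1) {
        return FixupResult{"chown", errno};
    }
    return FixupResult{"", 0};
}
```

Run as root against the mount point, an EIO in `saved_errno` for the "chown" step would match the `Failed to chown(...): I/O error` line in the log.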

The chown ends up in system/core/libcutils/fs.cpp, which clearly points at the filesystem. Running chown by hand worked fine, but chgrp failed like this:

chgrp: 'media/' to '(null):root': I/O error by cmd

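It looks as if the device cannot be found. vold's private volume also supports ext4, so let's use ext4 instead, and sure enough the problem is gone. f2fs on the MSM 4.9 kernel still needs to be investigated.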

btw: Settings gets the size wrong for cards of 64G and above, while smaller cards are fine. Handed that one over to the app team :]