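An earlier post already analyzed an abnormal boot into recovery: the main cause was a persistent process crashing too many times, which pushed the device into recovery mode with the message "Can't load Android system".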

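Now the same recovery screen has appeared again, this time during encryption on the first boot after flashing a build. Surprisingly, the offline logcat contains no "FACTORY_RESET" record. Could this be related to encryption?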

Search OpenGrok for prompt_and_wipe_data:

/bootable/recovery/recovery.cpp
    96    { "prompt_and_wipe_data", no_argument, NULL, 0 },
    149   * --prompt_and_wipe_data - prompt the user that data is corrupt,
    764   static bool prompt_and_wipe_data(Device* device) {
    1486  } else if (option == "prompt_and_wipe_data") {
    1631  if (!prompt_and_wipe_data(device)) {
/system/core/init/builtins.cpp
    263   "--prompt_and_wipe_data",
/frameworks/base/core/java/android/os/RecoverySystem.java
    902   bootCommand(context, null, "--prompt_and_wipe_data", reasonArg, localeArg);

Besides the Java caller, system/core also calls it, and in fact from two places:

./core/init/builtins.cpp:291:                {"--prompt_and_wipe_data", "--reason=set_policy_failed:"s + args[1]});
./core/init/builtins.cpp:1004:                reboot_into_recovery({"--prompt_and_wipe_data", "--reason="s + reboot_reason});

Here is the code:

            if (e4crypt_is_native()) {
                LOG(ERROR) << "Rebooting into recovery, reason: " << reboot_reason;
                reboot_into_recovery({"--prompt_and_wipe_data", "--reason="s + reboot_reason});
...
    if (e4crypt_is_native()) {
        if (e4crypt_set_directory_policy(args[1].c_str())) {
            return reboot_into_recovery(
                {"--prompt_and_wipe_data", "--reason=set_policy_failed:"s + args[1]});
        }
    }
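
Both call sites pass the cause along as a "--reason=..." option, so the reason string is the key piece of evidence to recover later. The following is a minimal sketch of that round trip using plain string handling only. The helpers build_recovery_options and extract_reason are hypothetical names introduced for illustration; they are not part of init or recovery, and recovery's real option parsing is not reproduced here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds the option list in the same shape that the
// builtins.cpp call sites pass to reboot_into_recovery().
std::vector<std::string> build_recovery_options(const std::string& reason) {
    return {"--prompt_and_wipe_data", "--reason=" + reason};
}

// Hypothetical helper: returns the value of the first "--reason=" option,
// or an empty string when no such option is present.
std::string extract_reason(const std::vector<std::string>& options) {
    const std::string prefix = "--reason=";
    for (const std::string& opt : options) {
        if (opt.compare(0, prefix.size(), prefix) == 0) {
            return opt.substr(prefix.size());
        }
    }
    return "";
}
```

An empty result from extract_reason means the reboot request carried no reason at all, which is worth distinguishing from a reason that was recorded but lost with the log.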

The log contains no "Rebooting into recovery, reason: " line, so let's check e4crypt_is_native:

bool e4crypt_is_native() {
    char value[PROPERTY_VALUE_MAX];
    property_get("ro.crypto.type", value, "none");
    return !strcmp(value, "file");
}

These paths only trigger when the system property ro.crypto.type is file. Check the value on our project:

xxx:/ # getprop ro.crypto.type
block
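
To make the outcome of that check concrete, here is a small host-side sketch of the same comparison. It assumes a std::map as a stand-in for the Android property store; PropertyMap, get_property and is_native_file_encryption are hypothetical names used only for this illustration, not the platform API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the system property store.
using PropertyMap = std::map<std::string, std::string>;

// Returns the stored value, or default_value when the key is not set,
// mirroring the default argument used in the original check.
std::string get_property(const PropertyMap& props, const std::string& key,
                         const std::string& default_value) {
    auto it = props.find(key);
    return it == props.end() ? default_value : it->second;
}

// Same decision as e4crypt_is_native(): true only for the value "file".
bool is_native_file_encryption(const PropertyMap& props) {
    return get_property(props, "ro.crypto.type", "none") == "file";
}
```

With ro.crypto.type set to block, as on this device, the function returns false, so neither of the two builtins.cpp paths can be what requested recovery.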

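Clearly it is not, so what sent the device into recovery? One clue: the recovery log records the reason, and that log is saved to the cache partition. Look at bootable/recovery/recovery.cpp: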

int main(int argc, char **argv) {
...
  printf("reason is [%s]\n", reason);
...
}

static void copy_logs() {
    // Always write to pmsg, this allows the OTA logs to be caught in logcat -L
    copy_log_file_to_pmsg(TEMPORARY_LOG_FILE, LAST_LOG_FILE);
    copy_log_file_to_pmsg(TEMPORARY_INSTALL_FILE, LAST_INSTALL_FILE);

    // We can do nothing for now if there's no /cache partition.
    if (!has_cache) {
        return;
    }

    ensure_path_mounted(LAST_LOG_FILE);
    ensure_path_mounted(LAST_KMSG_FILE);
    rotate_logs(LAST_LOG_FILE, LAST_KMSG_FILE);

    // Copy logs to cache so the system can find out what happened.
    copy_log_file(TEMPORARY_LOG_FILE, LOG_FILE, true);
    copy_log_file(TEMPORARY_LOG_FILE, LAST_LOG_FILE, false);
    copy_log_file(TEMPORARY_INSTALL_FILE, LAST_INSTALL_FILE, false);
    save_kernel_log(LAST_KMSG_FILE);

However, we enabled A/B OTA updates and removed the cache partition. Google does in fact make the cache partition optional once A/B is enabled:

cache: The cache partition stores temporary data and is optional if a device uses A/B updates. The cache partition doesn't need to be writable from the bootloader, only erasable. The size depends on the device type and the availability of space on userdata. Currently 50MB-100MB should be ok.

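On Android 9.0 the recovery log is still stored on the cache partition by default, so I recommend keeping the cache partition even with A/B enabled. Alternatively, you can change the log storage path to /data/misc/recovery; see README.md: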

Running the manual tests

recovery-refresh and recovery-persist executables exist only on systems without
/cache partition. And we need to follow special steps to run tests for them.

Look at recovery-persist.cpp:

static const char *LAST_LOG_FILE = "/data/misc/recovery/last_log";
static const char *LAST_KMSG_FILE = "/data/misc/recovery/last_kmsg";
...
ssize_t logsave(
        log_id_t /* logId */,
        char /* prio */,
        const char *filename,
        const char *buf, size_t len,
        void * /* arg */) {

    std::string destination("/data/misc/");
    destination += filename;
...
        /*
         * TBD: Future location to move content from
         * /cache/recovery to /data/misc/recovery/
         */

So without a cache partition, the logs can go under /data/misc/recovery instead.
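
The choice between the two locations can be sketched as a single decision on whether a cache partition exists. This is only an illustration of the idea, not how recovery or recovery-persist is actually wired; recovery_log_path is a hypothetical helper, and the two directories are the ones named in the sources quoted above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: picks the directory for a recovery log file.
// With a cache partition the log stays under /cache/recovery; without
// one it falls back to /data/misc/recovery.
std::string recovery_log_path(bool has_cache, const std::string& file_name) {
    const std::string dir = has_cache ? "/cache/recovery" : "/data/misc/recovery";
    return dir + "/" + file_name;
}
```

For example, recovery_log_path(false, "last_log") yields /data/misc/recovery/last_log, which matches the LAST_LOG_FILE constant in recovery-persist.cpp.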

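I did not bother changing it. The problem showed up on the user build, while the matching userdebug build simply would not reproduce it, so there were no logs and no evidence. Then it unexpectedly reproduced on a newer userdebug build with adb still available. I captured the logs right away and found that the cause was, once again, a persistent process crashing too many times.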

To help track down problems like this, recovery could simply print the reason on screen, right?
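
As a rough sketch of that idea, the text for such an on-screen line could be built like this. format_reason_line is a hypothetical helper, and the wording of the message is an assumption; a real change would hand the resulting string to whatever text-output call the recovery UI provides, which is not shown here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: turns the reason recovery received into one line of
// display text. A null or empty reason is reported explicitly so that a
// missing reason is visible on screen as well.
std::string format_reason_line(const char* reason) {
    if (reason == nullptr || reason[0] == '\0') {
        return "Reason: (none recorded)";
    }
    return std::string("Reason: ") + reason;
}
```

Showing the reason directly on the prompt screen would keep the evidence available even on user builds where adb is off and no cache partition exists to hold the log.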