During a recent stress test I hit a kernel panic that reported a CFI failure.

First, some background on CFI:

Control-flow integrity (CFI) is a technique used to reduce the ability to redirect the execution of a program’s code in attacker-specified ways.

Taken literally, CFI means control-flow integrity: it reduces the risk of attacks that work by altering a program's code flow.

The goal behind CFI is to try to ensure that indirect calls go to the expected addresses and that the return addresses are not changed.

Specifically, it protects code flow that goes through indirect (function) calls.
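To make "indirect calls go to the expected addresses" concrete, here is a minimal user-space sketch of the idea. It is not how Clang or the kernel implements CFI; the allow-list and every `cfi_demo_*` name are made up for illustration. The only point is that the target is validated before the call is made.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*cfi_demo_fn)(int);

int cfi_demo_double(int x) { return 2 * x; }
int cfi_demo_negate(int x) { return -x; }
int cfi_demo_rogue(int x)  { return x + 1000; }

/* Hypothetical allow-list: the only targets this call site expects. */
static const cfi_demo_fn cfi_demo_allowed[] = {
    cfi_demo_double,
    cfi_demo_negate,
};

/* Returns 1 if target is one of the expected functions, 0 otherwise. */
int cfi_demo_target_ok(cfi_demo_fn target)
{
    size_t n = sizeof(cfi_demo_allowed) / sizeof(cfi_demo_allowed[0]);

    for (size_t i = 0; i < n; i++)
        if (cfi_demo_allowed[i] == target)
            return 1;
    return 0;
}

/*
 * Checked indirect call: returns 0 and stores the result on success,
 * -1 (without calling anything) if the target is not expected.
 */
int cfi_demo_checked_call(cfi_demo_fn target, int arg, int *result)
{
    assert(result != NULL);

    if (!cfi_demo_target_ok(target))
        return -1;
    *result = target(arg);
    return 0;
}
```

Calling `cfi_demo_checked_call()` with `cfi_demo_double` succeeds, while `cfi_demo_rogue` is rejected even though it has the same prototype, because it is not on the list for this call site.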

Function pointers are used for indirect function calls, which are different than direct function calls because the address of the call site is not stored in the (non-writable) kernel text. Instead, the address for the call site is fetched from memory, placed into a register, and the call is made via that value. If an attacker can change the memory, they can control where the call actually ends up going. That is the “forward edge” of an indirect call, while the return address on the stack is the “backward edge” of the call. Either can be used by an attacker to redirect the code flow.

An indirect call is simply a call through a function pointer. It offers two points of attack: the forward edge (the address the call goes to) and the backward edge (the return address).

The writable function pointers can only exist in the kernel’s heap and stack due to the earlier tightening of the access to the rest of memory. Function pointers can be stored in the heap or on the stack. It turns out that making the stack read-only “makes it very hard to use”, Cook said with a chuckle.

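Function pointers live in the kernel heap or on the stack, and both areas are writable, unlike direct calls, whose targets sit in read-only memory.

Problem analysis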
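The difference between the two kinds of call can be sketched in plain user-space C. The `edge_demo_*` names and the ops struct are hypothetical, and the "corruption" is just an ordinary assignment standing in for a memory-corruption bug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table, in the style of a struct of function pointers. */
struct edge_demo_ops {
    int (*handle)(int value);
};

int edge_demo_expected(int value) { return value + 1; }
int edge_demo_attacker(int value) { return -value; }

/* Direct call: the target is fixed in the program text. */
int edge_demo_call_direct(int value)
{
    return edge_demo_expected(value);
}

/* Indirect call: the target is loaded from writable memory at run time. */
int edge_demo_call_indirect(const struct edge_demo_ops *ops, int value)
{
    assert(ops != NULL && ops->handle != NULL);
    return ops->handle(value);
}

/* Stand-in for a bug that lets someone overwrite the pointer. */
void edge_demo_corrupt(struct edge_demo_ops *ops)
{
    ops->handle = edge_demo_attacker;
}
```

After `edge_demo_corrupt()` runs, `edge_demo_call_indirect()` ends up in `edge_demo_attacker()`, while `edge_demo_call_direct()` is unaffected. That redirected call is the forward edge that CFI is meant to guard.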

kernel 5.x, crash stack:

[ 7322.456761] Kernel panic - not syncing: CFI failure (target: 0x1ffffffff):
[...]
[ 7322.485335] Call trace:
[ 7322.488711] dump_backtrace.cfi_jt+0x0/0x8
[ 7322.493730] show_stack+0x28/0x38
[ 7322.497969] dump_stack_lvl+0x84/0xcc
[ 7322.502560] panic+0x180/0x444
[ 7322.506537] __cfi_slowpath_diag+0x1e4/0x230
[ 7322.511733] dma_buf_unmap_attachment+0x108/0x144
[...]

objdump places dma_buf_unmap_attachment+0x108 at line 882, shown below:

; drivers/dma-buf/dma-buf.c:882
ffffffc008b973ac: 97df944b    bl   0xffffffc00837c4d8 <__cfi_slowpath_diag>
ffffffc008b973b0: f94003e8    ldr  x8, [sp]
ffffffc008b973b4: f9400fa1    ldr  x1, [x29, #24]

875 static void __unmap_dma_buf(struct dma_buf_attachment *attach,
876                             struct sg_table *sg_table,
877                             enum dma_data_direction direction)
878 {
879         /* uses XOR, hence this unmangles */
880         mangle_sg_table(sg_table);
881
882         attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction); //tj: here
883 }

Ah, this is an indirect function call. So why did it blow up? Let's look at the kernel CFI code:

static inline void handle_cfi_failure(void *ptr)
{
	if (IS_ENABLED(CONFIG_CFI_PERMISSIVE))
		WARN_RATELIMIT(1, "CFI failure (target: %pS):\n", ptr);
	else
		panic("CFI failure (target: %pS)\n", ptr); //tj: here
}

What is this target 0x1ffffffff? A quick trace:

We have CONFIG_MODULES enabled:

void __cfi_slowpath_diag(uint64_t id, void *ptr, void *diag)
{
	___cfi_slowpath_diag(id, ptr, diag);
}
EXPORT_SYMBOL(__cfi_slowpath_diag);

#else /* !CONFIG_MODULES */
static inline void __nocfi ___cfi_slowpath_diag(uint64_t id, void *ptr, void *diag)
{
	cfi_check_fn fn = find_check_fn((unsigned long)ptr);

	if (likely(fn))
		fn(id, ptr, diag);
	else /* Don't allow unchecked modules */
		handle_cfi_failure(ptr); //tj: here
}

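It looks like this ptr (= 0x1ffffffff) is not a function at all, so we fell into the "unchecked modules" branch, right? Is memory corrupted? It shouldn't be... The stack shows the call came down through an ioctl, so user space may have triggered an invalid request. Leaving that aside for now, let's see how to work around the problem on the kernel side.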
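The branch taken above can be sketched in user space, under the assumption that finding a check function amounts to asking "does this address fall inside a region that has a checker?". The real find_check_fn() is not shown in this post, so the range table, the address bounds, and all `sketch_*` names below are hypothetical. The sketch returns a status where the kernel would call handle_cfi_failure().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*sketch_check_fn)(uint64_t id, uint64_t target);

enum sketch_result { SKETCH_CHECKED, SKETCH_CFI_FAILURE };

/* Hypothetical description of one checked region (kernel text or a module). */
struct sketch_region {
    uint64_t start;          /* inclusive */
    uint64_t end;            /* exclusive */
    sketch_check_fn check;   /* checker responsible for this region */
};

int sketch_checks_run;       /* counts how often a checker was invoked */

void sketch_check(uint64_t id, uint64_t target)
{
    (void)id;
    (void)target;
    sketch_checks_run++;
}

/* Returns the checker for the region containing target, or NULL. */
sketch_check_fn sketch_find_check_fn(const struct sketch_region *regions,
                                     size_t n, uint64_t target)
{
    assert(regions != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        if (target >= regions[i].start && target < regions[i].end)
            return regions[i].check;
    return NULL;
}

/* Mirrors the if/else above: run the checker, or report a failure. */
enum sketch_result sketch_slowpath(const struct sketch_region *regions,
                                   size_t n, uint64_t id, uint64_t target)
{
    sketch_check_fn fn = sketch_find_check_fn(regions, n, target);

    if (fn) {
        fn(id, target);
        return SKETCH_CHECKED;
    }
    return SKETCH_CFI_FAILURE;   /* the kernel panics or warns here */
}
```

With a single region covering, say, 0xffffffc008000000 to 0xffffffc00a000000 (made-up bounds), an address such as 0xffffffc008b973ac gets a checker, whereas 0x1ffffffff matches nothing and takes the failure path, which is consistent with what the panic message shows.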

To assist in debugging CFI failures, enable CONFIG_CFI_PERMISSIVE, which prints out a warning instead of causing a kernel panic. Permissive mode must not be used in production.

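One option is to enable the kernel config CONFIG_CFI_PERMISSIVE, but when I tried it the system would not even boot... go figure. I did not dig further, and turning it on relaxes enforcement everywhere, which does not seem right anyway. Is there a way to skip CFI for just one file? There is. The CFI commit log says: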

commit cf68fffb66d60d96209446bfc4a15291dc5a5d41
Author: Sami Tolvanen <samitolvanen@google.com>
Date: Thu Apr 8 11:28:26 2021 -0700

add support for Clang CFI

[...]

CFI checking can be disabled in a function with the __nocfi attribute.
Additionally, CFI can be disabled for an entire compilation unit by
filtering out CC_FLAGS_CFI.

Nice, it can even be disabled per function.

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index d217c382b02d..6de9d0c9377e 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -61,3 +61,5 @@
#if __has_feature(shadow_call_stack)
# define __noscs __attribute__((__no_sanitize__("shadow-call-stack")))
#endif
+
+#define __nocfi __attribute__((__no_sanitize__("cfi")))

Notice that ___cfi_slowpath_diag() itself carries this attribute.

So just add it:

875 static void __nocfi __unmap_dma_buf(struct dma_buf_attachment *attach,

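I checked the disassembly and the check is indeed gone. Perfect. Still, the kernel shouldn't have to keep fixing problems that come from the layers above...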
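For experimenting outside the kernel, the same attribute can be wrapped in a macro. This is a sketch only: `SKETCH_NOCFI`, the ops struct, and the functions are made-up names, and the macro expands to nothing on compilers other than Clang. The attribute removes the check from the annotated function; it does not validate the pointer being called.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's __nocfi; empty on non-Clang compilers. */
#if defined(__clang__)
# define SKETCH_NOCFI __attribute__((__no_sanitize__("cfi")))
#else
# define SKETCH_NOCFI
#endif

/* Hypothetical ops table with a single callback. */
struct sketch_unmap_ops {
    int (*unmap)(int handle);
};

int sketch_unmap_impl(int handle)
{
    return handle * 2;
}

/* The indirect call in this function is exempt from CFI instrumentation. */
SKETCH_NOCFI int sketch_unmap(const struct sketch_unmap_ops *ops, int handle)
{
    assert(ops != NULL && ops->unmap != NULL);
    return ops->unmap(handle);
}
```

In a build without CFI instrumentation the attribute has no visible effect and `sketch_unmap()` behaves like any other wrapper around a function pointer.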

References