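Rust is valued for its strong safety guarantees, especially memory safety and thread safety. Does that mean code written in Rust is automatically safe? It does not. In some situations developers have to use unsafe Rust to get the job done, and that brings potential safety risks with it. When unsafe code cannot be avoided, how can the risk be kept as low as possible while keeping the code reliable?
Tesla engineer Colin Breck wrote about this problem and summarized three practical approaches that may help other developers.
Original: https://blog.colinbreck.com/making-unsafe-rust-a-little-safer-tools-for-verifying-unsafe-code/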
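One reason Rust has become a popular systems programming language is that it offers excellent performance while eliminating memory and concurrency errors at compile time, errors that are hard to avoid in languages with similar performance characteristics, such as C and C++. Developers can, however, bypass these compile-time checks by writing unsafe Rust. The vast majority of programmers should not write unsafe Rust, but some libraries use it for performance, to manipulate memory or hardware directly, or to integrate with other libraries and system calls.
This article explores tools for verifying unsafe Rust code, including unsafe code that calls into libraries written in C or C++. My main motivation for digging into this topic is writing safe and reliable software for operational technology (OT) and critical infrastructure.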
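Sanitizers
Sanitizers are runtime tools that detect problems while a program runs, such as memory corruption, memory leaks, and data races between threads. They work by having the compiler insert instrumentation that checks the program's behavior as it executes. Because this instrumentation adds memory and performance overhead, sanitizers are normally used only in testing. Importantly, unlike the compiler, a sanitizer can only detect errors on code paths that are actually executed at runtime, whether by a test or by running the program directly.
I was surprised when I first learned that Rust supports sanitizers for finding bugs. I was already familiar with using sanitizers in C and C++ through the Clang and LLVM toolchain. Because the Rust compiler, rustc, is also built on LLVM, it can use the same sanitizers.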
Out-of-Bounds Memory Access / Buffer Overflow
Consider the following program:
fn bad_address(i: i32) -> i32 {
    let xs: [i32; 4] = [0, 1, 2, 3];
    xs[i as usize]
}

fn main() {
    let v = bad_address(4);
    println!("Value at offset: {}", v);
}
When I run the program with RUST_BACKTRACE=1 cargo run --release, Rust's bounds check catches the error and the program panics:
thread 'main' panicked at src/main.rs:3:5:
index out of bounds: the len is 4 but the index is 4
stack backtrace:
0: _rust_begin_unwind
1: core::panicking::panic_fmt
2: core::panicking::panic_bounds_check
3: sanitizers::main
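The program is terminated. That may be undesirable or even unacceptable, especially when the software is essential to running critical infrastructure, where an abrupt stop can create safety problems of its own. Even so, the runtime check guarantees that the program never performs the out-of-bounds access that would lead to undefined behavior.
Now consider what happens if the function indexes the array through a pointer inside an unsafe block: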
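Where a panic is not acceptable, safe Rust can also report the out-of-range index as a value instead of terminating. The following is a minimal sketch, not part of the original article: `checked_address` is a hypothetical variant of the function above that uses the standard `slice::get` method.

```rust
/// Hypothetical variant of `bad_address` that reports an out-of-range
/// index as `None` instead of panicking.
fn checked_address(i: usize) -> Option<i32> {
    let xs: [i32; 4] = [0, 1, 2, 3];
    // `get` performs the same bounds check as `xs[i]`, but returns an
    // `Option` that the caller must handle.
    xs.get(i).copied()
}

fn main() {
    match checked_address(4) {
        Some(v) => println!("Value at offset: {}", v),
        None => eprintln!("index 4 is out of bounds"),
    }
}
```

The caller decides how to recover, which may suit software that must keep running, at the cost of handling the `None` case everywhere the function is used.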
fn bad_address(i: i32) -> i32 {
    let xs: [i32; 4] = [0, 1, 2, 3];
    unsafe { *xs.as_ptr().offset(i as isize) }
}

fn main() {
    let v = bad_address(4);
    println!("Value at offset: {}", v);
}
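For the operations that unsafe permits, such as dereferencing a raw pointer, the Rust compiler no longer guarantees memory or thread safety. The programmer is responsible for making sure the unsafe code follows the rules and does not cause undefined behavior. When I run this code, it does not panic, even though it reads memory beyond the bounds of the array: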
Value at offset: 24576
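AddressSanitizer can help detect out-of-bounds accesses to the stack and the heap. It places "red zones" between memory allocations, regions that must never be accessed, and uses shadow memory to track which bytes may legally be read or written. If the program touches memory it should not, AddressSanitizer reports an error. Note that AddressSanitizer is only available with the Rust nightly toolchain, not with stable. The nightly and stable toolchains can be installed side by side without interfering with each other. To install the nightly toolchain: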
rustup install nightly
Then run the program with AddressSanitizer enabled:
export RUSTFLAGS=-Zsanitizer=address
cargo +nightly run
The program aborts on the out-of-bounds access and produces a detailed error report:
=================================================================
==96148==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x00016dce67b0 at pc 0x00010211bf70 bp 0x00016dce6770 sp 0x00016dce6768
READ of size 4 at 0x00016dce67b0 thread T0
#0 0x00010211bf6c in array_out_of_bounds_unsafe::bad_address::h9a9dae85f9ad5feb array_out_of_bounds_unsafe.rs:3
#1 0x00010211c170 in array_out_of_bounds_unsafe::main::hc84cbff8319e0a2b array_out_of_bounds_unsafe.rs:7
#2 0x00010211bd40 in core::ops::function::FnOnce::call_once::hc75a52fb9134d583 function.rs:250
#3 0x00010211bd8c in std::sys::backtrace::__rust_begin_short_backtrace::h9c09c1d17c8393c3 backtrace.rs:152
#4 0x00010211b888 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h3a3a442dfff79e34 rt.rs:195
#5 0x000102135230 in std::rt::lang_start_internal::hc996363c321dd410+0x440 (array_out_of_bounds_unsafe:arm64+0x10001d230)
#6 0x00010211b6c0 in std::rt::lang_start::hae3ff67dcefd99eb rt.rs:194
#7 0x00010211c2e0 in main+0x20 (array_out_of_bounds_unsafe:arm64+0x1000042e0)
#8 0x00019d87e0dc ()
#9 0xf4687ffffffffffc ()
Address 0x00016dce67b0 is located in stack of thread T0 at offset 48 in frame
#0 0x00010211bdbc in array_out_of_bounds_unsafe::bad_address::h9a9dae85f9ad5feb array_out_of_bounds_unsafe.rs:1
This frame has 1 object(s):
[32, 48) 'xs' (line 2) <== Memory access at offset 48 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow array_out_of_bounds_unsafe.rs:3 in array_out_of_bounds_unsafe::bad_address::h9a9dae85f9ad5feb
Shadow bytes around the buggy address:
0x00016dce6500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016dce6580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016dce6600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016dce6680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016dce6700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x00016dce6780: f1 f1 f1 f1 00 00[f3]f3 00 00 00 00 00 00 00 00
0x00016dce6800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016dce6880: 00 00 00 00 f1 f1 f1 f1 f8 f8 f2 f2 f8 f8 f8 f8
0x00016dce6900: f8 f8 f2 f2 f2 f2 04 f3 00 00 00 00 00 00 00 00
0x00016dce6980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016dce6a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==96148==ABORTING
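The example above ran in debug mode. In release mode, compiler optimizations may prevent the error from being detected. When using a sanitizer with a release build, be sure to disable compiler optimizations: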
export RUSTFLAGS="-C opt-level=0 -Zsanitizer=address"
cargo +nightly run --release
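It is worth noting that AddressSanitizer does not catch every out-of-bounds access. In the example above, the outcome depends on the index I use to access the array: the program may run normally, it may fail with a SEGV for accessing an unknown address, or it may abort with a stack overflow.
Data Races
To round out the discussion of sanitizers, here is another example of an error in unsafe Rust that a sanitizer can detect. The following code accesses a shared mutable variable from unsafe code on two different threads: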
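To see why detection depends on the index, here is a toy model of shadow memory for the frame in the report above. It is only a sketch: the frame layout is read off the report (a 32-byte left red zone, `xs` at offsets [32, 48), a 16-byte right red zone), while the addressable tail after the red zone, the names `frame_shadow` and `is_poisoned`, and the omission of partially addressable bytes are simplifying assumptions, not how AddressSanitizer is actually implemented.

```rust
/// One shadow byte describes 8 application bytes, as in the report legend.
const GRANULE: usize = 8;
const ADDRESSABLE: u8 = 0x00;
const STACK_LEFT_REDZONE: u8 = 0xf1;
const STACK_RIGHT_REDZONE: u8 = 0xf3;

/// Build shadow bytes for a frame laid out like the one in the report,
/// followed by `tail_bytes` of unrelated addressable memory (assumed).
fn frame_shadow(tail_bytes: usize) -> Vec<u8> {
    let mut shadow = Vec::new();
    shadow.extend(std::iter::repeat(STACK_LEFT_REDZONE).take(32 / GRANULE));
    shadow.extend(std::iter::repeat(ADDRESSABLE).take(16 / GRANULE)); // `xs`
    shadow.extend(std::iter::repeat(STACK_RIGHT_REDZONE).take(16 / GRANULE));
    shadow.extend(std::iter::repeat(ADDRESSABLE).take(tail_bytes / GRANULE));
    shadow
}

/// Returns true if an access at `offset` lands on a poisoned shadow byte.
fn is_poisoned(shadow: &[u8], offset: usize) -> bool {
    match shadow.get(offset / GRANULE) {
        Some(&byte) => byte != ADDRESSABLE,
        None => false, // outside the modeled region: the toy model knows nothing
    }
}

fn main() {
    let shadow = frame_shadow(64);
    // Element `i` of `xs` lives at frame offset 32 + 4 * i.
    for i in [0usize, 4, 8] {
        println!("index {} poisoned: {}", i, is_poisoned(&shadow, 32 + 4 * i));
    }
}
```

In this model, index 4 lands in the right red zone and is flagged, while index 8 jumps over the red zone into memory that is addressable for some other reason, so the access goes unnoticed.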
fn main() {
    static mut A: usize = 0;

    let t = std::thread::spawn(|| {
        unsafe { A += 1 };
    });
    unsafe { A += 1 };
    t.join().unwrap();
}
Running this program normally produces no runtime error. Running it with ThreadSanitizer enabled is a different story:
export RUSTFLAGS=-Zsanitizer=thread
cargo +nightly run
ThreadSanitizer detects the data race and produces a detailed report:
==================
WARNING: ThreadSanitizer: data race (pid=12331)
Read of size 8 at 0x000104f40460 by thread T1:
#0 sanitizers::main::_$u7b$$u7b$closure$u7d$$u7d$::h77c6a8d926b4ffd9 main.rs:5 (sanitizers:arm64+0x10000ae68)
#1 std::sys::backtrace::__rust_begin_short_backtrace::h3d7723e74dc43907 backtrace.rs:152 (sanitizers:arm64+0x100008f6c)
#2 std::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h972cca723fc4d46a mod.rs:561 (sanitizers:arm64+0x1000033a4)
#3 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h339263acc4287c3b unwind_safe.rs:272 (sanitizers:arm64+0x100004e64)
#4 std::panicking::try::do_call::h1720a438c6154692 panicking.rs:573 (sanitizers:arm64+0x1000090b8)
#5 __rust_try (sanitizers:arm64+0x10000351c)
#6 std::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::h4a9e1cb91e992611 mod.rs:559 (sanitizers:arm64+0x100002cf0)
#7 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h4d54278169623269 function.rs:250 (sanitizers:arm64+0x100005254)
#8 std::sys::pal::unix::thread::Thread::new::thread_start::h5efa5b2bb0838bc2 (sanitizers:arm64+0x10002acd4)
Previous write of size 8 at 0x000104f40460 by main thread:
#0 sanitizers::main::he9b6ca8696085c08 main.rs:7 (sanitizers:arm64+0x100004b9c)
#1 core::ops::function::FnOnce::call_once::hba41c0d640901898 function.rs:250 (sanitizers:arm64+0x10000540c)
#2 std::sys::backtrace::__rust_begin_short_backtrace::hc59209a0a1d24814 backtrace.rs:152 (sanitizers:arm64+0x10000901c)
#3 std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h915b5f69c928813d rt.rs:195 (sanitizers:arm64+0x100005148)
#4 std::rt::lang_start_internal::hc996363c321dd410 (sanitizers:arm64+0x10002428c)
#5 main (sanitizers:arm64+0x100004d6c)
Location is global 'sanitizers::main::A::h92ea287e34ba2e52' at 0x000104f40460 (sanitizers+0x100058460)
Thread T1 (tid=11227209, running) created by main thread at:
#0 pthread_create (librustc-nightly_rt.tsan.dylib:arm64+0xa0a8)
#1 std::sys::pal::unix::thread::Thread::new::h0b16ad3e3a52b1cf (sanitizers:arm64+0x10002ab38)
#2 std::thread::Builder::spawn_unchecked::hbd40c84e3aa877bf mod.rs:467 (sanitizers:arm64+0x100002028)
#3 std::thread::spawn::hd17317d53012bcc4 mod.rs:730 (sanitizers:arm64+0x100001fa0)
#4 sanitizers::main::he9b6ca8696085c08 main.rs:4 (sanitizers:arm64+0x100004b54)
#5 core::ops::function::FnOnce::call_once::hba41c0d640901898 function.rs:250 (sanitizers:arm64+0x10000540c)
#6 std::sys::backtrace::__rust_begin_short_backtrace::hc59209a0a1d24814 backtrace.rs:152 (sanitizers:arm64+0x10000901c)
#7 std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h915b5f69c928813d rt.rs:195 (sanitizers:arm64+0x100005148)
#8 std::rt::lang_start_internal::hc996363c321dd410 (sanitizers:arm64+0x10002428c)
#9 main (sanitizers:arm64+0x100004d6c)
SUMMARY: ThreadSanitizer: data race main.rs:5 in sanitizers::main::_$u7b$$u7b$closure$u7d$$u7d$::h77c6a8d926b4ffd9
==================
ThreadSanitizer: reported 1 warnings
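One way to remove the race reported above is to make every access atomic. The sketch below is not from the original article: it swaps the `static mut` for a local `AtomicUsize` shared through `std::thread::scope`, and the function name `increment_from_two_threads` is hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical race-free version of the example: both threads increment
/// the same counter, but every access is atomic and no `unsafe` is needed.
fn increment_from_two_threads() -> usize {
    let a = AtomicUsize::new(0);
    // Scoped threads may borrow `a` because they are joined before
    // `scope` returns.
    thread::scope(|s| {
        s.spawn(|| {
            a.fetch_add(1, Ordering::Relaxed);
        });
        a.fetch_add(1, Ordering::Relaxed);
    });
    a.into_inner()
}

fn main() {
    println!("A = {}", increment_from_two_threads());
}
```

Because the increments are atomic read-modify-write operations, both updates are always counted, and a race detector should have no unsynchronized access to report.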
Miri
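The sanitizers that Rust supports are very helpful for finding errors in unsafe code, but they are not a cure-all and cannot find every error. They are also not fully compatible with one another and have to be run one at a time, which multiplies the number of test runs and the time they take.
Miri is another tool. It is an interpreter that can find problems in unsafe code more precisely, including out-of-bounds accesses, memory leaks, use of uninitialized data, use-after-free, and data races. It works by interpreting Rust's mid-level intermediate representation (MIR), an approach that sits between the compiler's static analysis and the sanitizers' dynamic analysis.
Like the sanitizers, Miri depends on the Rust nightly toolchain. Installing it is simple: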
rustup +nightly component add miri
Out-of-Bounds Memory Access
Let's revisit the earlier out-of-bounds memory access example:
fn bad_address(i: i32) -> i32 {
    let xs: [i32; 4] = [0, 1, 2, 3];
    unsafe { *xs.as_ptr().offset(i as isize) }
}

fn main() {
    let v = bad_address(4000);
    println!("Value at offset: {}", v);
}
Running it under Miri is straightforward:
cargo +nightly miri run
Miri reports the out-of-bounds access and provides a backtrace:
error: Undefined Behavior: out-of-bounds pointer arithmetic: expected a pointer to 16000 bytes of memory, but got alloc870 which is only 16 bytes from the end of the allocation
--> src/main.rs:3:15
|
3 | unsafe { *xs.as_ptr().offset(i as isize) }
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ out-of-bounds pointer arithmetic: expected a pointer to 16000 bytes of memory, but got alloc870 which is only 16 bytes from the end of the allocation
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
help: alloc870 was allocated here:
--> src/main.rs:2:9
|
2 | let xs: [i32; 4] = [0, 1, 2, 3];
| ^^
= note: BACKTRACE (of the first span):
= note: inside `bad_address` at src/main.rs:3:15: 3:45
note: inside `main`
--> src/main.rs:7:13
|
7 | let v = bad_address(4000);
| ^^^^^^^^^^^^^^^^^
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace
error: aborting due to 1 previous error
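The sanitizer detected the same kind of error, but Miri's output is more specific and easier to understand: it shows code snippets rather than memory addresses and stack frames. Miri also checks for many kinds of undefined behavior in a single run. Like a sanitizer, Miri only interprets the code paths that are actually executed, whether in a test or when running the binary, and it will not find errors on paths that are never interpreted.
Data Races
For completeness, I reuse the code with the data race, this time running it under Miri from a test: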
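Since Miri only sees executed paths, tests that drive the unsafe code across its boundary cases matter. Here is a minimal sketch under that assumption: `address` is a hypothetical safe wrapper that validates the index before the unsafe read, and the test touches both sides of the bound so that `cargo +nightly miri test` would interpret the unsafe path.

```rust
/// Hypothetical safe wrapper: validate the index before the unsafe read.
fn address(i: usize) -> Option<i32> {
    let xs: [i32; 4] = [0, 1, 2, 3];
    if i < xs.len() {
        // SAFETY: `i` was checked against `xs.len()` above, so the
        // pointer stays inside the array.
        Some(unsafe { *xs.as_ptr().add(i) })
    } else {
        None
    }
}

#[cfg(test)]
mod boundary_tests {
    use super::address;

    // Exercise the last valid index and the first invalid one.
    #[test]
    fn in_bounds_and_out_of_bounds() {
        assert_eq!(address(3), Some(3));
        assert_eq!(address(4), None);
    }
}

fn main() {
    println!("{:?}", address(3));
    println!("{:?}", address(4));
}
```

If the bounds check were wrong, for example `i <= xs.len()`, the same test would lead Miri to the out-of-bounds read, whereas a test that only used small indices would not.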
fn data_race() {
    static mut A: usize = 0;

    let t = std::thread::spawn(|| {
        unsafe { A += 1 };
    });
    unsafe { A += 1 };
    t.join().unwrap();
}

#[cfg(test)]
mod tests {
    use crate::data_race;

    #[test]
    fn data_race_test() {
        data_race();
    }
}
The command to run the tests under Miri is:
cargo +nightly miri test
Miri identifies the data race and includes the relevant code snippets and error messages, which are easier to understand than the ThreadSanitizer output above:
running 1 test
test tests::data_race_test ... error: Undefined Behavior: Data race detected between (1) non-atomic write on thread `tests::data_race_test` and (2) non-atomic read on thread `unnamed-2` at alloc1. (2) just happened here
--> src/main.rs:5:18
|
5 | unsafe { A += 1 };
| ^^^^^^ Data race detected between (1) non-atomic write on thread `tests::data_race_test` and (2) non-atomic read on thread `unnamed-2` at alloc1. (2) just happened here
|
help: and (1) occurred earlier here
--> src/main.rs:7:14
|
7 | unsafe { A += 1 };
| ^^^^^^
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: BACKTRACE (of the first span) on thread `unnamed-2`:
= note: inside closure at src/main.rs:5:18: 5:24
note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace
error: aborting due to 1 previous error
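C and C++ Libraries
Miri is a very useful tool, but it has a limitation: it currently cannot interpret code called through Rust's foreign function interface (FFI), which is how Rust calls C and C++ libraries. For example, rusqlite uses FFI to call SQLite, which is written in C; duckdb-rs uses FFI to call DuckDB, which is written in C++; and open62541 calls an OPC UA library written in C. Because Miri runs the program in a platform-independent interpreter, the program has no access to FFI or to most platform-specific APIs. Miri supports only some common functionality, such as file system access and printing to standard output.
The good news is that we can fall back on the sanitizers provided by C and C++ compilers such as GCC and Clang. The key is that the C or C++ code must be compiled with the corresponding sanitizer enabled before it is called from Rust.
Consider the following unsafe example in C: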
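In a crate that mixes pure Rust with FFI calls, one common arrangement is to keep running Miri on the pure-Rust tests and skip the tests that cross into foreign code. The sketch below relies on the `miri` configuration flag that Miri sets while interpreting; the function names are hypothetical, and `calls_into_c` is only a stand-in for a real FFI call.

```rust
/// Stand-in for a function that would call into C through FFI (hypothetical).
fn calls_into_c() -> bool {
    true
}

/// Pure-Rust logic that Miri can interpret.
fn pure_rust_len(s: &str) -> usize {
    s.len()
}

/// True only when the program is being interpreted by Miri.
fn running_under_miri() -> bool {
    cfg!(miri)
}

#[cfg(test)]
mod ffi_tests {
    #[test]
    #[cfg_attr(miri, ignore)] // skipped under Miri, which cannot execute foreign code
    fn ffi_path() {
        assert!(super::calls_into_c());
    }

    #[test]
    fn pure_rust_path() {
        assert_eq!(super::pure_rust_len("abc"), 3);
    }
}

fn main() {
    println!("under Miri: {}", running_under_miri());
    println!("len: {}, ffi ok: {}", pure_rust_len("abc"), calls_into_c());
}
```

The skipped tests still need coverage from somewhere else, which is where the sanitizers come back in.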
#include <stdio.h>
#include <string.h>

void c_say_hello(const char *message) {
    char buffer[10];
    strcpy(buffer, message); // Unsafe: no bounds checking!
    printf("Hello from C! %s\n", buffer);
}
A build.rs file can compile the C code with Clang and enable AddressSanitizer:
fn main() {
    let mut build = cc::Build::new();
    build
        .compiler("clang")
        .file("c_src/c_code.c")
        .flag("-Wall") // Enable warnings
        .flag("-fsanitize=address") // Enable AddressSanitizer
        .flag("-fno-omit-frame-pointer"); // Simplify stack tracing
    build.compile("c_code");

    // Ensure the build script reruns if the C file changes
    println!("cargo:rerun-if-changed=c_src/c_code.c");
}
Then a Rust library can call the C code through FFI from inside an unsafe block:
use std::ffi::{c_char, CString};

#[link(name = "c_code")] // Link to the compiled library
extern "C" {
    fn c_say_hello(name: *const c_char);
}

pub fn say_hello(message: &str) {
    let name = CString::new(message).expect("CString::new failed");
    unsafe {
        c_say_hello(name.as_ptr()); // Call the C function
    }
}
Finally, a program can call the "safe" Rust function that wraps the C code:
use sanitizers::say_hello;

fn main() {
    say_hello("This is far too long and will do bad things!");
}
Run the program with AddressSanitizer enabled:
export RUSTFLAGS=-Zsanitizer=address
cargo +nightly run
It reports a stack buffer overflow:
=================================================================
==51935==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x00016fa7e76a at pc 0x000100c32ad0 bp 0x00016fa7e730 sp 0x00016fa7dee0
WRITE of size 45 at 0x00016fa7e76a thread T0
#0 0x000100c32acc in strcpy+0x4ec (librustc-nightly_rt.asan.dylib:arm64+0x4aacc)
#1 0x000100383ee8 in c_say_hello+0x11c (sanitizers:arm64+0x100003ee8)
#2 0x000100383328 in sanitizers::say_hello::h7a86e3249bf087ea+0x1d8 (sanitizers:arm64+0x100003328)
#3 0x0001003815c0 in sanitizers::main::h11beac415f2c6ee0 main.rs:4
#4 0x0001003818d8 in core::ops::function::FnOnce::call_once::h6f36f80e70ecb8a5 function.rs:250
#5 0x000100381910 in std::sys::backtrace::__rust_begin_short_backtrace::h5a6edce2cadaf2f4 backtrace.rs:152
#6 0x000100381488 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h6d09ba155578db90 rt.rs:195
#7 0x00010039ccfc in std::rt::lang_start_internal::hc996363c321dd410+0x440 (sanitizers:arm64+0x10001ccfc)
#8 0x0001003812c0 in std::rt::lang_start::hf8df676e77f16e31 rt.rs:194
#9 0x0001003815ec in main+0x20 (sanitizers:arm64+0x1000015ec)
#10 0x00019d87e0dc ()
#11 0xb922fffffffffffc ()
Address 0x00016fa7e76a is located in stack of thread T0 at offset 42 in frame
#0 0x000100383dd8 in c_say_hello+0xc (sanitizers:arm64+0x100003dd8)
This frame has 1 object(s):
[32, 42) 'buffer' (line 5) <== Memory access at offset 42 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (librustc-nightly_rt.asan.dylib:arm64+0x4aacc) in strcpy+0x4ec
Shadow bytes around the buggy address:
0x00016fa7e480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016fa7e500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016fa7e580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016fa7e600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016fa7e680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x00016fa7e700: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00[02]f3 f3
0x00016fa7e780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016fa7e800: 00 00 00 00 f1 f1 f1 f1 f8 f8 f8 f8 f2 f2 f2 f2
0x00016fa7e880: 00 00 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
0x00016fa7e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x00016fa7e980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==51935==ABORTING
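Note that the error occurs in the call to strcpy inside c_say_hello, that is, in a C stack frame. This shows that C code called through FFI is no longer an opaque black box: problems inside it can be detected. If you compile the C code in this example normally, without AddressSanitizer (try removing the relevant line from build.rs), while the Rust code still runs with AddressSanitizer, the program still reports a stack-buffer-overflow. The error message is less specific, though, and it attributes the problem to the Rust code rather than the C code.
Conclusion
This article explored three techniques for verifying unsafe Rust code in order to improve safety and avoid undefined behavior with potentially serious consequences. Those consequences include malfunctions, security vulnerabilities, regulatory violations, financial loss, injury, and even death. It is not meant to be an exhaustive survey. It is the story of what I explored while solving practical problems, along with the tools I recommend. The three techniques are:
1. Sanitizers, for runtime detection of errors in unsafe Rust code;
2. The Miri interpreter, for checking unsafe Rust code;
3. C and C++ sanitizers, for runtime detection when calling C and C++ libraries from Rust.
Most systems programmers and application developers should not write unsafe Rust. Unsafe Rust should mainly be the domain of library developers. If you must write unsafe code, or you call unsafe code through the libraries you depend on, test it with the sanitizers or Miri to catch these kinds of errors.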
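A sanitizer finds the overflow after the fact. The Rust wrapper can also refuse input that the C side cannot hold. The sketch below is an assumption layered on the example, not part of the original article: `prepare_message`, `MessageError`, and `C_BUFFER_LEN` are hypothetical names, and the limit of 10 bytes is taken from `char buffer[10]` in the C code above.

```rust
use std::ffi::CString;

/// Capacity of `buffer` in the C example, including the terminating NUL.
const C_BUFFER_LEN: usize = 10;

#[derive(Debug, PartialEq)]
enum MessageError {
    TooLong { len: usize, max: usize },
    InteriorNul,
}

/// Validate a message before it is handed to the C function.
fn prepare_message(message: &str) -> Result<CString, MessageError> {
    // Leave room for the NUL terminator that `strcpy` also copies.
    let max = C_BUFFER_LEN - 1;
    if message.len() > max {
        return Err(MessageError::TooLong { len: message.len(), max });
    }
    // `CString::new` fails if the input contains an interior NUL byte.
    CString::new(message).map_err(|_| MessageError::InteriorNul)
}

fn main() {
    println!("{:?}", prepare_message("Rust"));
    println!(
        "{:?}",
        prepare_message("This is far too long and will do bad things!")
    );
}
```

A wrapper like `say_hello` could call such a check first and pass only the validated `CString` to `c_say_hello`. The check encodes knowledge about the C implementation, so it has to be kept in sync with the C source.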