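What to Do When You Have to Write Unsafe Rust? A Tesla Engineer Offers Three Techniques

Rust is prized for its strong safety guarantees, especially memory safety and thread safety. But does using Rust automatically rule out unsafe code? It does not. In some scenarios developers have no choice but to use unsafe Rust to get the job done, which brings potential safety risks. So when unsafe code is unavoidable, how can you minimize the risk and keep the code reliable?

Tesla engineer Colin Breck wrote an article on this question, summarizing three effective practices that developers may find helpful.

Original: https://blog.colinbreck.com/making-unsafe-rust-a-little-safer-tools-for-verifying-unsafe-code/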

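Author | Colin Breck       Editor | 苏宓
Produced by | CSDN (ID: CSDNnews)

One reason Rust has become a popular systems programming language is that it delivers excellent performance while eliminating memory and concurrency errors at compile time, errors that are hard to avoid in other languages with similar performance characteristics, such as C and C++. However, developers can bypass these compile-time checks by writing unsafe Rust. Although the vast majority of programmers should not write unsafe Rust, some libraries use it for performance, to manipulate memory or hardware directly, or to integrate with other libraries and system calls.

This article explores tools for verifying unsafe Rust code, including unsafe code called from libraries written in C or C++. My main motivation for digging into this topic is to write safe and reliable software for operational technology (OT) and critical infrastructure.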


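Sanitizers: Memory-Error Detection Tools

Sanitizers are runtime tools that detect problems while a program runs, such as memory corruption, memory leaks, or data races between threads. They work by having the compiler insert instrumentation into the code to check that the program behaves correctly. Sanitizers add memory and performance overhead, so they are normally used only in test environments. Importantly, unlike the compiler, a sanitizer can only detect errors on code paths that are actually executed at runtime, whether by tests or by running the program directly.

I was surprised when I first learned that Rust supports sanitizers for finding bugs, because I had only known them from C and C++ with the Clang and LLVM toolchain. Since Rust's compiler, rustc, is also built on LLVM, it can use the same sanitizers.

Out-of-Bounds Memory Access / Buffer Overflow

Consider the following program: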

fn bad_address(i: i32) -> i32 {
    let xs: [i32; 4] = [0, 1, 2, 3];
    xs[i as usize]
}

fn main() {
    let v = bad_address(4);
    println!("Value at offset: {}", v);
}

When I run the program with RUST_BACKTRACE=1 cargo run --release, Rust's bounds check catches the error and the program panics:

thread 'main' panicked at src/main.rs:3:5:
index out of bounds: the len is 4 but the index is 4
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::panicking::panic_bounds_check
   3: sanitizers::main

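The program is terminated. That may be highly undesirable or even unacceptable, especially if the software is critical to running infrastructure, where termination can cause safety problems of its own. However, the runtime check guarantees that the program never performs the out-of-bounds access that would lead to undefined behavior.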
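Where a panic is not acceptable, safe Rust also offers a non-panicking lookup. The following is only a minimal sketch: `try_address` is a hypothetical name that does not appear in the original article, and it uses the standard slice method `get`, which returns `None` instead of panicking when the index is out of range.

```rust
/// Hypothetical non-panicking variant of `bad_address` (not from the article).
fn try_address(i: usize) -> Option<i32> {
    let xs: [i32; 4] = [0, 1, 2, 3];
    // `get` performs the same bounds check as `xs[i]`,
    // but reports failure as `None` rather than panicking.
    xs.get(i).copied()
}

fn main() {
    match try_address(4) {
        Some(v) => println!("Value at offset: {}", v),
        None => println!("Index 4 is out of bounds; handling it without a panic"),
    }
}
```

The caller then decides how to handle the out-of-range case, for example by logging it and continuing, instead of having the process terminate.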

Now consider what happens if the function indexes the array through a pointer inside an unsafe block:

fn bad_address(i: i32) -> i32 {
    let xs: [i32; 4] = [0, 1, 2, 3];
    unsafe { *xs.as_ptr().offset(i as isize) }
}

fn main() {
    let v = bad_address(4);
    println!("Value at offset: {}", v);
}

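In unsafe code, the Rust compiler no longer guarantees memory and thread safety. The programmer is responsible for ensuring that the unsafe code is sound and does not cause undefined behavior. When I run this code, it does not panic, even though the program reads memory beyond the end of the array: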

Value at offset: 24576
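One way a programmer might take on that responsibility is to validate the index in safe code before entering the unsafe block. The sketch below is an illustration only: `checked_address` is a hypothetical name, and the explicit range check is an assumption about how one could restore the guarantee, not something the original article prescribes.

```rust
/// Hypothetical checked variant of the unsafe `bad_address` (not from the article).
fn checked_address(i: i32) -> Option<i32> {
    let xs: [i32; 4] = [0, 1, 2, 3];
    // Reject negative indices and indices past the end before touching the pointer.
    if i < 0 || i as usize >= xs.len() {
        return None;
    }
    // SAFETY: 0 <= i < xs.len(), so the resulting pointer stays inside `xs`.
    Some(unsafe { *xs.as_ptr().add(i as usize) })
}

fn main() {
    println!("in bounds: {:?}", checked_address(2));
    println!("out of bounds: {:?}", checked_address(4));
}
```

The `SAFETY` comment records the invariant that the unsafe block relies on, which makes it easier to review and to test under the tools discussed in this article.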

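Rust's AddressSanitizer can check for out-of-bounds accesses to the stack and the heap. It works by placing inaccessible "red zones" between memory allocations and using shadow memory to track which memory may legally be read or written. If the program touches memory it should not, AddressSanitizer reports an error. Note that this tool is only available on Rust's nightly toolchain, not on stable. The nightly and stable toolchains can be installed side by side without interfering with each other. To install the nightly toolchain: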

rustup install nightly

Then run the program with AddressSanitizer enabled:

export RUSTFLAGS=-Zsanitizer=address
cargo +nightly run

The program aborts on the out-of-bounds access and produces a detailed error report:

=================================================================
==96148==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x00016dce67b0 at pc 0x00010211bf70 bp 0x00016dce6770 sp 0x00016dce6768
READ of size 4 at 0x00016dce67b0 thread T0
    #0 0x00010211bf6c in array_out_of_bounds_unsafe::bad_address::h9a9dae85f9ad5feb array_out_of_bounds_unsafe.rs:3
    #1 0x00010211c170 in array_out_of_bounds_unsafe::main::hc84cbff8319e0a2b array_out_of_bounds_unsafe.rs:7
    #2 0x00010211bd40 in core::ops::function::FnOnce::call_once::hc75a52fb9134d583 function.rs:250
    #3 0x00010211bd8c in std::sys::backtrace::__rust_begin_short_backtrace::h9c09c1d17c8393c3 backtrace.rs:152
    #4 0x00010211b888 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h3a3a442dfff79e34 rt.rs:195
    #5 0x000102135230 in std::rt::lang_start_internal::hc996363c321dd410+0x440 (array_out_of_bounds_unsafe:arm64+0x10001d230)
    #6 0x00010211b6c0 in std::rt::lang_start::hae3ff67dcefd99eb rt.rs:194
    #7 0x00010211c2e0 in main+0x20 (array_out_of_bounds_unsafe:arm64+0x1000042e0)
    #8 0x00019d87e0dc  ()
    #9 0xf4687ffffffffffc  ()

Address 0x00016dce67b0 is located in stack of thread T0 at offset 48 in frame
    #0 0x00010211bdbc in array_out_of_bounds_unsafe::bad_address::h9a9dae85f9ad5feb array_out_of_bounds_unsafe.rs:1

  This frame has 1 object(s):
    [32, 48) 'xs' (line 2) <== Memory access at offset 48 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow array_out_of_bounds_unsafe.rs:3 in array_out_of_bounds_unsafe::bad_address::h9a9dae85f9ad5feb
Shadow bytes around the buggy address:
  0x00016dce6500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016dce6580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016dce6600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016dce6680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016dce6700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x00016dce6780: f1 f1 f1 f1 00 00[f3]f3 00 00 00 00 00 00 00 00
  0x00016dce6800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016dce6880: 00 00 00 00 f1 f1 f1 f1 f8 f8 f2 f2 f8 f8 f8 f8
  0x00016dce6900: f8 f8 f2 f2 f2 f2 04 f3 00 00 00 00 00 00 00 00
  0x00016dce6980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016dce6a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==96148==ABORTING
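To make the red-zone and shadow-memory idea more concrete, here is a toy model of the stack frame described in the report above. It is a simplified sketch, not how AddressSanitizer is implemented: the names `frame_shadow` and `check_read` are hypothetical, the frame layout (32 bytes of left red zone, the 16-byte array `xs` at offsets [32, 48), then 16 bytes of right red zone) is read off the report's frame description and the `f1 f1 f1 f1 00 00 f3 f3` shadow bytes, and the marker `0xff` for "outside the toy frame" is made up for this example.

```rust
/// One shadow byte describes 8 application bytes, as the report's legend states.
const GRANULE: usize = 8;
const ADDRESSABLE: u8 = 0x00;
const STACK_LEFT_REDZONE: u8 = 0xf1;
const STACK_RIGHT_REDZONE: u8 = 0xf3;

/// Build the shadow bytes for the toy frame:
/// [0, 32) left red zone, [32, 48) the array `xs`, [48, 64) right red zone.
fn frame_shadow() -> Vec<u8> {
    let mut shadow = Vec::new();
    shadow.extend(std::iter::repeat(STACK_LEFT_REDZONE).take(32 / GRANULE));
    shadow.extend(std::iter::repeat(ADDRESSABLE).take(16 / GRANULE));
    shadow.extend(std::iter::repeat(STACK_RIGHT_REDZONE).take(16 / GRANULE));
    shadow
}

/// Check a read that starts at frame offset `offset`.
/// Returns Ok(()) if the shadow byte says "addressable", otherwise the poison value hit.
fn check_read(shadow: &[u8], offset: usize) -> Result<(), u8> {
    match shadow.get(offset / GRANULE) {
        Some(&ADDRESSABLE) => Ok(()),
        Some(&poison) => Err(poison),
        None => Err(0xff), // made-up marker: offset is outside the toy frame
    }
}

fn main() {
    let shadow = frame_shadow();
    // xs[3] lives at frame offset 32 + 3 * 4 = 44, inside the array.
    println!("read at offset 44: {:?}", check_read(&shadow, 44));
    // xs[4] would live at offset 48, the first byte of the right red zone.
    println!("read at offset 48: {:?}", check_read(&shadow, 48));
}
```

In this model the read at offset 48 lands on a shadow byte marked as stack right red zone, which corresponds to the bracketed `[f3]` entry in the report.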

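The example above was run in debug mode. In release mode, compiler optimizations may prevent the error from being detected. So when using a sanitizer with a release build, be sure to disable compiler optimizations: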

export RUSTFLAGS="-C opt-level=0 -Zsanitizer=address"
cargo +nightly run --release

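It is worth noting that AddressSanitizer does not catch every out-of-bounds access. In the example above, the behavior depends on the index I use to access the array: the program may run normally, it may fail with a SEGV for accessing an unknown address, or it may crash with a stack overflow.

Data Races

To round out the discussion of sanitizers, here is another example of a bug in unsafe Rust that a sanitizer can detect. The following code accesses a shared mutable variable from unsafe code on different threads: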

fn main() {
    static mut A: usize = 0;

    let t = std::thread::spawn(|| {
        unsafe { A += 1 };
    });
    unsafe { A += 1 };

    t.join().unwrap();
}

Running this program normally produces no runtime error. Running it with ThreadSanitizer enabled is a different story:

export RUSTFLAGS=-Zsanitizer=thread
cargo +nightly run

ThreadSanitizer detects the data race and produces a detailed report:

==================
WARNING: ThreadSanitizer: data race (pid=12331)
  Read of size 8 at 0x000104f40460 by thread T1:
    #0 sanitizers::main::_$u7b$$u7b$closure$u7d$$u7d$::h77c6a8d926b4ffd9 main.rs:5 (sanitizers:arm64+0x10000ae68)
    #1 std::sys::backtrace::__rust_begin_short_backtrace::h3d7723e74dc43907 backtrace.rs:152 (sanitizers:arm64+0x100008f6c)
    #2 std::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h972cca723fc4d46a mod.rs:561 (sanitizers:arm64+0x1000033a4)
    #3 _$LT$core..panic..unwind_safe..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h339263acc4287c3b unwind_safe.rs:272 (sanitizers:arm64+0x100004e64)
    #4 std::panicking::try::do_call::h1720a438c6154692 panicking.rs:573 (sanitizers:arm64+0x1000090b8)
    #5 __rust_try  (sanitizers:arm64+0x10000351c)
    #6 std::thread::Builder::spawn_unchecked_::_$u7b$$u7b$closure$u7d$$u7d$::h4a9e1cb91e992611 mod.rs:559 (sanitizers:arm64+0x100002cf0)
    #7 core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h4d54278169623269 function.rs:250 (sanitizers:arm64+0x100005254)
    #8 std::sys::pal::unix::thread::Thread::new::thread_start::h5efa5b2bb0838bc2  (sanitizers:arm64+0x10002acd4)

  Previous write of size 8 at 0x000104f40460 by main thread:
    #0 sanitizers::main::he9b6ca8696085c08 main.rs:7 (sanitizers:arm64+0x100004b9c)
    #1 core::ops::function::FnOnce::call_once::hba41c0d640901898 function.rs:250 (sanitizers:arm64+0x10000540c)
    #2 std::sys::backtrace::__rust_begin_short_backtrace::hc59209a0a1d24814 backtrace.rs:152 (sanitizers:arm64+0x10000901c)
    #3 std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h915b5f69c928813d rt.rs:195 (sanitizers:arm64+0x100005148)
    #4 std::rt::lang_start_internal::hc996363c321dd410 (sanitizers:arm64+0x10002428c)
    #5 main (sanitizers:arm64+0x100004d6c)

  Location is global 'sanitizers::main::A::h92ea287e34ba2e52' at 0x000104f40460 (sanitizers+0x100058460)

  Thread T1 (tid=11227209, running) created by main thread at:
    #0 pthread_create (librustc-nightly_rt.tsan.dylib:arm64+0xa0a8)
    #1 std::sys::pal::unix::thread::Thread::new::h0b16ad3e3a52b1cf (sanitizers:arm64+0x10002ab38)
    #2 std::thread::Builder::spawn_unchecked::hbd40c84e3aa877bf mod.rs:467 (sanitizers:arm64+0x100002028)
    #3 std::thread::spawn::hd17317d53012bcc4 mod.rs:730 (sanitizers:arm64+0x100001fa0)
    #4 sanitizers::main::he9b6ca8696085c08 main.rs:4 (sanitizers:arm64+0x100004b54)
    #5 core::ops::function::FnOnce::call_once::hba41c0d640901898 function.rs:250 (sanitizers:arm64+0x10000540c)
    #6 std::sys::backtrace::__rust_begin_short_backtrace::hc59209a0a1d24814 backtrace.rs:152 (sanitizers:arm64+0x10000901c)
    #7 std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h915b5f69c928813d rt.rs:195 (sanitizers:arm64+0x100005148)
    #8 std::rt::lang_start_internal::hc996363c321dd410 (sanitizers:arm64+0x10002428c)
    #9 main (sanitizers:arm64+0x100004d6c)

SUMMARY: ThreadSanitizer: data race main.rs:5 in sanitizers::main::_$u7b$$u7b$closure$u7d$$u7d$::h77c6a8d926b4ffd9
==================
ThreadSanitizer: reported 1 warnings
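For comparison, one common way to remove a race like the one reported above is to replace the unsynchronized `static mut` with an atomic counter. The following is a minimal sketch under that assumption; the function name `increment_from_two_threads`, the use of a local `AtomicUsize` with scoped threads instead of a static, and the choice of `Ordering::SeqCst` are all illustrative choices, not taken from the original article.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical race-free version of the counter example (not from the article).
fn increment_from_two_threads() -> usize {
    let a = AtomicUsize::new(0);
    // Scoped threads may borrow `a`; the scope joins them before returning.
    thread::scope(|s| {
        s.spawn(|| {
            // Atomic read-modify-write: no `unsafe` and no data race.
            a.fetch_add(1, Ordering::SeqCst);
        });
        a.fetch_add(1, Ordering::SeqCst);
    });
    a.load(Ordering::SeqCst)
}

fn main() {
    println!("A = {}", increment_from_two_threads());
}
```

Because both increments are atomic, the two threads no longer perform conflicting non-atomic accesses to the same location, which is the condition the sanitizer flagged.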


Miri

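The sanitizers that Rust supports are very helpful for finding bugs in unsafe code, but they are not a cure-all and cannot find every bug. They are also not fully compatible with one another, so each must be run separately, which increases the number of test runs and the time they take.

Miri is another tool: an interpreter that can find problems in unsafe code more precisely, such as out-of-bounds accesses, memory leaks, use of uninitialized data, use-after-free, and data races. It works by interpreting Rust's mid-level intermediate representation (MIR), an approach that sits between the compiler's static analysis and the sanitizers' dynamic analysis and targets potential problems more directly.

Like the sanitizers, Miri depends on the nightly Rust toolchain, and it is easy to install: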

rustup +nightly component add miri

Out-of-Bounds Memory Access

Let's revisit the earlier out-of-bounds memory access example:

fn bad_address(i: i32) -> i32 {
    let xs: [i32; 4] = [0, 1, 2, 3];
    unsafe { *xs.as_ptr().offset(i as isize) }
}

fn main() {
    let v = bad_address(4000);
    println!("Value at offset: {}", v);
}

Running Miri is straightforward:

cargo +nightly miri run

It reports the out-of-bounds access and provides a backtrace:

error: Undefined Behavior: out-of-bounds pointer arithmetic: expected a pointer to 16000 bytes of memory, but got alloc870 which is only 16 bytes from the end of the allocation
 --> src/main.rs:3:15
  |
3 |     unsafe { *xs.as_ptr().offset(i as isize) }
  |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ out-of-bounds pointer arithmetic: expected a pointer to 16000 bytes of memory, but got alloc870 which is only 16 bytes from the end of the allocation
  |
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
help: alloc870 was allocated here:
 --> src/main.rs:2:9
  |
2 |     let xs: [i32; 4] = [0, 1, 2, 3];
  |         ^^
  = note: BACKTRACE (of the first span):
  = note: inside `bad_address` at src/main.rs:3:15: 3:45
note: inside `main`
 --> src/main.rs:7:13
  |
7 |     let v = bad_address(4000);
  |             ^^^^^^^^^^^^^^^^^

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to 1 previous error

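Although the sanitizer detected the same error, Miri's output is more specific and easier to understand, showing code snippets rather than memory addresses and stack frames. Miri also checks for many kinds of undefined behavior in a single run. Like the sanitizers, Miri only interprets code paths that are actually executed (whether in tests or when running a binary) and will not find errors on paths it never interprets.

Data Races

For completeness, I reuse the code with the data race, this time running it under Miri from a test: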

fn data_race() {
    static mut A: usize = 0;

    let t = std::thread::spawn(|| {
        unsafe { A += 1 };
    });
    unsafe { A += 1 };

    t.join().unwrap();
}

#[cfg(test)]
mod tests {
    use crate::data_race;

    #[test]
    fn data_race_test() {
        data_race();
    }
}

The command to run the tests under Miri is:

cargo +nightly miri test

Miri identifies the data race and includes the relevant code snippets and error messages, which are easier to understand than the ThreadSanitizer output above:

running 1 test
test tests::data_race_test ... error: Undefined Behavior: Data race detected between (1) non-atomic write on thread `tests::data_race_test` and (2) non-atomic read on thread `unnamed-2` at alloc1. (2) just happened here
 --> src/main.rs:5:18
  |
5 |         unsafe { A += 1 };
  |                  ^^^^^^ Data race detected between (1) non-atomic write on thread `tests::data_race_test` and (2) non-atomic read on thread `unnamed-2` at alloc1. (2) just happened here
  |
help: and (1) occurred earlier here
 --> src/main.rs:7:14
  |
7 |     unsafe { A += 1 };
  |              ^^^^^^
  = help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
  = help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
  = note: BACKTRACE (of the first span) on thread `unnamed-2`:
  = note: inside closure at src/main.rs:5:18: 5:24

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to 1 previous error


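C and C++ Libraries

Miri is a very useful tool, but it has a limitation: it currently cannot interpret code called through Rust's foreign function interface (FFI), which is how Rust calls C and C++ libraries. For example, rusqlite uses FFI to call SQLite, a database written in C; duckdb-rs uses FFI to call DuckDB, written in C++; and open62541 calls the OPC UA library written in C. Because Miri runs programs in a platform-independent interpreter, the program has no access to FFI or to most platform-specific APIs. Only some common functionality, such as file system access and printing to standard output, is supported by Miri.

The good news is that we can fall back on the sanitizers provided by C and C++ compilers such as GCC or Clang. The key is that the C or C++ code must be compiled with the corresponding sanitizer enabled before it is called from Rust.

Consider the following unsafe C code: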

#include <stdio.h>
#include <string.h>

void c_say_hello(const char *message) {
    char buffer[10];
    strcpy(buffer, message); // Unsafe: no bounds checking!
    printf("Hello from C! %s\n", buffer);
}

A build.rs file can compile the C code with Clang and enable AddressSanitizer:

fn main() {
    let mut build = cc::Build::new();
    build
        .compiler("clang")
        .file("c_src/c_code.c")
        .flag("-Wall") // Enable warnings
        .flag("-fsanitize=address") // Enable AddressSanitizer
        .flag("-fno-omit-frame-pointer"); // Simplify stack tracing

    build.compile("c_code");

    // Ensure the build script reruns if the C file changes
    println!("cargo:rerun-if-changed=c_src/c_code.c");
}

The C code can then be called through FFI from an unsafe block in a Rust library:

use std::ffi::{c_char, CString};

#[link(name = "c_code")] // Link to the compiled library
extern "C" {
    fn c_say_hello(name: *const c_char);
}

pub fn say_hello(message: &str) {
    let name = CString::new(message).expect("CString::new failed");
    unsafe {
        c_say_hello(name.as_ptr()); // Call the C function
    }
}

Finally, a program can call the "safe" Rust function that wraps the C code:

use sanitizers::say_hello;

fn main() {
    say_hello("This is far too long and will do bad things!");
}

Run the program with AddressSanitizer enabled:

export RUSTFLAGS=-Zsanitizer=address
cargo +nightly run

It reports a stack buffer overflow:

=================================================================
==51935==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x00016fa7e76a at pc 0x000100c32ad0 bp 0x00016fa7e730 sp 0x00016fa7dee0
WRITE of size 45 at 0x00016fa7e76a thread T0
    #0 0x000100c32acc in strcpy+0x4ec (librustc-nightly_rt.asan.dylib:arm64+0x4aacc)
    #1 0x000100383ee8 in c_say_hello+0x11c (sanitizers:arm64+0x100003ee8)
    #2 0x000100383328 in sanitizers::say_hello::h7a86e3249bf087ea+0x1d8 (sanitizers:arm64+0x100003328)
    #3 0x0001003815c0 in sanitizers::main::h11beac415f2c6ee0 main.rs:4
    #4 0x0001003818d8 in core::ops::function::FnOnce::call_once::h6f36f80e70ecb8a5 function.rs:250
    #5 0x000100381910 in std::sys::backtrace::__rust_begin_short_backtrace::h5a6edce2cadaf2f4 backtrace.rs:152
    #6 0x000100381488 in std::rt::lang_start::_$u7b$$u7b$closure$u7d$$u7d$::h6d09ba155578db90 rt.rs:195
    #7 0x00010039ccfc in std::rt::lang_start_internal::hc996363c321dd410+0x440 (sanitizers:arm64+0x10001ccfc)
    #8 0x0001003812c0 in std::rt::lang_start::hf8df676e77f16e31 rt.rs:194
    #9 0x0001003815ec in main+0x20 (sanitizers:arm64+0x1000015ec)
    #10 0x00019d87e0dc  ()
    #11 0xb922fffffffffffc  ()

Address 0x00016fa7e76a is located in stack of thread T0 at offset 42 in frame
    #0 0x000100383dd8 in c_say_hello+0xc (sanitizers:arm64+0x100003dd8)

  This frame has 1 object(s):
    [32, 42) 'buffer' (line 5) <== Memory access at offset 42 overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (librustc-nightly_rt.asan.dylib:arm64+0x4aacc) in strcpy+0x4ec
Shadow bytes around the buggy address:
  0x00016fa7e480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fa7e500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fa7e580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fa7e600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fa7e680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x00016fa7e700: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00[02]f3 f3
  0x00016fa7e780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fa7e800: 00 00 00 00 f1 f1 f1 f1 f8 f8 f8 f8 f2 f2 f2 f2
  0x00016fa7e880: 00 00 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fa7e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00016fa7e980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==51935==ABORTING

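Note that the error occurs in the strcpy call inside c_say_hello, that is, in a C stack frame. This shows that C code called through FFI is no longer an opaque black box: its bugs can be detected. If you compile the C code in this example normally (without AddressSanitizer; try removing the relevant line from build.rs) while still running the Rust code with AddressSanitizer, the program still reports a stack-buffer-overflow. However, the error message is less specific and attributes the problem to the Rust code rather than the C code.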
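Detection aside, the Rust wrapper could also refuse inputs that the C function cannot hold. The sketch below illustrates that idea only on the Rust side, without calling into C so that it stays self-contained. `prepare_message` and `C_BUFFER_LEN` are hypothetical names; the value 10 mirrors `char buffer[10]` in the C example and would have to be kept in sync with the C source by hand, which is an assumption of this sketch.

```rust
use std::ffi::CString;

/// Assumed to match `char buffer[10]` in the C example; must be kept in sync manually.
const C_BUFFER_LEN: usize = 10;

/// Hypothetical helper: convert `message` to a C string only if it,
/// including the trailing NUL that `strcpy` also copies, fits in the C buffer.
fn prepare_message(message: &str) -> Result<CString, String> {
    // Fails if `message` contains an interior NUL byte.
    let c_string = CString::new(message).map_err(|e| e.to_string())?;
    let needed = c_string.as_bytes_with_nul().len();
    if needed > C_BUFFER_LEN {
        return Err(format!(
            "message needs {} bytes but the C buffer holds {}",
            needed, C_BUFFER_LEN
        ));
    }
    Ok(c_string)
}

fn main() {
    // The message from the example needs 45 bytes, matching "WRITE of size 45" in the report.
    match prepare_message("This is far too long and will do bad things!") {
        Ok(_) => println!("message fits"),
        Err(e) => println!("rejected: {}", e),
    }
}
```

A wrapper such as `say_hello` could call a helper like this first and pass the pointer to `c_say_hello` only when it returns `Ok`.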


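Conclusion

This article explored three techniques for verifying unsafe Rust code, to improve safety and avoid undefined behavior that can have serious consequences, including malfunctions, security vulnerabilities, regulatory violations, financial loss, injury, and even death. It is not meant to be an exhaustive survey, but rather an account of what I explored to solve a practical problem and the tools I recommend. The three techniques are:

1. Sanitizers: runtime detection of errors in unsafe Rust code;

2. The Miri interpreter: checking unsafe Rust code for undefined behavior;

3. C and C++ sanitizers: runtime detection when calling C and C++ libraries from Rust.

Most systems programmers and application developers should not write unsafe Rust. Unsafe Rust should mainly be the domain of library developers. But if you must write unsafe code, or you call unsafe code through the libraries you depend on, test it with the sanitizers or Miri to catch these kinds of errors.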