Pitfalls of Safe Rust

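The article explores common pitfalls in Safe Rust that the compiler cannot detect, stressing that memory safety alone does not make an application robust. It lists classes of bugs that Safe Rust cannot prevent, such as type casting errors, logic bugs, `panic`s, problems in third-party crates and libraries, and race conditions. It then covers, with code examples and recommendations, how to avoid integer overflow, handle numeric conversions, avoid out-of-bounds array access, use bounded types, keep primitive types out of business logic, treat default values carefully, implement `Debug` and serialization safely, guard against TOCTOU attacks, use constant-time comparison, and limit input size.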

Last updated: 2025-04-05

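When people say Rust is a "safe language", they usually mean memory safety. And while memory safety is a great start, it's far from all it takes to build robust applications.

Memory safety is important but not sufficient for overall reliability.

In this article, I want to show you a few common gotchas in Safe Rust that the compiler doesn't detect, and how to avoid them.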

Why Rust Can’t Always Help

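Even in Safe Rust code, you still need to handle various risks and edge cases. You need to address aspects like input validation and making sure that your business logic is correct.

Here are just a few categories of bugs that Rust doesn't protect you from:

Type casting mistakes (e.g. overflows)
Logic bugs
Panics caused by `unwrap` or `expect`
Malicious or incorrect build.rs scripts in third-party crates
Incorrect unsafe code in third-party libraries
Race conditions

Let's look at ways to avoid some of the more common problems. The tips are roughly ordered by how likely you are to encounter them.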

Protect Against Integer Overflow

Overflow errors can happen pretty easily:

// DON'T: Use unchecked arithmetic
fn calculate_total(price: u32, quantity: u32) -> u32 {
  price * quantity // Could overflow!
}

If `price` and `quantity` are large enough, the result will overflow. Rust panics in debug mode, but in release mode it silently wraps around.

To avoid this, use checked arithmetic operations:

// DO: Use checked arithmetic operations
fn calculate_total(price: u32, quantity: u32) -> Result<u32, ArithmeticError> {
  price.checked_mul(quantity)
    .ok_or(ArithmeticError::Overflow)
}
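
The snippet above leaves `ArithmeticError` undefined. A minimal, self-contained sketch, assuming a hypothetical one-variant error enum, might look like this. It also contrasts the checked result with what explicit wrapping arithmetic produces:

```rust
/// Hypothetical error type; the article does not show its definition.
#[derive(Debug, PartialEq)]
enum ArithmeticError {
    Overflow,
}

fn calculate_total(price: u32, quantity: u32) -> Result<u32, ArithmeticError> {
    // `checked_mul` returns None on overflow, which we turn into an error.
    price.checked_mul(quantity).ok_or(ArithmeticError::Overflow)
}

fn main() {
    // The product fits in a u32, so we get it back.
    assert_eq!(calculate_total(20, 3), Ok(60));
    // u32::MAX * 2 does not fit, so we get an error instead of a wrapped value.
    assert_eq!(calculate_total(u32::MAX, 2), Err(ArithmeticError::Overflow));
    // For comparison: explicit wrapping arithmetic wraps around silently.
    assert_eq!(u32::MAX.wrapping_mul(2), u32::MAX - 1);
}
```

The caller is now forced to decide what an overflow means for the business logic instead of continuing with a wrong total.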

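Static checks are not removed, since they don't affect the performance of the generated code. So if the compiler can detect the problem at compile time, it will do so: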

fn main(){
  let x: u8 = 2;
  let y: u8 = 128;
  let z = x * y; // Compile-time error!
}

The error message will be:

error: this arithmetic operation will overflow
 --> src/main.rs:4:13
 |
4 |   let z = x * y; // Compile-time error!
 |       ^^^^^ attempt to compute `2_u8 * 128_u8`, which would overflow
 |
 = note: `#[deny(arithmetic_overflow)]` on by default

For all other cases, use `checked_add`, `checked_sub`, `checked_mul`, and `checked_div`, which return `None` instead of wrapping around on underflow or overflow. [1]
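
As a quick illustration of that behaviour, here is a small sketch using only the standard library; the concrete values are arbitrary and chosen just to hit each edge case:

```rust
fn main() {
    // Overflow: 250 + 10 does not fit in a u8.
    assert_eq!(250u8.checked_add(10), None);
    // Underflow: 5 - 10 is negative, which a u8 cannot represent.
    assert_eq!(5u8.checked_sub(10), None);
    // Division by zero is reported as None instead of panicking.
    assert_eq!(10u8.checked_div(0), None);
    // i8::MIN / -1 would be 128, which does not fit in an i8.
    assert_eq!(i8::MIN.checked_div(-1), None);
    // When the result fits, you get it wrapped in Some.
    assert_eq!(250u8.checked_add(5), Some(255));
}
```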

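Quick Tip: Enable Overflow Checks In Release Mode

Rust carefully balances performance and safety. In scenarios where a performance hit is acceptable, memory safety takes precedence. [1]

Integer overflows can lead to unexpected results, but they are not inherently unsafe. On top of that, overflow checking can be expensive, which is why Rust disables it in release mode. [2]

However, you can re-enable the checks if your application can trade the last 1% of performance for better overflow detection.

Put this into your Cargo.toml: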

[profile.release]
overflow-checks = true # Enable integer overflow checks in release mode

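This enables overflow checks in release mode. As a consequence, the code will panic if an overflow occurs.

See the docs for more details.

[1] One example where Rust accepts a performance cost for safety is checked array indexing, which prevents buffer overflows at runtime. Another is when the Rust maintainers fixed float casting, because the previous implementation could cause undefined behavior when casting certain floating point values to integers.
[2] According to some benchmarks, overflow checks cost a few percent of performance on typical integer-heavy workloads. See Dan Luu's analysis here.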

Avoid `as` For Numeric Conversions

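While we're on the topic of integer arithmetic, let's talk about type conversions. Casting values with `as` is convenient but risky unless you know exactly what you're doing.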

let x: i32 = 42;
let y: i8 = x as i8; // Silently truncates if x doesn't fit in an i8!

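There are three main ways to convert between numeric types in Rust:

⚠️ Using the `as` keyword: This approach works for both lossless and lossy conversions. In cases where data loss might occur (like converting from i64 to i32), it will simply truncate the value.
Using `From::from()`: This method only allows lossless conversions. For example, you can convert from i32 to i64 since all 32-bit integers fit within 64 bits. However, you cannot convert from i64 to i32 using this method since it could potentially lose data.
Using `TryFrom`: Like `From::from()`, this never loses data silently, but it also covers conversions that can fail and returns a `Result` for them. This is useful when you want to handle potential data loss gracefully.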
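
To make the difference concrete, here is a short sketch that runs the same kind of conversion through all three mechanisms. The helper names are made up for illustration:

```rust
use std::convert::TryFrom; // already in the prelude since the 2021 edition

/// Lossy: `as` keeps only the low 32 bits of the value.
fn narrow_with_as(value: i64) -> i32 {
    value as i32
}

/// Lossless: `From` only exists for conversions that cannot fail.
fn widen(value: i32) -> i64 {
    i64::from(value)
}

/// Fallible: `TryFrom` reports values that do not fit instead of truncating.
fn narrow_checked(value: i64) -> Option<i32> {
    i32::try_from(value).ok()
}

fn main() {
    let big: i64 = 4_294_967_296 + 7; // 2^32 + 7, too large for an i32

    assert_eq!(narrow_with_as(big), 7); // high bits silently dropped
    assert_eq!(widen(42), 42);
    assert_eq!(narrow_checked(big), None); // failure is visible
    assert_eq!(narrow_checked(42), Some(42));
}
```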

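Quick Tip: Safe Numeric Conversions

If in doubt, prefer `From::from()` and `TryFrom` over `as`.

Use `From::from()` when you can guarantee no data loss.
Use `TryFrom` when you need to handle potential data loss gracefully.
Only use `as` when you're comfortable with potential truncation, or you know the values will fit within the target type's range and performance is absolutely critical.

(Adapted from a StackOverflow answer by delnan and additional context.)

The `as` operator is not safe for narrowing conversions. It silently truncates the value, leading to unexpected results.

What is a narrowing conversion? It's when you convert a larger type to a smaller type, e.g. i32 to i8.

For example, see how `as` chops off the high bits of our value: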

fn main(){
  let a: u16 = 0x1234;
  let b: u8 = a as u8;
  println!("0x{:04x}, 0x{:02x}", a, b); // 0x1234, 0x34
}

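So, coming back to our first example above, instead of writing

let x: i32 = 42;
let y: i8 = x as i8; // Silently truncates if x doesn't fit in an i8!

use `TryFrom` instead and handle the error gracefully:

let y = i8::try_from(x).map_err(|_| "Number is too big to be used here")?;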

Use Bounded Types for Numeric Values

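Bounded types make it easier to express invariants and avoid invalid states.

For example, if you have a numeric type and 0 is never a correct value, use std::num::NonZeroUsize instead.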
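
As a small sketch of what that buys you, consider a hypothetical helper (not from the article) that divides work among a number of workers. Because the parameter type rules out 0, the division can never fail:

```rust
use std::num::NonZeroUsize;

/// Hypothetical helper: how many items each worker gets.
/// Division by zero is impossible because `workers` can never be 0.
fn items_per_worker(total: usize, workers: NonZeroUsize) -> usize {
    total / workers.get()
}

fn main() {
    // `new` returns None for 0, so the invalid value is rejected at the boundary.
    assert!(NonZeroUsize::new(0).is_none());

    let workers = NonZeroUsize::new(4).expect("4 is non-zero");
    assert_eq!(items_per_worker(20, workers), 5);
}
```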

You can also create your own bounded types:

// DON'T: Use raw numeric types for domain values
struct Measurement {
  distance: f64, // Could be negative!
}
// DO: Create bounded types
#[derive(Debug, Clone, Copy)]
struct Distance(f64);
impl Distance {
  pub fn new(value: f64) -> Result<Self, DistanceError> {
    if value < 0.0 || !value.is_finite() {
      return Err(DistanceError::Invalid);
    }
    Ok(Distance(value))
  }
}
struct Measurement {
  distance: Distance,
}

(Rust Playground)
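
For reference, a self-contained version of the bounded type might look as follows. `DistanceError` and the `value` accessor are assumptions added here so that the sketch compiles on its own:

```rust
/// Hypothetical error type; not shown in the article.
#[derive(Debug, PartialEq)]
enum DistanceError {
    Invalid,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Distance(f64);

impl Distance {
    pub fn new(value: f64) -> Result<Self, DistanceError> {
        // Rejects negative numbers as well as NaN and infinities.
        if value < 0.0 || !value.is_finite() {
            return Err(DistanceError::Invalid);
        }
        Ok(Distance(value))
    }

    /// Assumed accessor for reading the validated value.
    pub fn value(self) -> f64 {
        self.0
    }
}

fn main() {
    assert_eq!(Distance::new(-1.0), Err(DistanceError::Invalid));
    assert_eq!(Distance::new(f64::NAN), Err(DistanceError::Invalid));
    assert_eq!(Distance::new(f64::INFINITY), Err(DistanceError::Invalid));
    assert_eq!(Distance::new(3.5).map(Distance::value), Ok(3.5));
}
```

Because the inner field is private, `Distance::new` is the only way to obtain a value from outside the module, so every `Distance` in the program has passed the check.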

Don’t Index Into Arrays Without Bounds Checking

Whenever I see the following, I get goosebumps 😨:

let arr = [1, 2, 3];
let elem = arr[3]; // Panic!

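This is a common source of bugs. Unlike C, Rust does check array bounds and thereby prevents a security vulnerability, but it still panics at runtime.

Instead, use the `get` method: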

let elem = arr.get(3);

It returns an `Option`, which you can now handle gracefully.
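
One possible way to handle that `Option` is sketched below. The `element_or` helper and its fallback parameter are illustrative assumptions, not part of the article:

```rust
/// Hypothetical helper: the element at `index`, or `fallback` if out of bounds.
fn element_or(values: &[i32], index: usize, fallback: i32) -> i32 {
    match values.get(index) {
        Some(value) => *value,
        None => fallback,
    }
}

fn main() {
    let arr = [1, 2, 3];

    // In bounds: we get a reference to the element.
    assert_eq!(arr.get(1), Some(&2));
    // Out of bounds: no panic, just None.
    assert_eq!(arr.get(3), None);

    assert_eq!(element_or(&arr, 1, 0), 2);
    assert_eq!(element_or(&arr, 3, 0), 0);
}
```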

See this blog post for more info on the topic.

Use `split_at_checked` Instead Of `split_at`

This issue is related to the previous one. Say you have a slice and you want to split it at a certain index.

let mid = 4;
let arr = [1, 2, 3];
let (left, right) = arr.split_at(mid);

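You might expect this to return a tuple of slices where the first slice contains all elements and the second slice is empty.

But the above code will panic because the `mid` index is out of bounds!

To handle that more gracefully, use `split_at_checked` instead: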

let arr = [1, 2, 3];
// This returns an Option
match arr.split_at_checked(mid) {
  Some((left, right)) => {
    // Do something with left and right
  }
  None => {
    // Handle the error
  }
}

This returns an `Option`, which allows you to handle the error case. (Rust Playground)

More info about `split_at_checked` here.

Avoid Primitive Types For Business Logic

It's very tempting to use primitive types for everything. Rust beginners in particular fall into this trap.

// DON'T: Use primitive types for usernames
fn authenticate_user(username: String){
  // Raw String could be anything - empty, too long, or contain invalid characters
}

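However, do you really accept any string as a valid username? What if it's empty? What if it contains emojis or special characters?

You can create a custom type for your domain instead: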

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Username(String);
impl Username {
  pub fn new(name: &str) -> Result<Self, UsernameError> {
    // Check for empty username
    if name.is_empty() {
      return Err(UsernameError::Empty);
    }
    // Check length (for example, max 30 characters; `len()` would count bytes)
    if name.chars().count() > 30 {
      return Err(UsernameError::TooLong);
    }
    // Only allow alphanumeric characters and underscores
    if !name.chars().all(|c| c.is_alphanumeric() || c == '_') {
      return Err(UsernameError::InvalidCharacters);
    }
    Ok(Username(name.to_string()))
  }
  /// Returns a reference to the inner string
  pub fn as_str(&self) -> &str {
    &self.0
  }
}
fn authenticate_user(username: Username){
  // We know this is always a valid username!
  // No empty strings, no emojis, no spaces, etc.
}

(Rust playground)

Make Invalid States Unrepresentable

The next point is closely related to the previous one.

Can you spot the bug in the following code?

// DON'T: Allow invalid combinations
struct Configuration {
  port: u16,
  host: String,
  ssl: bool,
  ssl_cert: Option<String>,
}

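The problem is that you can have `ssl` set to `true` but `ssl_cert` set to `None`. That's an invalid state! If you try to use the SSL connection, you can't, because there's no certificate. This issue can be caught at compile time by using types to enforce valid states: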

// First, let's define the possible states for the connection
enum ConnectionSecurity {
  Insecure,
  // We can't have an SSL connection
  // without a certificate!
  Ssl { cert_path: String },
}
struct Configuration {
  port: u16,
  host: String,
  // Now we can't have an invalid state!
  // Either we have an SSL connection with a certificate
  // or we don't have SSL at all.
  security: ConnectionSecurity,
}

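In contrast to the previous section, the bug here was caused by an invalid combination of closely related fields. To prevent that, clearly map out all possible states and the transitions between them. A simple way is to define an enum with optional metadata for each state.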
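
A sketch of how consuming code benefits from the enum: the `describe` helper below is hypothetical, but it shows that the certificate path is only reachable in the `Ssl` arm, and that the compiler requires every state to be handled.

```rust
enum ConnectionSecurity {
    Insecure,
    Ssl { cert_path: String },
}

struct Configuration {
    port: u16,
    host: String,
    security: ConnectionSecurity,
}

/// Hypothetical helper that builds a human-readable connection summary.
fn describe(config: &Configuration) -> String {
    // The match must be exhaustive: adding a new state later becomes a
    // compile error here until it is handled.
    match &config.security {
        ConnectionSecurity::Insecure => {
            format!("http://{}:{}", config.host, config.port)
        }
        ConnectionSecurity::Ssl { cert_path } => {
            format!("https://{}:{} (cert: {})", config.host, config.port, cert_path)
        }
    }
}

fn main() {
    let secure = Configuration {
        port: 443,
        host: "example.com".to_string(),
        security: ConnectionSecurity::Ssl {
            cert_path: "/etc/ssl/cert.pem".to_string(),
        },
    };
    let plain = Configuration {
        port: 80,
        host: "example.com".to_string(),
        security: ConnectionSecurity::Insecure,
    };
    println!("{}", describe(&secure));
    println!("{}", describe(&plain));
}
```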

If you're curious to learn more, see the blog post on the topic.

Handle Default Values Carefully

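It's quite common to add a blanket `Default` implementation to your types. But that can lead to unforeseen issues.

For example, here's a case where the port is set to 0 by default, which is not a valid port number. [2]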

// DON'T: Implement `Default` without consideration
#[derive(Default)] // Might create invalid states!
struct ServerConfig {
  port: u16,   // Will be 0, which isn't a valid port!
  max_connections: usize,
  timeout_seconds: u64,
}

Instead, consider whether a default value makes sense for your type.

// DO: Make Default meaningful or don't implement it
struct ServerConfig {
  port: Port,
  max_connections: NonZeroUsize,
  timeout_seconds: Duration,
}
impl ServerConfig {
  pub fn new(port: Port) -> Self {
    Self {
      port,
      max_connections: NonZeroUsize::new(100).unwrap(),
      timeout_seconds: Duration::from_secs(30),
    }
  }
}
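
The `Port` type used above is not defined in the article. One plausible shape, assumed here purely for illustration, is a newtype around `NonZeroU16`, which makes the "port 0" state impossible to construct:

```rust
use std::num::{NonZeroU16, NonZeroUsize};
use std::time::Duration;

/// Hypothetical `Port` newtype (not defined in the article): 0 is rejected.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Port(NonZeroU16);

impl Port {
    fn new(value: u16) -> Option<Self> {
        NonZeroU16::new(value).map(Port)
    }

    fn get(self) -> u16 {
        self.0.get()
    }
}

struct ServerConfig {
    port: Port,
    max_connections: NonZeroUsize,
    timeout_seconds: Duration,
}

impl ServerConfig {
    /// The port has no sensible default, so the caller must supply it.
    fn new(port: Port) -> Self {
        Self {
            port,
            max_connections: NonZeroUsize::new(100).expect("100 is non-zero"),
            timeout_seconds: Duration::from_secs(30),
        }
    }
}

fn main() {
    // The invalid port is rejected before a config can even be built.
    assert!(Port::new(0).is_none());

    let port = Port::new(8080).expect("8080 is non-zero");
    let config = ServerConfig::new(port);
    assert_eq!(config.port.get(), 8080);
    assert_eq!(config.max_connections.get(), 100);
    assert_eq!(config.timeout_seconds, Duration::from_secs(30));
}
```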

Implement `Debug` Safely

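If you blindly derive `Debug` for your types, you might expose sensitive data. Instead, implement `Debug` manually for types that contain sensitive information.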

// DON'T: Expose sensitive data in debug output
#[derive(Debug)]
struct User {
  username: String,
  password: String, // Will be printed in debug output!
}

Instead, you could write:

// DO: Implement Debug manually
#[derive(Debug)]
struct User {
  username: String,
  password: Password,
}
struct Password(String);
impl std::fmt::Debug for Password {
  fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
    f.write_str("[REDACTED]")
  }
}
fn main(){
  let user = User {
    username: String::from(""),
    password: Password(String::from("")),
  };
  println!("{user:#?}");
}

This prints

User {
  username: "",
  password: [REDACTED],
}

(Rust playground)

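For production code, use a crate like secrecy.

However, it's not black and white either: if you implement `Debug` manually, you might forget to update the implementation when your struct changes. A common pattern is to destructure the struct in the `Debug` implementation to catch such errors.

Instead of this: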

// don't
impl std::fmt::Debug for DatabaseURI {
  fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
    write!(f, "{}://{}:[REDACTED]@{}/{}", self.scheme, self.user, self.host, self.database)
  }
}

How about destructuring the struct to catch changes?

// do
impl std::fmt::Debug for DatabaseURI {
  fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
    // Destructure the struct to catch changes.
    // This way, the code stops compiling if you add a new field
    // and forget to update the Debug implementation
    let DatabaseURI { scheme, user, password: _, host, database, } = self;
    write!(f, "{scheme}://{user}:[REDACTED]@{host}/{database}")?;
    // -- or --
    // f.debug_struct("DatabaseURI")
    //   .field("scheme", scheme)
    //   .field("user", user)
    //   .field("password", &"***")
    //   .field("host", host)
    //   .field("database", database)
    //   .finish()
    Ok(())
  }
}

(Rust playground)

Thanks to Wesley Moore (wezm) for the hint and to Simon Brüggen (m3t0r) for the example.

Careful With Serialization

Don't blindly derive `Serialize` and `Deserialize`, especially for sensitive data. The values you read or write might not be what you expect!

// DON'T: Blindly derive Serialize and Deserialize
#[derive(Serialize, Deserialize)]
struct UserCredentials {
  #[serde(default)] // ⚠️ Accepts empty strings when deserializing!
  username: String,
  #[serde(default)]
  password: String, // ⚠️ Leaks the password when serialized!
}

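When deserializing, the fields might be empty. Empty credentials could pass validation checks if they are not handled properly.

On top of that, the serialization behavior could also leak sensitive data. By default, `Serialize` includes the password field in the serialized output, which could expose sensitive credentials in logs, API responses, or debug output.

A common fix is to implement deserialization yourself with impl<'de> Deserialize<'de> for UserCredentials (and, if the type is ever written out, a matching custom `Serialize`).

The advantage is that you have full control over input validation. The disadvantage is that you need to implement all the logic yourself.

An alternative strategy is to use the #[serde(try_from = "FromType")] attribute.

Let's take the `Password` field as an example. Start by using the newtype pattern to wrap the standard type and add custom validation: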

#[derive(Deserialize)]
// Tell serde to call `Password::try_from` with a `String`
#[serde(try_from = "String")]
pub struct Password(String);

Now implement `TryFrom` for `Password`:

impl TryFrom<String> for Password {
  type Error = PasswordError;
  /// Create a new password
  ///
  /// Returns an error if the password is too short.
  /// You can add more checks here.
  fn try_from(value: String) -> Result<Self, Self::Error> {
    // Validate the password
    if value.len() < 8 {
      return Err(PasswordError::TooShort);
    }
    Ok(Password(value))
  }
}

With this trick, you can no longer deserialize invalid passwords:

// Panic: password too short!
let password: Password = serde_json::from_str(r#""pass""#).unwrap();

(Try it on the Rust Playground)
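
The validation itself does not depend on serde, so it can also be exercised with the standard library alone. In the sketch below, `PasswordError` and the `len` accessor are assumptions, since the article does not show them:

```rust
use std::convert::TryFrom; // already in the prelude since the 2021 edition

/// Hypothetical error type; the article does not show its definition.
#[derive(Debug, PartialEq)]
enum PasswordError {
    TooShort,
}

struct Password(String);

impl Password {
    /// Assumed accessor so callers never need the raw string.
    fn len(&self) -> usize {
        self.0.len()
    }
}

impl TryFrom<String> for Password {
    type Error = PasswordError;

    fn try_from(value: String) -> Result<Self, Self::Error> {
        // Same rule as above: at least 8 bytes.
        if value.len() < 8 {
            return Err(PasswordError::TooShort);
        }
        Ok(Password(value))
    }
}

fn main() {
    // Too short: the constructor refuses to build a Password.
    let rejected = Password::try_from(String::from("pass"));
    assert_eq!(rejected.err(), Some(PasswordError::TooShort));

    // Long enough: we get a validated Password back.
    let accepted = Password::try_from(String::from("correct horse"));
    assert_eq!(accepted.map(|p| p.len()).ok(), Some(13));
}
```

With `#[serde(try_from = "String")]` in place, serde routes every deserialized value through this same `try_from`, so there is a single place where the rule lives.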

Credits go to EqualMa's article on dev.to and to Alex Burka (durka) for the hint.

Protect Against Time-of-Check to Time-of-Use (TOCTOU)

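This is a more advanced topic, but it's important to be aware of it. TOCTOU (time-of-check to time-of-use) is a class of software bugs caused by changes that happen between the moment you check a condition and the moment you use a resource.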

// DON'T: Vulnerable approach with separate check and use
fn remove_dir(path: &Path) -> io::Result<()> {
  // First check if it's a directory
  if !path.is_dir() {
    return Err(io::Error::new(
      io::ErrorKind::NotADirectory,
      "not a directory"
    ));
  }
  // TOCTOU vulnerability: Between the check above and the use below,
  // the path could be replaced with a symlink to a directory we shouldn't access!
  remove_dir_impl(path)
}

(Rust playground)

A safer approach opens the directory first, ensuring that we operate on what we checked:

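// DO: Safer approach that opens first, then checks
// (Unix only: `custom_flags` needs `std::os::unix::fs::OpenOptionsExt` in scope;
// `O_NOFOLLOW` and `O_DIRECTORY` are assumed to come from the `libc` crate)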
fn remove_dir(path: &Path) -> io::Result<()> {
  // Open the directory WITHOUT following symlinks
  let handle = OpenOptions::new()
    .read(true)
    .custom_flags(O_NOFOLLOW | O_DIRECTORY) // Fails if not a directory or is a symlink
    .open(path)?;
  // Now we can safely remove the directory contents using the open handle
  remove_dir_impl(&handle)
}

(Rust playground)

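Here's why it's safer: while we hold the handle, the directory can't be swapped out from under us. Even if the path is replaced with a symlink, the handle still refers to the directory we opened, so the directory we're working with is the same one we checked.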
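
The same "act first, then inspect the result" idea applies to other file system operations. As a portable sketch using only the standard library, the hypothetical helper below creates a file only if it doesn't exist yet. Instead of calling `path.exists()` and then creating the file, which would leave a gap between check and use, it relies on `create_new(true)`, which performs both steps atomically:

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: writes `contents` to `path`, but only if the file
/// does not exist yet. There is no separate existence check to race against.
fn create_exclusive(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true) // fails with AlreadyExists if the file is present
        .open(path)?;
    file.write_all(contents)
}

fn main() -> io::Result<()> {
    // Demo file name is arbitrary; the process id keeps parallel runs apart.
    let name = format!("toctou-demo-{}.txt", std::process::id());
    let path = std::env::temp_dir().join(name);
    let _ = std::fs::remove_file(&path); // start from a clean slate

    create_exclusive(&path, b"first")?;

    // The second attempt is rejected by the OS instead of overwriting.
    let second = create_exclusive(&path, b"second");
    assert_eq!(second.unwrap_err().kind(), io::ErrorKind::AlreadyExists);

    std::fs::remove_file(&path)
}
```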

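You'd be forgiven if you overlooked this issue before. In fact, even the Rust core team missed it in the standard library. What you saw is a simplified version of an actual bug in the std::fs::remove_dir_all function. Read more about it in this blog post about CVE-2022-21658.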

Use Constant-Time Comparison for Sensitive Data

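Timing attacks are a nifty way to extract information from your application. The idea is that the time it takes to compare two values can leak information about them. For example, the time it takes to compare two strings can reveal how many characters are correct. Therefore, for production code, be careful with regular equality checks when handling sensitive data like passwords.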

// DON'T: Use regular equality for sensitive comparisons
fn verify_password(stored: &[u8], provided: &[u8]) -> bool {
  stored == provided // Vulnerable to timing attacks!
}
// DO: Use constant-time comparison
use subtle::ConstantTimeEq;
fn verify_password(stored: &[u8], provided: &[u8]) -> bool {
  stored.ct_eq(provided).unwrap_u8() == 1
}
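
To show the idea behind a constant-time comparison without an external crate, here is an illustrative sketch. It is not how `subtle` is implemented, and the optimizer is free to transform it, so treat it as an explanation of the technique and not as production code:

```rust
/// Illustrative only: compares two byte slices without returning early
/// on the first mismatching byte. Prefer a vetted crate in production.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // The length is not treated as a secret in this sketch.
    if a.len() != b.len() {
        return false;
    }
    // Accumulate all differences; every byte pair is always visited.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

fn main() {
    assert!(constant_time_eq(b"hunter2", b"hunter2"));
    // A mismatch in the first byte takes the same path as one in the last.
    assert!(!constant_time_eq(b"hunter2", b"Xunter2"));
    assert!(!constant_time_eq(b"hunter2", b"hunterX"));
    assert!(!constant_time_eq(b"hunter2", b"hunter22"));
}
```

A regular `==` on slices may stop at the first differing byte, which is exactly the data-dependent timing that an attacker measures.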

Don’t Accept Unbounded Input

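Protect against denial-of-service attacks with resource limits. These attacks become possible when you accept unbounded input, e.g. a huge request body that might not fit into memory.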

// DON'T: Accept unbounded input
fn process_request(data: &[u8]) -> Result<(), Error> {
  let decoded = decode_data(data)?; // Could be enormous!
  // Process decoded data
  Ok(())
}

Instead, set explicit limits for the payloads you accept:

const MAX_REQUEST_SIZE: usize = 1024 * 1024; // 1MiB
fn process_request(data: &[u8]) -> Result<(), Error> {
  if dat