Testing Without Mocks: A Pattern Language (2023)

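This article presents a pattern language for testing without mocks, intended to address the shortcomings of traditional approaches such as broad tests and interaction-based tests. The core idea is to use a technique called "Nullables," combined with narrow, state-based, and sociable tests, to get unit tests that are fast, reliable, and easy to refactor without depending on a mocking framework. The article covers foundational, architectural, logic, infrastructure, nullability, and legacy-code patterns. It highlights benefits such as faster tests, simple test setup, and high reusability, and it also names the tradeoffs: production code has to be modified, stub code has to be written by hand, and a single bug can cause multiple tests to fail.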

February 16, 2023

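Automated tests matter. Without them, programmers waste a huge amount of time manually checking and fixing their code.

Unfortunately, many automated tests waste a huge amount of time too. The easy, obvious way to write tests is to write broad tests, which are automated versions of manual tests. But they are flaky and slow.

People in the know use mocks and spies (I'll just say "mocks" in this article) to write isolated, interaction-based tests. Their tests are reliable and fast, but they tend to "lock in" the implementation, which makes refactoring difficult, and they have to be supplemented with broad tests. It is also easy to end up with low-quality tests that are hard to read or that only test themselves.

Bad tests are a sign of bad design, so some people use techniques such as Hexagonal Architecture and functional core, imperative shell to separate logic from infrastructure. (Infrastructure is code that involves external systems or state.) That fixes the problem for logic, but the infrastructure often goes untested, and the approach requires architectural changes that are out of reach for people with existing code.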

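This pattern language¹ describes a fourth option. It avoids all of the problems above: it doesn't use broad tests, doesn't use mocks, doesn't ignore infrastructure, and doesn't require architectural changes. It has the speed, reliability, and maintainability of unit tests and the power of broad tests. But it comes with tradeoffs of its own.

¹ The structure of this article was inspired by Ward Cunningham's CHECKS Pattern Language of Information Integrity, which is a model of clarity and usefulness.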

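The patterns combine sociable, state-based tests with a novel infrastructure technique called "Nullables." At first glance, Nullables look like test doubles, but they are actually production code with an "off" switch. That is the tradeoff: do you want that in your production code? Your answer determines whether this pattern language is for you.

The rest of the article goes into detail. Don't be intimidated by its size. It is broken into small pieces and comes with plenty of code examples.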

Additional Resources

For more resources related to Nullables and the "Testing Without Mocks" patterns, including screencasts, self-guided training, and more, see the Nullables Hub.

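Examples

Here is an example of testing a simple command-line application. The application reads a string from the command line, encodes it with ROT-13, and outputs the result.

The production code uses the optional A-Frame Architecture pattern. App is the application's entry point. It depends on Rot13, a logic class, and CommandLine, an infrastructure class. Comments in the source code name the other patterns in use.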

// Example production code (JavaScript + Node.js)
import CommandLine from "./infrastructure/command_line";  // Infrastructure Wrapper
import * as rot13 from "./logic/rot13";
export default class App {
  constructor(commandLine = CommandLine.create()) {  // Parameterless Instantiation
    this._commandLine = commandLine;
  }
  run() {
    const args = this._commandLine.args();
    if (args.length === 0) {  // Tested by Test #2
      this._commandLine.writeOutput("Usage: run text_to_transform\n");
      return;
    }
    if (args.length !== 1) {  // Tested by Test #3
      this._commandLine.writeOutput("too many arguments\n");
      return;
    }
    // Tested by Test #1
    const input = args[0];  // Logic Sandwich
    const output = rot13.transform(input);
    this._commandLine.writeOutput(output + "\n");
  }
}
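The listing above calls `rot13.transform()`, but the logic module itself is not shown here. As a rough idea of what such a pure-logic function might look like, here is a minimal sketch. The function body is an assumption rather than the article's actual implementation, and the `export` keyword is left off so the snippet runs on its own.

```javascript
// Hypothetical sketch of the transform() function in logic/rot13.js.
// It rotates ASCII letters by 13 places and leaves every other character untouched.
function transform(input) {
  return input.replace(/[A-Za-z]/g, (letter) => {
    const base = letter <= "Z" ? 65 : 97;  // character code of "A" or "a"
    const rotated = (letter.charCodeAt(0) - base + 13) % 26;
    return String.fromCharCode(rotated + base);
  });
}

console.log(transform("my input"));  // prints "zl vachg"
```

Because the function has no dependencies and no state, it can be tested directly by passing in a string and checking the return value.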

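The tests for App look like end-to-end integration tests, but they are actually unit tests. Technically, they are Narrow, Sociable tests, which means they are unit tests that execute code in the unit's dependencies.

As Narrow tests, they only care about testing App.run(). Each dependency is expected to have its own tests, and each one does.

The tests use a Nullable CommandLine to throw away stdout and Configurable Responses to supply pre-configured command-line arguments. They also use Output Tracking to see what would have been written to stdout.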

// Example tests (JavaScript + Node.js)
import assert from "assert";
import CommandLine from "./infrastructure/command_line";
import App from "./app";
describe("App", () => {
  // Test #1
  it("reads command-line argument, transforms it with ROT-13, and writes result", () => {
    const { output } = run({ args: [ "my input" ] });  // Signature Shielding, Configurable Responses
    assert.deepEqual(output.data, [ "zl vachg\n" ]);  // Output Tracking
  });
  // Test #2
  it("writes usage when no argument provided", () => {
    const { output } = run({ args: [] });  // Signature Shielding, Configurable Responses
    assert.deepEqual(output.data, [ "Usage: run text_to_transform\n" ]);  // Output Tracking
  });
  // Test #3
  it("complains when too many command-line arguments provided", () => {
    const { output } = run({ args: [ "a", "b" ] });  // Signature Shielding, Configurable Responses
    assert.deepEqual(output.data, [ "too many arguments\n" ]);  // Output Tracking
  });
  function run({ args = [] } = {}) {  // Signature Shielding
    const commandLine = CommandLine.createNull({ args });  // Nullable, Infrastructure Wrapper, Configurable Responses
    const output = commandLine.trackOutput();  // Output Tracking
    const app = new App(commandLine);
    app.run();
    return { output };  // Signature Shielding
  }
});

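If you are familiar with mocks, you might assume CommandLine is a test double. But it is actually production code with an "off" switch and the ability to monitor its output.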

// Example Nullable infrastructure wrapper (JavaScript + Node.js)
import EventEmitter from "node:events";
import OutputTracker from "output_tracker";
const OUTPUT_EVENT = "output";
export default class CommandLine {
  static create() {
    return new CommandLine(process);  // 'process' is a Node.js global
  }
  static createNull({ args = [] } = {}) {  // Parameterless Instantiation, Configurable Responses
    return new CommandLine(new StubbedProcess(args));  // Embedded Stub
  }
  constructor(proc) {
    this._process = proc;
    this._emitter = new EventEmitter();  // Output Tracking
  }
  args() {
    return this._process.argv.slice(2);
  }
  writeOutput(text) {
    this._process.stdout.write(text);
    this._emitter.emit(OUTPUT_EVENT, text);  // Output Tracking
  }
  trackOutput() {  // Output Tracking
    return OutputTracker.create(this._emitter, OUTPUT_EVENT);
  }
}
// Embedded Stub
class StubbedProcess {
  constructor(args) {
    this._args = args;  // Configurable Responses
  }
  get argv() {
    return [ "nulled_process_node", "nulled_process_script.js", ...this._args ];
  }
  get stdout() {
    return {
      write() {}
    };
  }
}
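The wrapper above imports an `OutputTracker` that is not shown in this section. The following is a minimal sketch of how such a tracker could work, assuming it only needs the emitter's `on` and `off` methods. The class body, the `stop()` method, and the `TinyEmitter` stand-in (used here in place of Node's `EventEmitter` so the snippet needs no imports) are all assumptions made for illustration, not the article's actual code.

```javascript
// Hypothetical sketch of an OutputTracker: it records every payload emitted for one event.
class OutputTracker {
  static create(emitter, event) {
    return new OutputTracker(emitter, event);
  }
  constructor(emitter, event) {
    this._emitter = emitter;
    this._event = event;
    this._data = [];
    this._listener = (payload) => this._data.push(payload);
    this._emitter.on(this._event, this._listener);
  }
  // Everything recorded since tracking started.
  get data() {
    return this._data;
  }
  // Stop recording; data recorded so far is kept.
  stop() {
    this._emitter.off(this._event, this._listener);
  }
}

// Stand-in for Node's EventEmitter (assumption: only on/off/emit are needed).
class TinyEmitter {
  constructor() {
    this._listeners = new Map();
  }
  on(event, fn) {
    this._listeners.set(event, [ ...(this._listeners.get(event) ?? []), fn ]);
  }
  off(event, fn) {
    this._listeners.set(event, (this._listeners.get(event) ?? []).filter((f) => f !== fn));
  }
  emit(event, payload) {
    (this._listeners.get(event) ?? []).forEach((fn) => fn(payload));
  }
}

const emitter = new TinyEmitter();
const output = OutputTracker.create(emitter, "output");
emitter.emit("output", "zl vachg\n");
output.stop();
emitter.emit("output", "not recorded\n");
console.log(output.data);  // prints [ 'zl vachg\n' ]
```

The point of the design is that the tracker observes what the production code emits, so a test can assert on `output.data` without replacing `writeOutput()` with a spy.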

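These patterns shine in more complex code with multiple layers of dependencies. Find more examples here:

Simple example. Complete source code for the example above. (JavaScript or TypeScript with Node.js)
Complex example. A fully decked-out version of the example above: a web application and microservice that perform ROT-13 encoding. Production-grade code with error handling, logging, timeouts, and request cancellation. (JavaScript or TypeScript with Node.js)
TDD Lunch & Learn Screencast. A series of one-hour webinars demonstrating how to use these patterns. (JavaScript with Node.js)
Nullables Livestream. A series of three-hour livestreams hosted by James Shore and Ted M. Young, in which they apply these patterns to an existing web application. (Java with Spring Boot)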

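Goals

This pattern language was created to satisfy the following goals:

No broad tests required. The test suite consists entirely of "narrow" tests that focus on specific concepts. Broad integration tests can be added as a safety net, but their failure indicates a gap in the main test suite.
Easy refactoring. Object interactions are considered implementation to be encapsulated, not behavior to be tested. Although the consequences of object interactions are tested, the specific method calls are not. This allows structural refactorings to be made without breaking tests.
Readable tests. Tests follow a straightforward "arrange, act, assert" structure. They describe the externally visible behavior of the unit under test, not its implementation. They can act as documentation for the unit under test.
No magic. Tools that automatically remove busywork, such as dependency-injection frameworks and auto-mocking frameworks, are not required.
Fast and deterministic. The test suite only executes "slow" code, such as network calls or file system requests, when that behavior is explicitly part of the unit under test. Such tests are organized so that they produce the same results on every test run.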

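Experience has shown that the patterns also have these additional benefits:

Faster than mock frameworks. In a head-to-head comparison, tests using these patterns were two to three orders of magnitude faster than tests using a mocking framework. (Comparison code here.)
Simple test setup. Test setup is straightforward and easy to encapsulate in helper methods.
High reusability. The most complicated code these patterns require is also the most generic and reusable.
In-memory infrastructure tests. High-level infrastructure wrappers, such as a client for a specific web service, can be tested without network calls or complicated setup. (Example test.)
Edge-case support. Complicated edge cases, such as error conditions and timeouts, are easy to test. (Example test.)
Legacy code compatibility. The patterns are fully compatible with mocks and other test doubles and can even be used together with them in the same test. Legacy code can be converted incrementally without affecting existing code.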

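Tradeoffs

Nothing is perfect. These are the downsides of using this pattern language:

Changes to production code. The patterns require you to modify production code, particularly infrastructure classes. Although the modifications can be used in production and do have production use cases, many of the changes are only used by tests.
Hand-written stub code. Some third-party infrastructure code has to be mimicked with hand-written stub code. It can't be auto-generated and takes extra time to write. The result, however, is highly reusable.
Multiple test failures. Although tests are written to focus on specific concepts, the unit under test executes code in its dependencies. (Jay Fields coined the term "sociable tests" for this behavior.) As a result, introducing a bug can cause several tests to fail at once.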

Foundational Patterns

Start here. These patterns establish the ground rules.

Narrow Tests

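Broad tests, such as end-to-end tests, tend to be slow and brittle. They are complicated to write and read, often fail randomly, and take a long time to run. Therefore:

Instead of using broad tests, use Narrow Tests. Narrow Tests check a specific function or behavior, not the system as a whole. Unit tests are a common type of Narrow Test.

When testing infrastructure, use Narrow Integration Tests. When testing pure logic, use the Logic Patterns. When testing code that has infrastructure dependencies, use Nullables.

To make sure your code works as a whole, use State-Based Tests and Overlapping Sociable Tests.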

State-Based Tests

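Mocks and spies result in "interaction-based" tests that check how the code under test uses its dependencies. However, such tests can be hard to read, and they tend to "lock in" your dependencies, which makes structural refactoring difficult. Therefore:

Use State-Based Tests instead of interaction-based tests. A State-Based Test checks the output or state of the code under test without any awareness of its implementation. For example, given the following production code:

// Production code for describing the phase of the moon (JavaScript)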
import * as moon from "astronomy";
import { format } from "date_formatter";
export function describeMoonPhase(date) {
 const visibility = moon.getPercentOccluded(date);
 const phase = moon.describePhase(visibility);
 const formattedDate = format(date);
 return `The moon is ${phase} on ${formattedDate}.`;
}

A State-Based Test passes in a date and checks the result, like this:

// State-Based test of describeMoonPhase() (JavaScript)
import assert from "assert";
import { describeMoonPhase } from "describe_phase";
it("describes phase of moon", () => {
  const dateOfFullMoon = new Date("8 Dec 2022");  // a date when the moon was actually full
  const description = describeMoonPhase(dateOfFullMoon);
  assert.equal(description, "The moon is full on December 8th, 2022.");
});

In contrast, an interaction-based test checks how each dependency is used, like this:

// Interaction-based test of describeMoonPhase() (JavaScript with a fictional mocking framework)
const moon = mocker.mockImport("astronomy");
const { format } = mocker.mockImport("date_formatter");
const { describeMoonPhase } = mocker.importWithMocks("describe_phase");
it("describes phase of moon", () => {
  const date = new Date();  // the specific date doesn't matter
  mocker.expect(moon.getPercentOccluded).toBeCalledWith(date).thenReturn(999);
  mocker.expect(moon.describePhase).toBeCalledWith(999).thenReturn("PHASE");
  mocker.expect(format).toBeCalledWith(date).thenReturn("DATE");
  const description = describeMoonPhase(date);
  mocker.verify();
  assert.equal(description, "The moon is PHASE on DATE.");
});

State-Based Tests naturally lead to Overlapping Sociable Tests. To use State-Based Tests on code with infrastructure dependencies, use the Nullability Patterns.
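To make the refactoring argument concrete, here is a small sketch under stated assumptions. The `moon` object and `format` function are hypothetical stand-ins for the `astronomy` and `date_formatter` modules (their real behavior is not shown in this article), and `describePhaseOn()` is an invented convenience method. Two versions of the function call their dependency differently but return the same string, so the same state-based check passes for both. An interaction-based test that expects specific calls to `getPercentOccluded` and `describePhase` would likely need to be rewritten for the second version.

```javascript
// Hypothetical stand-ins for the "astronomy" and "date_formatter" modules.
const moon = {
  getPercentOccluded(date) { return 0; },  // pretend the moon is never occluded
  describePhase(percentOccluded) { return percentOccluded === 0 ? "full" : "not full"; },
  // Invented convenience method used by the refactored version below.
  describePhaseOn(date) { return this.describePhase(this.getPercentOccluded(date)); },
};
const format = (date) => date.toISOString().slice(0, 10);

// Original structure: two separate calls into moon.
function describeMoonPhaseV1(date) {
  const phase = moon.describePhase(moon.getPercentOccluded(date));
  return `The moon is ${phase} on ${format(date)}.`;
}

// Refactored structure: a single call into moon, same observable result.
function describeMoonPhaseV2(date) {
  return `The moon is ${moon.describePhaseOn(date)} on ${format(date)}.`;
}

// The same state-based check, applied to both versions.
const date = new Date("2022-12-08T00:00:00Z");
const expected = "The moon is full on 2022-12-08.";
for (const describeMoonPhase of [ describeMoonPhaseV1, describeMoonPhaseV2 ]) {
  if (describeMoonPhase(date) !== expected) throw new Error("state-based check failed");
}
console.log("state-based check passes for both versions");
```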

Overlapping Sociable Tests

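Tests that use mocks and other test doubles isolate the code under test by replacing its dependencies. That requires broad tests to confirm that the system works as a whole, but we don't want to use broad tests. Therefore:

When testing the interactions between an object and its dependencies, use the code under test's real dependencies. Don't test the dependencies' behavior, but do test that the code under test uses its dependencies correctly. This happens naturally when you use State-Based Tests.

For example, the following test checks that describeMoonPhase uses its Moon and format dependencies correctly. If they don't work the way describeMoonPhase thinks they do, the test fails.

// Example of Sociable Tests (JavaScript)
// Test code
it("describes phase of moon", () => {
  const dateOfFullMoon = new Date("8 Dec 2022");
  const description = describeMoonPhase(dateOfFullMoon);
  assert.equal(description, "The moon is full on December 8th, 2022.");
});
// Production code
export function describeMoonPhase(date) {
  const visibility = moon.getPercentOccluded(date);
  const phase = moon.describePhase(visibility);
  const formattedDate = format(date);
  return `The moon is ${phase} on ${formattedDate}.`;
}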

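Write Narrow Tests that focus on the behavior of the code under test, not the behavior of its dependencies. Each dependency should have its own thorough set of Narrow Tests. For example, don't test all the phases of the moon in your describeMoonPhase() tests; test them in your Moon tests. Likewise, don't check the intricacies of date formatting in your describeMoonPhase() tests; test them in your format(date) tests.

In addition to checking how your code uses its dependencies, sociable tests protect you against future breaking changes. Each test overlaps with the tests of its dependencies and the tests of its dependents, creating a strong chain of linked tests. This gives you the coverage of broad tests without their speed and reliability problems.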

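For example, imagine the dependency chain LoginController → Auth0Client → HttpClient:

The LoginController tests check that LoginController is correct, including how it uses Auth0Client. (Auth0Client in turn runs HttpClient, but the LoginController tests don't explicitly check that.)
The Auth0Client tests check that Auth0Client is correct, including how it uses HttpClient.
The HttpClient tests check that HttpClient is correct, including Narrow Integration Tests that check how it communicates with HTTP servers.

Together, they ensure the whole chain is checked. Even if HttpClient and its tests are changed intentionally, the Auth0Client tests will fail if that change breaks Auth0Client (and the LoginController tests might fail too). Changing Auth0Client's behavior will likewise break the LoginController tests.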

In contrast, if the LoginController tests stub or mock out Auth0Client, the chain is broken. Changing Auth0Client's behavior won't break the LoginController tests, because nothing checks how LoginController uses the real Auth0Client.
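The difference between an intact chain and a broken one can be sketched in a few lines. Everything below is hypothetical and heavily simplified: the class names follow the chain in the text, but the `validateLogin()` and `greet()` methods, the return values, and the `ChangedAuth0Client` variant are assumptions invented for illustration, and HttpClient is left out to keep the sketch short.

```javascript
// Hypothetical original dependency: returns the user's email as a string.
class Auth0Client {
  validateLogin(token) {
    return token === "good" ? "user@example.com" : null;
  }
}

// The same dependency after a behavior change: it now returns an object.
class ChangedAuth0Client {
  validateLogin(token) {
    return token === "good" ? { email: "user@example.com" } : null;
  }
}

// Code under test. It still assumes validateLogin() returns a string.
class LoginController {
  constructor(auth0Client) {
    this._auth0Client = auth0Client;
  }
  greet(token) {
    const email = this._auth0Client.validateLogin(token);
    return email === null ? "Login failed" : `Welcome, ${email}`;
  }
}

// A stub that freezes the old behavior, as a mock-based test would.
const stubbedAuth0Client = { validateLogin: () => "user@example.com" };

// One state-based check, run against different dependencies.
function loginTestPasses(auth0Client) {
  return new LoginController(auth0Client).greet("good") === "Welcome, user@example.com";
}

console.log(loginTestPasses(new Auth0Client()));         // prints true
console.log(loginTestPasses(new ChangedAuth0Client()));  // prints false: the sociable test notices the change
console.log(loginTestPasses(stubbedAuth0Client));        // prints true: the stubbed test cannot see the change
```

In this sketch, only the test that runs the real dependency fails when the dependency's behavior changes, which is the link in the chain that a stub or mock removes.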

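To avoid constructing the entire dependency chain by hand, use Parameterless Instantiation together with Zero-Impact Instantiation. To isolate your tests from changes in their dependencies' behavior, use Collaborator-Based Isolation. To prevent your tests from interacting with external systems and state, use Nullables. To catch breaking changes in external systems, use Paranoic Telemetry. As a safety net, use Smoke Tests.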

Smoke Tests

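Overlapping Sociable Tests are supposed to cover your entire system. But nobody is perfect, and mistakes happen. Therefore:

Write one or two end-to-end tests that make sure your code starts up and runs a common workflow. For example, if you are building a website, check that an important page can be retrieved.

Don't rely on Smoke Tests to catch errors. Your real test suite should consist of Narrow, Sociable tests. If the Smoke Tests catch something the rest of your tests don't, fill the gap with more Narrow Tests.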

Zero-Impact Instantiation

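Overlapping Sociable Tests instantiate their dependencies, which in turn instantiate their dependencies, and so on. If instantiating this web of dependencies takes too long or causes side effects, the tests can be slow, hard to set up, or fail unpredictably. Therefore:

Don't do significant work in constructors. Don't connect to external systems, start services, or perform long calculations. For code that needs to connect to an external system or start a service, provide a connect() or start() method. For code that needs to perform a long calculation, consider [lazy initialization](https://www.jamesshore.com/v2/projects/nullables/<https:/martinfowler.c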
