
Microservices Patterns Guide: Saga, CQRS, Event Sourcing, Service Mesh, and Domain-Driven Design

22 min read · by the DevToolBox Team

The Complete Guide to Microservices Patterns: Service Design, Communication, Saga, CQRS, Event Sourcing, and Observability

Master microservices architecture patterns end to end — service decomposition (DDD bounded contexts, the Strangler Fig pattern), inter-service communication (REST, gRPC, async messaging), API gateways, Saga orchestration and choreography, CQRS, event sourcing, circuit breakers, bulkheads, distributed tracing (OpenTelemetry), service meshes, and data management strategies, with production-grade code examples.

TL;DR — Microservices Patterns in 60 Seconds
  • Decompose services along DDD bounded contexts; migrate incrementally with the Strangler Fig pattern
  • Use synchronous communication (REST/gRPC) for queries and async messaging for event-driven decoupling
  • An API gateway provides a single entry point: routing, authentication, rate limiting, protocol translation
  • Sagas manage distributed transactions — orchestration (central controller) or choreography (event-driven)
  • CQRS separates read and write models; event sourcing preserves the full history of state changes
  • Circuit breaker + bulkhead + retry with exponential backoff = the resilience trio
  • OpenTelemetry distributed tracing + a service mesh (Istio/Linkerd) deliver end-to-end observability
  • One database per service — never fall into the shared-database anti-pattern

Key Takeaways
  • Microservices are not a silver bullet — adopt them only when team size and business complexity demand it
  • Service boundaries should align with business capabilities, not technical layers
  • Embrace eventual consistency; give up the illusion of strong consistency across a distributed system
  • Observability (logs, metrics, traces) is the lifeline of microservices, not an optional extra
  • Migrate incrementally with the Strangler Fig pattern; avoid a big-bang rewrite
  • Every pattern has a cost — weigh the benefit against the complexity it introduces

1. Service Decomposition Strategies

DDD Bounded Contexts

Bounded contexts from Domain-Driven Design (DDD) are the most reliable way to draw microservice boundaries. Each bounded context encapsulates a complete business domain with its own domain model, ubiquitous language, and data store.

// E-commerce bounded contexts example
// Each context becomes a candidate microservice

// Order Context — owns order lifecycle
interface Order {
  orderId: string;
  customerId: string;
  items: OrderItem[];
  status: "pending" | "confirmed" | "shipped" | "delivered";
  totalAmount: number;
}

// Inventory Context — owns stock management
interface InventoryItem {
  sku: string;
  warehouseId: string;
  quantityAvailable: number;
  reservedQuantity: number;
}

// Payment Context — owns payment processing
interface Payment {
  paymentId: string;
  orderId: string;  // reference, not a foreign key
  amount: number;
  method: "credit_card" | "paypal" | "bank_transfer";
  status: "pending" | "authorized" | "captured" | "refunded";
}

// Shipping Context — owns delivery logistics
interface Shipment {
  shipmentId: string;
  orderId: string;
  carrier: string;
  trackingNumber: string;
  estimatedDelivery: Date;
}

The Strangler Fig Pattern

The Strangler Fig pattern lets you migrate from a monolith to microservices incrementally. An API gateway or reverse proxy routes traffic away from the monolith to new microservices piece by piece, until the monolith is fully replaced.

# Nginx configuration for Strangler Fig migration
# Route new endpoints to microservices, legacy to monolith

upstream monolith {
    server monolith-app:8080;
}

upstream order-service {
    server order-service:3001;
}

upstream inventory-service {
    server inventory-service:3002;
}

server {
    listen 80;

    # Migrated: orders now served by microservice
    location /api/v2/orders {
        proxy_pass http://order-service;
    }

    # Migrated: inventory now served by microservice
    location /api/v2/inventory {
        proxy_pass http://inventory-service;
    }

    # Everything else still goes to the monolith
    location / {
        proxy_pass http://monolith;
    }
}

2. Inter-Service Communication

Synchronous Communication: REST and gRPC

REST uses HTTP/JSON for simple request-response communication and suits external APIs and simple queries. gRPC uses Protocol Buffers over HTTP/2, offering higher performance, type safety, and bidirectional streaming, which makes it a better fit for internal service-to-service calls.

Feature           REST                            gRPC
Protocol          HTTP/1.1 or HTTP/2              HTTP/2
Serialization     JSON (text)                     Protobuf (binary)
Type safety       None (generate via OpenAPI)     Built-in (generated from .proto)
Streaming         Limited (SSE/WebSocket)         Native bidirectional streams
Browser support   Native                          Requires a grpc-web proxy
Best fit          Public APIs, CRUD operations    Internal calls, high performance

// gRPC service definition (order.proto)
syntax = "proto3";

package order;

service OrderService {
  rpc CreateOrder (CreateOrderRequest) returns (OrderResponse);
  rpc GetOrder (GetOrderRequest) returns (OrderResponse);
  rpc StreamOrderUpdates (GetOrderRequest) returns (stream OrderEvent);
}

message CreateOrderRequest {
  string customer_id = 1;
  repeated OrderItem items = 2;
}

message OrderItem {
  string sku = 1;
  int32 quantity = 2;
  double unit_price = 3;
}

message OrderResponse {
  string order_id = 1;
  string status = 2;
  double total_amount = 3;
}

Asynchronous Messaging

Asynchronous messaging decouples services in time via a message broker (Kafka, RabbitMQ, NATS). The sender does not wait for the receiver to finish processing, which suits event-driven architectures and eventual-consistency scenarios.

// Publishing domain events with Kafka (Node.js + kafkajs)
import { Kafka, Partitioners } from "kafkajs";

const kafka = new Kafka({
  clientId: "order-service",
  brokers: ["kafka-1:9092", "kafka-2:9092"],
});

const producer = kafka.producer({
  createPartitioner: Partitioners.DefaultPartitioner,
});

interface OrderCreatedEvent {
  eventType: "OrderCreated";
  orderId: string;
  customerId: string;
  items: Array<{ sku: string; quantity: number }>;
  totalAmount: number;
  timestamp: string;
}

async function publishOrderCreated(order: OrderCreatedEvent) {
  await producer.connect();
  await producer.send({
    topic: "order-events",
    messages: [
      {
        key: order.orderId,
        value: JSON.stringify(order),
        headers: {
          "event-type": "OrderCreated",
          "correlation-id": crypto.randomUUID(),
        },
      },
    ],
  });
}

// Consuming events in inventory-service
const consumer = kafka.consumer({ groupId: "inventory-group" });

async function startConsumer() {
  await consumer.connect();
  await consumer.subscribe({ topic: "order-events" });
  await consumer.run({
    eachMessage: async ({ topic, partition, message }) => {
      const event = JSON.parse(message.value!.toString());
      if (event.eventType === "OrderCreated") {
        await reserveInventory(event.orderId, event.items);
      }
    },
  });
}

3. The API Gateway Pattern

An API gateway is the single entry point for all client requests, handling routing, authentication, rate limiting, protocol translation, and request aggregation. It hides the internal microservice topology and simplifies client interaction.

// Express.js API Gateway with routing + auth + rate limiting
import express from "express";
import jwt from "jsonwebtoken";
import { createProxyMiddleware } from "http-proxy-middleware";
import rateLimit from "express-rate-limit";

const app = express();

// Global rate limiting
app.use(rateLimit({
  windowMs: 60 * 1000,    // 1 minute
  max: 100,               // 100 requests per window
  standardHeaders: true,
}));

// JWT authentication middleware
function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(" ")[1];
  if (!token) return res.status(401).json({ error: "Unauthorized" });
  try {
    req.user = jwt.verify(token, process.env.JWT_SECRET);
    next();
  } catch {
    res.status(403).json({ error: "Invalid token" });
  }
}

// Route to microservices
app.use("/api/orders", authenticate, createProxyMiddleware({
  target: "http://order-service:3001",
  changeOrigin: true,
  pathRewrite: { "^/api/orders": "/orders" },
}));

app.use("/api/inventory", authenticate, createProxyMiddleware({
  target: "http://inventory-service:3002",
  changeOrigin: true,
  pathRewrite: { "^/api/inventory": "/inventory" },
}));

app.use("/api/payments", authenticate, createProxyMiddleware({
  target: "http://payment-service:3003",
  changeOrigin: true,
}));

app.listen(8080, () => console.log("API Gateway on :8080"));

Service Discovery

Service discovery lets microservices locate each other dynamically at runtime instead of hard-coding addresses. The two main approaches are client-side discovery (the service queries the registry itself) and server-side discovery (a load balancer queries the registry).

# docker-compose.yml with Consul for service discovery
version: "3.8"

services:
  consul:
    image: hashicorp/consul:1.17
    ports:
      - "8500:8500"
    command: agent -server -bootstrap-expect=1 -ui -client=0.0.0.0

  order-service:
    build: ./order-service
    environment:
      CONSUL_HOST: consul
      SERVICE_NAME: order-service
      SERVICE_PORT: 3001
    depends_on:
      - consul

  inventory-service:
    build: ./inventory-service
    environment:
      CONSUL_HOST: consul
      SERVICE_NAME: inventory-service
      SERVICE_PORT: 3002
    depends_on:
      - consul
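
The compose file above only wires the services to Consul; each service still has to register itself on startup and resolve its peers at call time. A minimal client-side discovery sketch against Consul's HTTP API (`PUT /v1/agent/service/register`, `GET /v1/health/service/:name?passing=true`) might look like the following — the health-check path and the round-robin helper are illustrative assumptions, not part of the setup above:

```typescript
// Minimal client-side discovery against Consul's HTTP API.
// Assumes Node 18+ (global fetch) and the docker-compose setup above.

interface ServiceInstance {
  address: string;
  port: number;
}

// Register this service with the local Consul agent on startup
async function registerWithConsul(
  consulHost: string,
  name: string,
  port: number
): Promise<void> {
  await fetch(`http://${consulHost}:8500/v1/agent/service/register`, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      Name: name,
      Port: port,
      Check: { HTTP: `http://${name}:${port}/health/live`, Interval: "10s" },
    }),
  });
}

// Parse Consul's health endpoint response into address/port pairs;
// fall back to the node address when the service address is empty
function toInstances(healthEntries: any[]): ServiceInstance[] {
  return healthEntries.map((e) => ({
    address: e.Service.Address || e.Node.Address,
    port: e.Service.Port,
  }));
}

// Simple round-robin load balancing over the discovered instances
function roundRobin(instances: ServiceInstance[]) {
  let i = 0;
  return () => {
    if (instances.length === 0) throw new Error("No healthy instances");
    const instance = instances[i % instances.length];
    i++;
    return instance;
  };
}

// Resolve only healthy instances (?passing=true filters failing checks)
async function resolveService(
  consulHost: string,
  name: string
): Promise<ServiceInstance[]> {
  const res = await fetch(
    `http://${consulHost}:8500/v1/health/service/${name}?passing=true`
  );
  return toInstances(await res.json());
}
```

In practice the resolved instance list should be cached and refreshed periodically (or via Consul's blocking queries) rather than fetched on every request.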

4. The Saga Pattern: Managing Distributed Transactions

The Saga pattern breaks a distributed transaction into a sequence of local transactions, each with a corresponding compensating action. When a step fails, the already-completed steps are compensated in reverse order, yielding eventual consistency.

Orchestration-Based Saga (Central Controller)

An orchestration-based saga uses a central coordinator (the Saga orchestrator) to drive the whole transaction flow. The orchestrator knows the order of every step and triggers compensating actions on failure.

// Saga Orchestrator for Order Processing
interface SagaStep {
  name: string;
  execute: (context: SagaContext) => Promise<void>;
  compensate: (context: SagaContext) => Promise<void>;
}

interface SagaContext {
  orderId: string;
  customerId: string;
  items: Array<{ sku: string; qty: number }>;
  paymentId?: string;
  shipmentId?: string;
}

class SagaOrchestrator {
  private steps: SagaStep[] = [];
  private completedSteps: SagaStep[] = [];

  addStep(step: SagaStep) {
    this.steps.push(step);
    return this;
  }

  async execute(context: SagaContext): Promise<void> {
    for (const step of this.steps) {
      try {
        console.log("Executing: " + step.name);
        await step.execute(context);
        this.completedSteps.push(step);
      } catch (error) {
        console.error("Failed at: " + step.name + ", compensating...");
        await this.compensate(context);
        throw error;
      }
    }
  }

  private async compensate(context: SagaContext): Promise<void> {
    // Iterate a reversed copy so completedSteps itself is not mutated
    for (const step of [...this.completedSteps].reverse()) {
      try {
        console.log("Compensating: " + step.name);
        await step.compensate(context);
      } catch (err) {
        console.error("Compensation failed for: " + step.name);
        // Log for manual intervention
      }
    }
  }
}

// Usage
const orderSaga = new SagaOrchestrator()
  .addStep({
    name: "ReserveInventory",
    execute: async (ctx) => { /* call inventory-service */ },
    compensate: async (ctx) => { /* release reserved stock */ },
  })
  .addStep({
    name: "ProcessPayment",
    execute: async (ctx) => { /* call payment-service */ },
    compensate: async (ctx) => { /* refund payment */ },
  })
  .addStep({
    name: "CreateShipment",
    execute: async (ctx) => { /* call shipping-service */ },
    compensate: async (ctx) => { /* cancel shipment */ },
  });

Choreography-Based Saga (Event-Driven)

A choreography-based saga has no central controller. Each service listens for events and publishes new ones, forming an event chain. The services are fully decoupled, but the transaction flow is harder to trace and debug.

Orchestration vs. choreography: choose choreography when there are fewer than four steps and the services are highly independent; choose orchestration when the flow is complex, needs global visibility, or involves conditional branching.
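
The choreography style can be sketched with an in-memory event bus standing in for a real broker like Kafka. The event names, the simulated payment failure, and the `reserved` set are illustrative assumptions — the point is that each service reacts to events and publishes new ones, and compensation (releasing stock) is itself just another event handler:

```typescript
// Choreography sketch: an in-memory bus stands in for a message broker.
type Handler = (payload: any) => void;

class EventBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(event: string, handler: Handler) {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  publish(event: string, payload: any) {
    for (const h of this.handlers.get(event) ?? []) h(payload);
  }
}

const bus = new EventBus();
const log: string[] = [];

// Inventory service: reacts to OrderCreated, compensates on PaymentFailed
const reserved = new Set<string>();
bus.subscribe("OrderCreated", ({ orderId, inStock }) => {
  if (inStock) {
    reserved.add(orderId);
    log.push(`inventory reserved for ${orderId}`);
    bus.publish("InventoryReserved", { orderId });
  } else {
    bus.publish("InventoryFailed", { orderId });
  }
});
bus.subscribe("PaymentFailed", ({ orderId }) => {
  reserved.delete(orderId); // compensating action: release the stock
  log.push(`inventory released for ${orderId}`);
  bus.publish("OrderCancelled", { orderId });
});

// Payment service: reacts to InventoryReserved
bus.subscribe("InventoryReserved", ({ orderId }) => {
  const paymentOk = orderId !== "order-2"; // simulate one payment failure
  if (paymentOk) {
    log.push(`payment captured for ${orderId}`);
    bus.publish("PaymentCompleted", { orderId });
  } else {
    bus.publish("PaymentFailed", { orderId });
  }
});

// Order service: closes the loop on success or cancellation
bus.subscribe("PaymentCompleted", ({ orderId }) =>
  log.push(`order confirmed: ${orderId}`)
);
bus.subscribe("OrderCancelled", ({ orderId }) =>
  log.push(`order cancelled: ${orderId}`)
);

// Happy path: each step is triggered only by the previous event
bus.publish("OrderCreated", { orderId: "order-1", inStock: true });
// Failure path: payment fails, inventory reservation is compensated
bus.publish("OrderCreated", { orderId: "order-2", inStock: true });
```

Notice that no component knows the overall flow — which is exactly why choreographed sagas become hard to reason about as the number of steps grows.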

5. CQRS — Command Query Responsibility Segregation

CQRS separates an application's read (query) and write (command) sides into distinct data models, sometimes even distinct databases. The write model is optimized for consistency and business rules; the read model is optimized for query performance.

// CQRS implementation with separate read/write models

// ── Command Side (Write Model) ──
interface CreateOrderCommand {
  type: "CreateOrder";
  customerId: string;
  items: Array<{ sku: string; quantity: number; price: number }>;
}

class OrderCommandHandler {
  constructor(
    private writeRepo: OrderWriteRepository,
    private eventBus: EventBus
  ) {}

  async handle(cmd: CreateOrderCommand): Promise<string> {
    // Business validation
    if (cmd.items.length === 0) throw new Error("Order must have items");

    const order = {
      id: crypto.randomUUID(),
      customerId: cmd.customerId,
      items: cmd.items,
      status: "pending" as const,
      total: cmd.items.reduce((s, i) => s + i.price * i.quantity, 0),
      createdAt: new Date(),
    };

    await this.writeRepo.save(order);

    // Publish event to sync read model
    await this.eventBus.publish({
      type: "OrderCreated",
      payload: order,
      timestamp: new Date().toISOString(),
    });

    return order.id;
  }
}

// ── Query Side (Read Model) ──
interface OrderReadModel {
  orderId: string;
  customerName: string;      // denormalized
  itemCount: number;         // precomputed
  totalFormatted: string;    // preformatted
  status: string;
  createdAt: string;
}

// Read model projection — listens to events and updates view
class OrderProjection {
  constructor(private readRepo: OrderReadRepository) {}

  async onOrderCreated(event: OrderCreatedEvent) {
    const customer = await lookupCustomer(event.payload.customerId);
    await this.readRepo.upsert({
      orderId: event.payload.id,
      customerName: customer.name,
      itemCount: event.payload.items.length,
      totalFormatted: "$" + event.payload.total.toFixed(2),
      status: event.payload.status,
      createdAt: event.timestamp,
    });
  }
}

6. Event Sourcing

Event sourcing stores every state change as an immutable event instead of persisting only the current state. Current state is rebuilt by replaying the event sequence. This yields a complete audit log, enables time-travel queries, and pairs naturally with CQRS.

// Event Sourcing for an Order aggregate
type OrderEvent =
  | { type: "OrderCreated"; data: { id: string; customerId: string; items: any[] } }
  | { type: "OrderConfirmed"; data: { confirmedAt: string } }
  | { type: "ItemAdded"; data: { sku: string; quantity: number } }
  | { type: "OrderShipped"; data: { trackingNumber: string } }
  | { type: "OrderCancelled"; data: { reason: string } };

interface EventStore {
  append(streamId: string, events: OrderEvent[]): Promise<void>;
  readStream(streamId: string): Promise<OrderEvent[]>;
}

class OrderAggregate {
  private id = "";
  private status = "draft";
  private items: any[] = [];
  private changes: OrderEvent[] = [];

  // Rebuild state from event history
  static fromHistory(events: OrderEvent[]): OrderAggregate {
    const order = new OrderAggregate();
    for (const event of events) {
      order.apply(event);
    }
    return order;
  }

  private apply(event: OrderEvent) {
    switch (event.type) {
      case "OrderCreated":
        this.id = event.data.id;
        this.items = event.data.items;
        this.status = "pending";
        break;
      case "OrderConfirmed":
        this.status = "confirmed";
        break;
      case "OrderShipped":
        this.status = "shipped";
        break;
      case "OrderCancelled":
        this.status = "cancelled";
        break;
    }
  }

  // Command that produces new events
  confirm() {
    if (this.status !== "pending") throw new Error("Only pending orders");
    const event: OrderEvent = {
      type: "OrderConfirmed",
      data: { confirmedAt: new Date().toISOString() },
    };
    this.apply(event);
    this.changes.push(event);
  }

  getUncommittedChanges(): OrderEvent[] {
    return [...this.changes];
  }
}

7. The Circuit Breaker Pattern

The circuit breaker pattern prevents cascading failures. When the failure rate of calls to a downstream service crosses a threshold, the breaker opens and returns errors immediately without issuing requests, protecting the upstream service's resources. After a timeout it moves to half-open and lets a small number of probe requests through.

// Circuit Breaker implementation in TypeScript
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreaker {
  private state: CircuitState = "CLOSED";
  private failureCount = 0;
  private successCount = 0;
  private lastFailureTime = 0;

  constructor(
    private failureThreshold: number = 5,
    private resetTimeoutMs: number = 30000,
    private halfOpenMaxCalls: number = 3
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === "OPEN") {
      if (Date.now() - this.lastFailureTime > this.resetTimeoutMs) {
        this.state = "HALF_OPEN";
        this.successCount = 0;
      } else {
        throw new Error("Circuit is OPEN — request rejected");
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess() {
    if (this.state === "HALF_OPEN") {
      this.successCount++;
      if (this.successCount >= this.halfOpenMaxCalls) {
        this.state = "CLOSED";
        this.failureCount = 0;
      }
    } else {
      this.failureCount = 0;
    }
  }

  private onFailure() {
    this.failureCount++;
    this.lastFailureTime = Date.now();
    if (this.failureCount >= this.failureThreshold) {
      this.state = "OPEN";
    }
  }
}

// Usage
const breaker = new CircuitBreaker(5, 30000, 3);

async function getOrderFromService(orderId: string) {
  return breaker.call(async () => {
    const res = await fetch(
      "http://order-service:3001/orders/" + orderId
    );
    if (!res.ok) throw new Error("Service error: " + res.status);
    return res.json();
  });
}

Resilience4j Configuration Example (Java/Spring Boot)

# application.yml — Resilience4j circuit breaker config
resilience4j:
  circuitbreaker:
    instances:
      orderService:
        registerHealthIndicator: true
        slidingWindowType: COUNT_BASED
        slidingWindowSize: 10
        failureRateThreshold: 50
        waitDurationInOpenState: 30s
        permittedNumberOfCallsInHalfOpenState: 3
        automaticTransitionFromOpenToHalfOpenEnabled: true
        recordExceptions:
          - java.io.IOException
          - java.util.concurrent.TimeoutException
  retry:
    instances:
      orderService:
        maxAttempts: 3
        waitDuration: 1s
        enableExponentialBackoff: true
        exponentialBackoffMultiplier: 2
  bulkhead:
    instances:
      orderService:
        maxConcurrentCalls: 25
        maxWaitDuration: 500ms

8. The Bulkhead Pattern

The bulkhead pattern borrows from the watertight compartments in ship design. By isolating resource pools (thread pools, connection pools, memory), it ensures that a failing downstream service cannot exhaust the whole system's resources, limiting the blast radius of a failure.

// Bulkhead pattern — isolated resource pools per service
class Bulkhead {
  private activeCount = 0;
  private queue: Array<{ resolve: Function; reject: Function }> = [];

  constructor(
    private maxConcurrent: number,
    private maxQueueSize: number,
    private timeoutMs: number
  ) {}

  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.activeCount >= this.maxConcurrent) {
      if (this.queue.length >= this.maxQueueSize) {
        throw new Error("Bulkhead full — request rejected");
      }
      await new Promise<void>((resolve, reject) => {
        const entry = {
          resolve: () => { clearTimeout(timer); resolve(); },
          reject,
        };
        const timer = setTimeout(() => {
          // Remove the timed-out entry so a freed slot is not wasted on it
          const idx = this.queue.indexOf(entry);
          if (idx !== -1) this.queue.splice(idx, 1);
          reject(new Error("Bulkhead queue timeout"));
        }, this.timeoutMs);
        this.queue.push(entry);
      });
    }

    this.activeCount++;
    try {
      return await fn();
    } finally {
      this.activeCount--;
      if (this.queue.length > 0) {
        const next = this.queue.shift()!;
        next.resolve();
      }
    }
  }
}

// Separate bulkheads per downstream dependency
const orderBulkhead = new Bulkhead(10, 20, 5000);
const paymentBulkhead = new Bulkhead(5, 10, 3000);
const inventoryBulkhead = new Bulkhead(15, 30, 5000);

9. Retry with Exponential Backoff

Transient failures (network blips, temporary overload) can be resolved by retrying, but fixed-interval retries can cause a thundering herd. Exponential backoff with jitter spreads retries out over time, avoiding a flood of simultaneous retries hitting the downstream service.

// Retry with exponential backoff + jitter
interface RetryConfig {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  jitter: boolean;
}

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  config: RetryConfig
): Promise<T> {
  let lastError: Error | undefined;

  for (let attempt = 0; attempt < config.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error as Error;

      if (attempt === config.maxAttempts - 1) break;

      // Calculate delay: base * 2^attempt
      let delay = Math.min(
        config.baseDelayMs * Math.pow(2, attempt),
        config.maxDelayMs
      );

      // Add jitter: random value between 0 and delay
      if (config.jitter) {
        delay = Math.random() * delay;
      }

      console.log(
        "Attempt " + (attempt + 1) + " failed. " +
        "Retrying in " + Math.round(delay) + "ms..."
      );
      await new Promise(r => setTimeout(r, delay));
    }
  }

  throw lastError;
}

// Usage
const result = await retryWithBackoff(
  () => fetch("http://payment-service:3003/charge"),
  { maxAttempts: 4, baseDelayMs: 500, maxDelayMs: 8000, jitter: true }
);

10. Distributed Tracing with OpenTelemetry

OpenTelemetry is the CNCF observability standard. By propagating trace context (trace ID + span ID) across services, it provides end-to-end request tracing, latency analysis, and error localization for every request.

// OpenTelemetry setup for a Node.js microservice
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from
  "@opentelemetry/auto-instrumentations-node";
import { OTLPTraceExporter } from
  "@opentelemetry/exporter-trace-otlp-http";
import { Resource } from "@opentelemetry/resources";
import {
  SEMRESATTRS_SERVICE_NAME,
  SEMRESATTRS_SERVICE_VERSION,
} from "@opentelemetry/semantic-conventions";

const sdk = new NodeSDK({
  resource: new Resource({
    [SEMRESATTRS_SERVICE_NAME]: "order-service",
    [SEMRESATTRS_SERVICE_VERSION]: "1.2.0",
  }),
  traceExporter: new OTLPTraceExporter({
    url: "http://jaeger:4318/v1/traces",
  }),
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
console.log("OpenTelemetry tracing initialized");

// Custom span for business logic
import { trace, SpanStatusCode } from "@opentelemetry/api";

const tracer = trace.getTracer("order-service");

async function processOrder(orderId: string) {
  return tracer.startActiveSpan(
    "processOrder",
    async (span) => {
      try {
        span.setAttribute("order.id", orderId);
        const order = await fetchOrder(orderId);
        span.setAttribute("order.total", order.total);
        span.setAttribute("order.items_count", order.items.length);

        await validateOrder(order);
        await chargePayment(order);

        span.setStatus({ code: SpanStatusCode.OK });
        return order;
      } catch (error) {
        span.setStatus({
          code: SpanStatusCode.ERROR,
          message: (error as Error).message,
        });
        span.recordException(error as Error);
        throw error;
      } finally {
        span.end();
      }
    }
  );
}

11. Service Mesh (Istio / Linkerd)

A service mesh transparently handles service-to-service communication through sidecar proxies, providing traffic management, security (mTLS), observability, and resilience without changes to application code. Istio and Linkerd are the two most popular implementations.

Feature             Istio                                          Linkerd
Proxy               Envoy                                          linkerd2-proxy (Rust)
Complexity          High (feature-rich)                            Low (lightweight)
mTLS                Built-in, automatic                            Built-in, on by default
Traffic management  Advanced (canary, fault injection, mirroring)  Basic (traffic splitting)
Resource overhead   Higher                                         Lower
Best fit            Large-scale enterprise deployments             Small/medium scale, quick adoption

# Istio VirtualService for canary deployment
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - route:
        - destination:
            host: order-service
            subset: stable
          weight: 90
        - destination:
            host: order-service
            subset: canary
          weight: 10
      retries:
        attempts: 3
        perTryTimeout: 2s
      timeout: 10s
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        h2UpgradePolicy: DEFAULT
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 60s
  subsets:
    - name: stable
      labels:
        version: v1
    - name: canary
      labels:
        version: v2

12. Data Management Strategies

Database per Service

Each microservice owns its own database instance or schema, guaranteeing data isolation and independent deployment. Services exchange data via APIs or events, never through a shared database.

# docker-compose.yml — database per service
services:
  order-db:
    image: postgres:16
    environment:
      POSTGRES_DB: orders
      POSTGRES_USER: order_svc
      POSTGRES_PASSWORD_FILE: /run/secrets/order_db_pw
    volumes:
      - order-data:/var/lib/postgresql/data
    networks:
      - order-net    # isolated network

  inventory-db:
    image: mongo:7
    environment:
      MONGO_INITDB_DATABASE: inventory
    volumes:
      - inventory-data:/data/db
    networks:
      - inventory-net

  payment-db:
    image: postgres:16
    environment:
      POSTGRES_DB: payments
      POSTGRES_USER: payment_svc
      POSTGRES_PASSWORD_FILE: /run/secrets/payment_db_pw
    volumes:
      - payment-data:/var/lib/postgresql/data
    networks:
      - payment-net

volumes:
  order-data:
  inventory-data:
  payment-data:

networks:
  order-net:
  inventory-net:
  payment-net:

The Shared-Database Anti-Pattern

Avoid shared databases: multiple services accessing the same database directly creates implicit coupling and schema-change conflicts, and blocks independent deployment and scaling. It is one of the most common anti-patterns in microservice architectures. If data must be shared, use event-driven synchronization or API calls instead.
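
Event-driven synchronization usually means each service maintains its own local, read-only replica of just the fields it needs. A minimal sketch — the `CustomerUpdated` event shape and field names are illustrative assumptions, and the `Map` stands in for a table owned by order-service:

```typescript
// order-service keeps a local replica of customer data, fed by events,
// instead of querying the customer service's database directly.

interface CustomerUpdatedEvent {
  type: "CustomerUpdated";
  customerId: string;
  name: string;
  email: string;
}

// Local replica owned by order-service (in reality: its own table)
const customerReplica = new Map<string, { name: string; email: string }>();

// Event handler: upsert, last write wins — eventual consistency is accepted
function onCustomerUpdated(event: CustomerUpdatedEvent) {
  customerReplica.set(event.customerId, {
    name: event.name,
    email: event.email,
  });
}

// Order creation reads the replica — no cross-service database access
function enrichOrder(orderId: string, customerId: string) {
  const customer = customerReplica.get(customerId);
  return {
    orderId,
    customerName: customer?.name ?? "unknown",
    customerEmail: customer?.email ?? "unknown",
  };
}
```

The trade-off is staleness: the replica lags the source of truth by the event-delivery delay, which is exactly the eventual consistency the Key Takeaways section asks you to embrace.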

13. Health Checks and Readiness Probes

A health check (liveness probe) confirms that the service process is alive; a readiness probe confirms that the service is ready to accept traffic. In Kubernetes, these two probes drive Pod lifecycle management and traffic routing.

// Health check endpoints in Express.js
import express from "express";

const app = express();
let isReady = false;

// Liveness — is the process alive?
app.get("/health/live", (req, res) => {
  res.status(200).json({ status: "alive" });
});

// Readiness — can it accept traffic?
app.get("/health/ready", async (req, res) => {
  if (!isReady) {
    return res.status(503).json({ status: "not ready" });
  }

  try {
    // Check critical dependencies
    await checkDatabaseConnection();
    await checkCacheConnection();
    res.status(200).json({
      status: "ready",
      checks: { database: "ok", cache: "ok" },
    });
  } catch (error) {
    res.status(503).json({
      status: "not ready",
      error: (error as Error).message,
    });
  }
});

// Startup sequence
async function bootstrap() {
  await connectToDatabase();
  await connectToCache();
  await warmUpCaches();
  isReady = true;
  console.log("Service is ready to accept traffic");
}

bootstrap();
app.listen(3001);

# Kubernetes deployment with health probes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: order-service:1.2.0
          ports:
            - containerPort: 3001
          livenessProbe:
            httpGet:
              path: /health/live
              port: 3001
            initialDelaySeconds: 10
            periodSeconds: 15
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 3001
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 3
          startupProbe:
            httpGet:
              path: /health/live
              port: 3001
            initialDelaySeconds: 0
            periodSeconds: 5
            failureThreshold: 30
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi

14. Pattern Decision Matrix

The table below helps you pick the right microservice pattern for a given scenario.

Scenario                        Recommended pattern           Key consideration
Migrating off a monolith        Strangler Fig                 Incremental, low risk
Cross-service transactions      Saga (orchestration)          Compensation logic required
Read-heavy, write-light         CQRS                          Scale read/write models independently
Full audit trail                Event Sourcing                Event store growth must be managed
Preventing cascading failures   Circuit Breaker + Bulkhead    Combine with retries and fallbacks
Request tracing                 OpenTelemetry                 Every service must be instrumented
Zero-trust security             Service Mesh (Istio)          Automatic mTLS, higher ops overhead
Event-driven decoupling         Async messaging (Kafka)       Accept eventual consistency

Summary

Microservice architecture is not a goal in itself but a means to business agility and scalability. Success hinges on understanding where each pattern applies and what it costs: drive service decomposition with DDD bounded contexts, manage distributed transactions with Sagas, solve complex query and audit needs with CQRS and event sourcing, build resilience with circuit breakers and bulkheads, and achieve observability with OpenTelemetry. Start small, introduce patterns as needed, and avoid over-engineering.

Remember: the complexity of distributed systems is a real cost. If a modular monolith meets your needs, it is the best choice. Split into microservices gradually, and only when team size, deployment frequency, and domain complexity truly demand it.
