微服务模式完全指南:服务设计、通信、Saga、CQRS、事件溯源与可观测性
全面掌握微服务架构模式 — 服务分解(DDD 限界上下文、绞杀者模式)、服务间通信(REST、gRPC、异步消息)、API 网关、Saga 编排与协调、CQRS、事件溯源、熔断器、隔舱、分布式追踪(OpenTelemetry)、服务网格及数据管理策略,配合生产级代码示例。
- 使用 DDD 限界上下文分解服务,用绞杀者模式渐进迁移
- 同步通信(REST/gRPC)用于查询,异步消息用于事件驱动解耦
- API 网关统一入口:路由、认证、限流、协议转换
- Saga 管理分布式事务 — 编排式(集中控制)或协调式(事件驱动)
- CQRS 分离读写模型,事件溯源保留完整状态变更历史
- 熔断器 + 隔舱 + 指数退避重试 = 弹性三剑客
- OpenTelemetry 分布式追踪 + 服务网格(Istio/Linkerd)实现全链路可观测性
- 每个服务独立数据库,杜绝共享数据库反模式
- 微服务不是银弹 — 只在团队规模和业务复杂度需要时采用
- 服务边界应与业务能力(而非技术层)对齐
- 拥抱最终一致性,放弃分布式强一致性的幻想
- 可观测性(日志、指标、追踪)是微服务的生命线,而非可选项
- 先用绞杀者模式渐进迁移,避免大爆炸重写
- 每个模式都有代价 — 评估收益是否超过引入的复杂性
1. 服务分解策略
DDD 限界上下文
Domain-Driven Design(领域驱动设计)的限界上下文是划分微服务边界最可靠的方法。每个限界上下文封装一个完整的业务领域,拥有独立的领域模型、通用语言和数据存储。
// E-commerce bounded contexts example
// Each context becomes a candidate microservice
// Order Context — owns order lifecycle
interface Order {
orderId: string;
customerId: string;
items: OrderItem[];
status: "pending" | "confirmed" | "shipped" | "delivered";
totalAmount: number;
}
// Inventory Context — owns stock management
interface InventoryItem {
sku: string;
warehouseId: string;
quantityAvailable: number;
reservedQuantity: number;
}
// Payment Context — owns payment processing
interface Payment {
paymentId: string;
orderId: string; // reference, not a foreign key
amount: number;
method: "credit_card" | "paypal" | "bank_transfer";
status: "pending" | "authorized" | "captured" | "refunded";
}
// Shipping Context — owns delivery logistics
interface Shipment {
shipmentId: string;
orderId: string;
carrier: string;
trackingNumber: string;
estimatedDelivery: Date;
}
绞杀者无花果模式(Strangler Fig)
绞杀者模式允许你渐进式地从单体迁移到微服务。通过 API 网关或反向代理,将流量逐步从单体路由到新的微服务,直到单体被完全替换。
# Nginx configuration for Strangler Fig migration
# Route new endpoints to microservices, legacy to monolith
upstream monolith {
server monolith-app:8080;
}
upstream order-service {
server order-service:3001;
}
upstream inventory-service {
server inventory-service:3002;
}
server {
listen 80;
# Migrated: orders now served by microservice
location /api/v2/orders {
proxy_pass http://order-service;
}
# Migrated: inventory now served by microservice
location /api/v2/inventory {
proxy_pass http://inventory-service;
}
# Everything else still goes to the monolith
location / {
proxy_pass http://monolith;
}
}
2. 服务间通信
同步通信:REST 与 gRPC
REST 使用 HTTP/JSON 实现简单的请求-响应通信,适合外部 API 和简单查询。gRPC 使用 Protocol Buffers 和 HTTP/2,提供更高性能、类型安全和双向流,适合服务间内部通信。
| 特性 | REST | gRPC |
|---|---|---|
| 协议 | HTTP/1.1 or HTTP/2 | HTTP/2 |
| 序列化 | JSON (text) | Protobuf (binary) |
| 类型安全 | 无(需 OpenAPI 生成) | 内置(proto 文件生成) |
| 流式传输 | 有限(SSE/WebSocket) | 原生双向流 |
| 浏览器支持 | 原生支持 | 需要 grpc-web 代理 |
| 最佳场景 | 公开 API、CRUD 操作 | 内部服务通信、高性能 |
// gRPC service definition (order.proto)
syntax = "proto3";
package order;
service OrderService {
rpc CreateOrder (CreateOrderRequest) returns (OrderResponse);
rpc GetOrder (GetOrderRequest) returns (OrderResponse);
rpc StreamOrderUpdates (GetOrderRequest) returns (stream OrderEvent);
}
message CreateOrderRequest {
string customer_id = 1;
repeated OrderItem items = 2;
}
message OrderItem {
string sku = 1;
int32 quantity = 2;
double unit_price = 3;
}
message OrderResponse {
string order_id = 1;
string status = 2;
double total_amount = 3;
}
异步消息传递
异步消息通过消息代理(Kafka、RabbitMQ、NATS)实现时间解耦。发送方不需要等待接收方处理完成,适合事件驱动架构和最终一致性场景。
// Publishing domain events with Kafka (Node.js + kafkajs)
import { Kafka, Partitioners } from "kafkajs";
const kafka = new Kafka({
clientId: "order-service",
brokers: ["kafka-1:9092", "kafka-2:9092"],
});
const producer = kafka.producer({
createPartitioner: Partitioners.DefaultPartitioner,
});
interface OrderCreatedEvent {
eventType: "OrderCreated";
orderId: string;
customerId: string;
items: Array<{ sku: string; quantity: number }>; // 消费方需要 items 才能预留库存
totalAmount: number;
timestamp: string;
}
async function publishOrderCreated(order: OrderCreatedEvent) {
await producer.connect();
await producer.send({
topic: "order-events",
messages: [
{
key: order.orderId,
value: JSON.stringify(order),
headers: {
"event-type": "OrderCreated",
"correlation-id": crypto.randomUUID(),
},
},
],
});
}
// Consuming events in inventory-service
const consumer = kafka.consumer({ groupId: "inventory-group" });
async function startConsumer() {
await consumer.connect();
await consumer.subscribe({ topic: "order-events" });
await consumer.run({
eachMessage: async ({ topic, partition, message }) => {
const event = JSON.parse(message.value!.toString());
if (event.eventType === "OrderCreated") {
await reserveInventory(event.orderId, event.items);
}
},
});
}
3. API 网关模式
API 网关是所有客户端请求的单一入口点,负责路由、认证、限流、协议转换和请求聚合。它隐藏了内部微服务拓扑,简化了客户端交互。
// Express.js API Gateway with routing + auth + rate limiting
import express from "express";
import jwt from "jsonwebtoken"; // 下方 authenticate 中间件依赖此导入
import { createProxyMiddleware } from "http-proxy-middleware";
import rateLimit from "express-rate-limit";
const app = express();
// Global rate limiting
app.use(rateLimit({
windowMs: 60 * 1000, // 1 minute
max: 100, // 100 requests per window
standardHeaders: true,
}));
// JWT authentication middleware
function authenticate(req, res, next) {
const token = req.headers.authorization?.split(" ")[1];
if (!token) return res.status(401).json({ error: "Unauthorized" });
try {
req.user = jwt.verify(token, process.env.JWT_SECRET);
next();
} catch {
res.status(403).json({ error: "Invalid token" });
}
}
// Route to microservices
app.use("/api/orders", authenticate, createProxyMiddleware({
target: "http://order-service:3001",
changeOrigin: true,
pathRewrite: { "^/api/orders": "/orders" },
}));
app.use("/api/inventory", authenticate, createProxyMiddleware({
target: "http://inventory-service:3002",
changeOrigin: true,
pathRewrite: { "^/api/inventory": "/inventory" },
}));
app.use("/api/payments", authenticate, createProxyMiddleware({
target: "http://payment-service:3003",
changeOrigin: true,
}));
app.listen(8080, () => console.log("API Gateway on :8080"));
服务发现
服务发现允许微服务在运行时动态定位其他服务,无需硬编码地址。客户端发现(服务自己查询注册中心)和服务端发现(负载均衡器查询注册中心)是两种主要模式。
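下面的 Consul 配置只展示了服务注册;客户端发现这一侧可以用如下草图示意(假设性示例,`resolveService` 和 `pickInstance` 均为示意用的假设名称):服务查询 Consul 的健康检查 HTTP API(`/v1/health/service/<name>?passing=true`)获取健康实例列表,再随机选取一个实现简单的客户端负载均衡。

```typescript
// 客户端发现草图(假设性实现,非生产代码)
// Consul 健康检查 API 返回的条目中,Service 字段包含实例地址和端口
interface ConsulEntry {
  Service: { Address: string; Port: number };
}

// 从健康实例列表中随机选取一个(最简单的客户端负载均衡)
function pickInstance(entries: ConsulEntry[]): string {
  if (entries.length === 0) throw new Error("no healthy instance");
  const e = entries[Math.floor(Math.random() * entries.length)];
  return e.Service.Address + ":" + e.Service.Port;
}

// 运行时解析服务地址(需要可访问的 Consul agent,此处仅示意)
async function resolveService(name: string): Promise<string> {
  const res = await fetch(
    "http://consul:8500/v1/health/service/" + name + "?passing=true"
  );
  return pickInstance(await res.json());
}
```

生产环境通常还会在客户端缓存实例列表并监听变更,避免每次调用都查询注册中心。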
# docker-compose.yml with Consul for service discovery
version: "3.8"
services:
consul:
image: hashicorp/consul:1.17
ports:
- "8500:8500"
command: agent -server -bootstrap-expect=1 -ui -client=0.0.0.0
order-service:
build: ./order-service
environment:
CONSUL_HOST: consul
SERVICE_NAME: order-service
SERVICE_PORT: 3001
depends_on:
- consul
inventory-service:
build: ./inventory-service
environment:
CONSUL_HOST: consul
SERVICE_NAME: inventory-service
SERVICE_PORT: 3002
depends_on:
- consul
4. Saga 模式:分布式事务管理
Saga 模式将分布式事务分解为一系列本地事务,每个本地事务都有对应的补偿操作。当某个步骤失败时,已完成的步骤会被逆序补偿,实现最终一致性。
编排式 Saga(集中控制器)
编排式 Saga 使用一个中心协调器(Saga Orchestrator)来管理整个事务流程。协调器知道所有步骤的执行顺序,并在失败时触发补偿操作。
// Saga Orchestrator for Order Processing
interface SagaStep {
name: string;
execute: (context: SagaContext) => Promise<void>;
compensate: (context: SagaContext) => Promise<void>;
}
interface SagaContext {
orderId: string;
customerId: string;
items: Array<{ sku: string; qty: number }>;
paymentId?: string;
shipmentId?: string;
}
class SagaOrchestrator {
private steps: SagaStep[] = [];
private completedSteps: SagaStep[] = [];
addStep(step: SagaStep) {
this.steps.push(step);
return this;
}
async execute(context: SagaContext): Promise<void> {
for (const step of this.steps) {
try {
console.log("Executing: " + step.name);
await step.execute(context);
this.completedSteps.push(step);
} catch (error) {
console.error("Failed at: " + step.name + ", compensating...");
await this.compensate(context);
throw error;
}
}
}
private async compensate(context: SagaContext): Promise<void> {
// 复制后再反转,避免原地 reverse() 修改 completedSteps
for (const step of [...this.completedSteps].reverse()) {
try {
console.log("Compensating: " + step.name);
await step.compensate(context);
} catch (err) {
console.error("Compensation failed for: " + step.name);
// Log for manual intervention
}
}
}
}
// Usage
const orderSaga = new SagaOrchestrator()
.addStep({
name: "ReserveInventory",
execute: async (ctx) => { /* call inventory-service */ },
compensate: async (ctx) => { /* release reserved stock */ },
})
.addStep({
name: "ProcessPayment",
execute: async (ctx) => { /* call payment-service */ },
compensate: async (ctx) => { /* refund payment */ },
})
.addStep({
name: "CreateShipment",
execute: async (ctx) => { /* call shipping-service */ },
compensate: async (ctx) => { /* cancel shipment */ },
});
协调式 Saga(事件驱动)
协调式 Saga 没有中央控制器。每个服务监听事件并发布新事件,形成事件链。服务之间完全解耦,但事务流程更难追踪和调试。
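可以用一个内存事件总线勾勒协调式 Saga 的事件链(假设性草图:`InMemoryEventBus` 及各处理器均为示意,生产中应使用 Kafka/RabbitMQ 等消息代理;此处模拟支付失败触发补偿):

```typescript
// 协调式 Saga 草图:没有中央协调器,流程由各服务对事件的响应自然形成
type DomainEvent = { type: string; orderId: string };
type Handler = (e: DomainEvent) => void;

class InMemoryEventBus {
  private handlers = new Map<string, Handler[]>();
  subscribe(type: string, h: Handler) {
    this.handlers.set(type, [...(this.handlers.get(type) ?? []), h]);
  }
  publish(e: DomainEvent) {
    for (const h of this.handlers.get(e.type) ?? []) h(e);
  }
}

const bus = new InMemoryEventBus();
const log: string[] = [];

// inventory-service:监听 OrderCreated,预留库存
bus.subscribe("OrderCreated", (e) => {
  log.push("reserve:" + e.orderId);
  bus.publish({ type: "InventoryReserved", orderId: e.orderId });
});
// payment-service:监听 InventoryReserved,扣款(此处模拟扣款失败)
bus.subscribe("InventoryReserved", (e) => {
  log.push("charge-failed:" + e.orderId);
  bus.publish({ type: "PaymentFailed", orderId: e.orderId });
});
// inventory-service:监听 PaymentFailed,执行补偿(释放已预留库存)
bus.subscribe("PaymentFailed", (e) => {
  log.push("release:" + e.orderId);
  bus.publish({ type: "InventoryReleased", orderId: e.orderId });
});
// order-service:监听 InventoryReleased,取消订单
bus.subscribe("InventoryReleased", (e) => {
  log.push("cancel:" + e.orderId);
});

bus.publish({ type: "OrderCreated", orderId: "o-1" });
// 事件链:预留库存 → 扣款失败 → 补偿释放 → 取消订单
```

注意补偿逻辑分散在各个服务中,这正是协调式 Saga 难以追踪的原因:要理解完整流程,必须通读所有订阅关系。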
5. CQRS — 命令查询职责分离
CQRS 将应用的读(Query)和写(Command)模型分离为不同的数据模型甚至不同的数据库。写模型针对一致性和业务规则优化,读模型针对查询性能优化。
// CQRS implementation with separate read/write models
// ── Command Side (Write Model) ──
interface CreateOrderCommand {
type: "CreateOrder";
customerId: string;
items: Array<{ sku: string; quantity: number; price: number }>;
}
class OrderCommandHandler {
constructor(
private writeRepo: OrderWriteRepository,
private eventBus: EventBus
) {}
async handle(cmd: CreateOrderCommand): Promise<string> {
// Business validation
if (cmd.items.length === 0) throw new Error("Order must have items");
const order = {
id: crypto.randomUUID(),
customerId: cmd.customerId,
items: cmd.items,
status: "pending" as const,
total: cmd.items.reduce((s, i) => s + i.price * i.quantity, 0),
createdAt: new Date(),
};
await this.writeRepo.save(order);
// Publish event to sync read model
await this.eventBus.publish({
type: "OrderCreated",
payload: order,
timestamp: new Date().toISOString(),
});
return order.id;
}
}
// ── Query Side (Read Model) ──
interface OrderReadModel {
orderId: string;
customerName: string; // denormalized
itemCount: number; // precomputed
totalFormatted: string; // preformatted
status: string;
createdAt: string;
}
// Read model projection — listens to events and updates view
class OrderProjection {
constructor(private readRepo: OrderReadRepository) {}
async onOrderCreated(event: OrderCreatedEvent) {
const customer = await lookupCustomer(event.payload.customerId);
await this.readRepo.upsert({
orderId: event.payload.id,
customerName: customer.name,
itemCount: event.payload.items.length,
totalFormatted: "$" + event.payload.total.toFixed(2),
status: event.payload.status,
createdAt: event.timestamp,
});
}
}
6. 事件溯源
事件溯源将每个状态变更存储为不可变事件,而不是只保存当前状态。当前状态通过重放事件序列来重建。这提供了完整的审计日志,支持时间旅行查询,并与 CQRS 天然配合。
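配合下文的 OrderAggregate,这里先给出事件存储的一个内存版草图(假设性实现,仅用于演示追加与重放;生产中可选用 EventStoreDB 或基于 Postgres 的追加表):

```typescript
// 内存事件存储草图(假设性实现):append-only 流 + 全量读取用于重放
type StoredEvent = { type: string; data: Record<string, unknown> };

class InMemoryEventStore {
  private streams = new Map<string, StoredEvent[]>();

  // 追加事件到指定流(只追加,从不修改历史)
  async append(streamId: string, events: StoredEvent[]): Promise<void> {
    const stream = this.streams.get(streamId) ?? [];
    this.streams.set(streamId, [...stream, ...events]);
  }

  // 读取流的全部事件,调用方重放这些事件以重建聚合状态
  async readStream(streamId: string): Promise<StoredEvent[]> {
    return [...(this.streams.get(streamId) ?? [])];
  }
}
```

使用方式:聚合保存时将未提交的事件 append 到流;加载时 readStream 取出全部事件,交给类似下文 fromHistory 的方法重放。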
// Event Sourcing for an Order aggregate
type OrderEvent =
| { type: "OrderCreated"; data: { id: string; customerId: string; items: any[] } }
| { type: "OrderConfirmed"; data: { confirmedAt: string } }
| { type: "ItemAdded"; data: { sku: string; quantity: number } }
| { type: "OrderShipped"; data: { trackingNumber: string } }
| { type: "OrderCancelled"; data: { reason: string } };
interface EventStore {
append(streamId: string, events: OrderEvent[]): Promise<void>;
readStream(streamId: string): Promise<OrderEvent[]>;
}
class OrderAggregate {
private id = "";
private status = "draft";
private items: any[] = [];
private changes: OrderEvent[] = [];
// Rebuild state from event history
static fromHistory(events: OrderEvent[]): OrderAggregate {
const order = new OrderAggregate();
for (const event of events) {
order.apply(event);
}
return order;
}
private apply(event: OrderEvent) {
switch (event.type) {
case "OrderCreated":
this.id = event.data.id;
this.items = event.data.items;
this.status = "pending";
break;
case "OrderConfirmed":
this.status = "confirmed";
break;
case "OrderShipped":
this.status = "shipped";
break;
case "OrderCancelled":
this.status = "cancelled";
break;
}
}
// Command that produces new events
confirm() {
if (this.status !== "pending") throw new Error("Only pending orders");
const event: OrderEvent = {
type: "OrderConfirmed",
data: { confirmedAt: new Date().toISOString() },
};
this.apply(event);
this.changes.push(event);
}
getUncommittedChanges(): OrderEvent[] {
return [...this.changes];
}
}
7. 熔断器模式
熔断器模式防止级联故障。当下游服务调用失败率超过阈值时,熔断器打开,直接返回错误而不发起请求,保护上游服务的资源。经过超时后进入半开状态,允许少量探测请求通过。
// Circuit Breaker implementation in TypeScript
type CircuitState = "CLOSED" | "OPEN" | "HALF_OPEN";
class CircuitBreaker {
private state: CircuitState = "CLOSED";
private failureCount = 0;
private successCount = 0;
private lastFailureTime = 0;
constructor(
private failureThreshold: number = 5,
private resetTimeoutMs: number = 30000,
private halfOpenMaxCalls: number = 3
) {}
async call<T>(fn: () => Promise<T>): Promise<T> {
if (this.state === "OPEN") {
if (Date.now() - this.lastFailureTime > this.resetTimeoutMs) {
this.state = "HALF_OPEN";
this.successCount = 0;
} else {
throw new Error("Circuit is OPEN — request rejected");
}
}
try {
const result = await fn();
this.onSuccess();
return result;
} catch (error) {
this.onFailure();
throw error;
}
}
private onSuccess() {
if (this.state === "HALF_OPEN") {
this.successCount++;
if (this.successCount >= this.halfOpenMaxCalls) {
this.state = "CLOSED";
this.failureCount = 0;
}
} else {
this.failureCount = 0;
}
}
private onFailure() {
this.failureCount++;
this.lastFailureTime = Date.now();
if (this.failureCount >= this.failureThreshold) {
this.state = "OPEN";
}
}
}
// Usage
const breaker = new CircuitBreaker(5, 30000, 3);
async function getOrderFromService(orderId: string) {
return breaker.call(async () => {
const res = await fetch(
"http://order-service:3001/orders/" + orderId
);
if (!res.ok) throw new Error("Service error: " + res.status);
return res.json();
});
}
Resilience4j 配置示例(Java/Spring Boot)
# application.yml — Resilience4j circuit breaker config
resilience4j:
circuitbreaker:
instances:
orderService:
registerHealthIndicator: true
slidingWindowType: COUNT_BASED
slidingWindowSize: 10
failureRateThreshold: 50
waitDurationInOpenState: 30s
permittedNumberOfCallsInHalfOpenState: 3
automaticTransitionFromOpenToHalfOpenEnabled: true
recordExceptions:
- java.io.IOException
- java.util.concurrent.TimeoutException
retry:
instances:
orderService:
maxAttempts: 3
waitDuration: 1s
enableExponentialBackoff: true
exponentialBackoffMultiplier: 2
bulkhead:
instances:
orderService:
maxConcurrentCalls: 25
maxWaitDuration: 500ms
8. 隔舱模式
隔舱模式借鉴了船舶设计中的水密隔舱概念。通过隔离资源池(线程池、连接池、内存),确保一个下游服务的故障不会耗尽整个系统的资源,限制故障的影响范围。
// Bulkhead pattern — isolated resource pools per service
class Bulkhead {
private activeCount = 0;
private queue: Array<{ resolve: Function; reject: Function }> = [];
constructor(
private maxConcurrent: number,
private maxQueueSize: number,
private timeoutMs: number
) {}
async execute<T>(fn: () => Promise<T>): Promise<T> {
if (this.activeCount >= this.maxConcurrent) {
if (this.queue.length >= this.maxQueueSize) {
throw new Error("Bulkhead full — request rejected");
}
await new Promise<void>((resolve, reject) => {
const timer = setTimeout(
() => reject(new Error("Bulkhead queue timeout")),
this.timeoutMs
);
this.queue.push({
resolve: () => { clearTimeout(timer); resolve(); },
reject,
});
});
}
this.activeCount++;
try {
return await fn();
} finally {
this.activeCount--;
if (this.queue.length > 0) {
const next = this.queue.shift()!;
next.resolve();
}
}
}
}
// Separate bulkheads per downstream dependency
const orderBulkhead = new Bulkhead(10, 20, 5000);
const paymentBulkhead = new Bulkhead(5, 10, 3000);
const inventoryBulkhead = new Bulkhead(15, 30, 5000);
9. 指数退避重试
瞬时故障(网络抖动、暂时过载)可以通过重试解决,但固定间隔重试可能造成惊群效应(thundering herd)。指数退避加随机抖动可以有效分散重试请求,避免大量客户端同时向下游服务发起重试。
// Retry with exponential backoff + jitter
interface RetryConfig {
maxAttempts: number;
baseDelayMs: number;
maxDelayMs: number;
jitter: boolean;
}
async function retryWithBackoff<T>(
fn: () => Promise<T>,
config: RetryConfig
): Promise<T> {
let lastError: Error | undefined;
for (let attempt = 0; attempt < config.maxAttempts; attempt++) {
try {
return await fn();
} catch (error) {
lastError = error as Error;
if (attempt === config.maxAttempts - 1) break;
// Calculate delay: base * 2^attempt
let delay = Math.min(
config.baseDelayMs * Math.pow(2, attempt),
config.maxDelayMs
);
// Add jitter: random value between 0 and delay
if (config.jitter) {
delay = Math.random() * delay;
}
console.log(
"Attempt " + (attempt + 1) + " failed. " +
"Retrying in " + Math.round(delay) + "ms..."
);
await new Promise(r => setTimeout(r, delay));
}
}
throw lastError;
}
// Usage
const result = await retryWithBackoff(
() => fetch("http://payment-service:3003/charge"),
{ maxAttempts: 4, baseDelayMs: 500, maxDelayMs: 8000, jitter: true }
);
10. 分布式追踪与 OpenTelemetry
OpenTelemetry 是 CNCF 的可观测性标准。它通过在服务间传播 trace context(trace ID + span ID),为每个请求提供端到端的链路追踪、延迟分析和错误定位。
// OpenTelemetry setup for a Node.js microservice
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { Resource } from "@opentelemetry/resources";
import {
SEMRESATTRS_SERVICE_NAME,
SEMRESATTRS_SERVICE_VERSION,
} from "@opentelemetry/semantic-conventions";
const sdk = new NodeSDK({
resource: new Resource({
[SEMRESATTRS_SERVICE_NAME]: "order-service",
[SEMRESATTRS_SERVICE_VERSION]: "1.2.0",
}),
traceExporter: new OTLPTraceExporter({
url: "http://jaeger:4318/v1/traces",
}),
instrumentations: [getNodeAutoInstrumentations()],
});
sdk.start();
console.log("OpenTelemetry tracing initialized");
// Custom span for business logic
import { trace, SpanStatusCode } from "@opentelemetry/api";
const tracer = trace.getTracer("order-service");
async function processOrder(orderId: string) {
return tracer.startActiveSpan(
"processOrder",
async (span) => {
try {
span.setAttribute("order.id", orderId);
const order = await fetchOrder(orderId);
span.setAttribute("order.total", order.total);
span.setAttribute("order.items_count", order.items.length);
await validateOrder(order);
await chargePayment(order);
span.setStatus({ code: SpanStatusCode.OK });
return order;
} catch (error) {
span.setStatus({
code: SpanStatusCode.ERROR,
message: (error as Error).message,
});
span.recordException(error as Error);
throw error;
} finally {
span.end();
}
}
);
}
11. 服务网格(Istio / Linkerd)
服务网格通过 sidecar 代理透明地处理服务间通信,提供流量管理、安全(mTLS)、可观测性和弹性功能,无需修改应用代码。Istio 和 Linkerd 是两个最流行的实现。
| 特性 | Istio | Linkerd |
|---|---|---|
| 代理 | Envoy | linkerd2-proxy (Rust) |
| 复杂度 | 高(功能丰富) | 低(轻量简洁) |
| mTLS | 内置,自动 | 内置,默认开启 |
| 流量管理 | 高级(金丝雀、故障注入、镜像) | 基础(流量分割) |
| 资源开销 | 较高 | 较低 |
| 最佳场景 | 大规模企业级部署 | 中小规模、快速上手 |
# Istio VirtualService for canary deployment
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: order-service
spec:
hosts:
- order-service
http:
- route:
- destination:
host: order-service
subset: stable
weight: 90
- destination:
host: order-service
subset: canary
weight: 10
retries:
attempts: 3
perTryTimeout: 2s
timeout: 10s
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: order-service
spec:
host: order-service
trafficPolicy:
connectionPool:
tcp:
maxConnections: 100
http:
h2UpgradePolicy: DEFAULT
maxRequestsPerConnection: 10
outlierDetection:
consecutive5xxErrors: 5
interval: 30s
baseEjectionTime: 60s
subsets:
- name: stable
labels:
version: v1
- name: canary
labels:
version: v2
12. 数据管理策略
每服务独立数据库
每个微服务拥有自己的数据库实例或 schema,确保数据隔离和独立部署。服务之间通过 API 或事件进行数据交换,而非共享数据库。
# docker-compose.yml — database per service
services:
order-db:
image: postgres:16
environment:
POSTGRES_DB: orders
POSTGRES_USER: order_svc
POSTGRES_PASSWORD_FILE: /run/secrets/order_db_pw
volumes:
- order-data:/var/lib/postgresql/data
networks:
- order-net # isolated network
inventory-db:
image: mongo:7
environment:
MONGO_INITDB_DATABASE: inventory
volumes:
- inventory-data:/data/db
networks:
- inventory-net
payment-db:
image: postgres:16
environment:
POSTGRES_DB: payments
POSTGRES_USER: payment_svc
POSTGRES_PASSWORD_FILE: /run/secrets/payment_db_pw
volumes:
- payment-data:/var/lib/postgresql/data
networks:
- payment-net
volumes:
order-data:
inventory-data:
payment-data:
networks:
order-net:
inventory-net:
payment-net:
共享数据库反模式
多个服务直接读写同一个数据库是最常见的反模式:schema 变更需要跨团队协调,服务无法独立部署和扩展,数据所有权变得模糊。当一个服务确实需要另一个服务的数据时,应通过其公开 API 查询,或订阅其领域事件在本地维护只读副本,而不是越过边界直接访问对方的表。
13. 健康检查与就绪探针
健康检查(liveness probe)确认服务进程是否存活;就绪探针(readiness probe)确认服务是否准备好接收流量。在 Kubernetes 中,这两个探针决定了 Pod 的生命周期管理和流量路由。
// Health check endpoints in Express.js
import express from "express";
const app = express();
let isReady = false;
// Liveness — is the process alive?
app.get("/health/live", (req, res) => {
res.status(200).json({ status: "alive" });
});
// Readiness — can it accept traffic?
app.get("/health/ready", async (req, res) => {
if (!isReady) {
return res.status(503).json({ status: "not ready" });
}
try {
// Check critical dependencies
await checkDatabaseConnection();
await checkCacheConnection();
res.status(200).json({
status: "ready",
checks: { database: "ok", cache: "ok" },
});
} catch (error) {
res.status(503).json({
status: "not ready",
error: (error as Error).message,
});
}
});
// Startup sequence
async function bootstrap() {
await connectToDatabase();
await connectToCache();
await warmUpCaches();
isReady = true;
console.log("Service is ready to accept traffic");
}
bootstrap();
app.listen(3001);
# Kubernetes deployment with health probes
apiVersion: apps/v1
kind: Deployment
metadata:
name: order-service
spec:
replicas: 3
selector:
matchLabels:
app: order-service
template:
metadata:
labels:
app: order-service
spec:
containers:
- name: order-service
image: order-service:1.2.0
ports:
- containerPort: 3001
livenessProbe:
httpGet:
path: /health/live
port: 3001
initialDelaySeconds: 10
periodSeconds: 15
failureThreshold: 3
readinessProbe:
httpGet:
path: /health/ready
port: 3001
initialDelaySeconds: 5
periodSeconds: 10
failureThreshold: 3
startupProbe:
httpGet:
path: /health/live
port: 3001
initialDelaySeconds: 0
periodSeconds: 5
failureThreshold: 30
resources:
requests:
cpu: 250m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
14. 模式决策矩阵
下表帮助你根据具体场景选择合适的微服务模式。
| 场景 | 推荐模式 | 关键考量 |
|---|---|---|
| 从单体迁移 | Strangler Fig | 渐进式,低风险 |
| 跨服务事务 | Saga (Orchestration) | 需要补偿逻辑 |
| 高频读低频写 | CQRS | 读写模型独立扩展 |
| 完整审计日志 | Event Sourcing | 事件存储增长需管理 |
| 防止级联故障 | Circuit Breaker + Bulkhead | 配合重试和降级 |
| 请求链路追踪 | OpenTelemetry | 所有服务必须集成 |
| 零信任安全 | Service Mesh (Istio) | 自动 mTLS,运维开销较大 |
| 事件驱动解耦 | Async Messaging (Kafka) | 接受最终一致性 |
总结
微服务架构不是目标,而是实现业务敏捷性和可扩展性的手段。成功的关键在于:理解每个模式的适用场景和代价,以 DDD 限界上下文驱动服务分解,用 Saga 管理分布式事务,用 CQRS/事件溯源解决复杂查询和审计需求,用熔断器和隔舱保障弹性,用 OpenTelemetry 实现可观测性。从小处着手,按需引入模式,避免过度设计。
记住:分布式系统的复杂性是真实的代价。如果一个模块化的单体能满足需求,那就是最好的选择。只有当团队规模、部署频率和业务域复杂度真正需要时,才逐步拆分为微服务。