Kubernetes (K8s) is the industry-standard container orchestration platform. Core concepts: Pod (smallest deployable unit), Deployment (manages replicas and rolling updates), Service (stable network endpoint), ConfigMap/Secret (configuration management), Ingress (HTTP routing), PersistentVolume (durable storage), HPA (automatic horizontal scaling). Use kubectl to interact with the cluster and Helm to package applications as Charts.
- Kubernetes automates deployment, scaling, and self-healing of containerized applications across clusters.
- Pods are the smallest unit; Deployments manage Pods at scale with rolling updates and rollback support.
- Services provide stable DNS names and load balancing; Ingress routes HTTP traffic to multiple services.
- HPA autoscales Pod replicas based on CPU/memory; the Cluster Autoscaler adds/removes nodes automatically.
- Helm packages applications as Charts — reusable, versioned, configurable — and manages release lifecycle.
- PersistentVolumes and StorageClasses provide durable storage for stateful workloads like databases.
- For production, use a managed Kubernetes service (EKS, GKE, AKS) to eliminate control plane operations.
What Kubernetes Is and Why It Matters
Kubernetes (K8s) is an open-source container orchestration system originally designed at Google and donated to the Cloud Native Computing Foundation (CNCF) in 2014. It automates the deployment, scaling, and lifecycle management of containerized applications across clusters of machines. As of 2026, Kubernetes is the de facto standard for running production workloads at scale — from a startup's first microservice to hyperscalers managing millions of containers every day.
Before Kubernetes, teams managed servers by hand: custom deployment scripts, fragile SSH-based deploys, hand-built load balancers. Docker solved the "works on my machine" problem, but running thousands of containers across dozens of servers without an orchestrator was chaos. Kubernetes brought order: you describe your system's desired state in YAML manifests, and Kubernetes continuously reconciles the actual state to match. A container crashes? Kubernetes restarts it. A node goes down? Kubernetes reschedules its workloads onto healthy nodes. Traffic spikes? Kubernetes scales out. Traffic drops? Kubernetes scales back in — all automatically.
The platform ships with self-healing, horizontal scaling, zero-downtime rolling updates, Secret management, service discovery, load balancing, and storage orchestration built in. This guide covers everything from "what is a Pod" to running production-grade applications with Helm, autoscaling, and persistent storage.
Core Kubernetes Features
- Self-healing: automatically restarts failed containers, replaces unresponsive Pods, kills containers that fail health checks, and reschedules workloads onto healthy nodes.
- Horizontal scaling: scale manually with a single command (kubectl scale) or automatically via the HPA based on CPU, memory, or custom metrics.
- Rolling updates and rollbacks: zero-downtime deployments with a configurable rolling update strategy; roll back instantly when something goes wrong.
- Service discovery and load balancing: built-in DNS and kube-proxy distribute traffic across healthy Pod instances with no manual configuration.
- Configuration management: separate application code from configuration via ConfigMaps (non-sensitive) and Secrets (sensitive), injected as environment variables or mounted files.
- Storage orchestration: automatically provision and mount cloud or local storage for Pods using PersistentVolumes and StorageClasses.
- Infrastructure abstraction: the same Kubernetes manifests run on AWS EKS, Google GKE, Azure AKS, or bare metal with minimal changes.
- Extensibility: CustomResourceDefinitions (CRDs) and Operators let you extend Kubernetes with domain-specific automation for databases, messaging systems, and more.
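As a hedged illustration of that extensibility, a minimal CRD could look like the sketch below. The `example.com` group, the `Backup` kind, and every field under `spec.properties` are hypothetical, invented for this example only:

```yaml
# crd.yaml — hypothetical CustomResourceDefinition sketch
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.example.com       # must be <plural>.<group>
spec:
  group: example.com
  names:
    kind: Backup
    plural: backups
    singular: backup
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:          # structural schema is required in v1
          type: object
          properties:
            spec:
              type: object
              properties:
                schedule:
                  type: string    # e.g. a cron expression
                retentionDays:
                  type: integer
```

Once applied, `kubectl get backups` would work like any built-in resource; an Operator would then watch these objects and act on them.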
Kubernetes Cluster Architecture
┌─────────────────────────── CONTROL PLANE ───────────────────────────┐
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────────────────┐ │
│ │ API Server │ │ Scheduler │ │ Controller Manager │ │
│ │ (REST API) │ │ (pod→node) │ │ (node/replica/svc) │ │
│ └──────┬───────┘ └──────────────┘ └────────────────────────┘ │
│ │ │
│ ┌──────┴───────┐ ┌──────────────┐ │
│ │ etcd │ │ Cloud │ │
│ │ (key-value) │ │ Controller │ │
│ └──────────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
│ │ │
▼ ▼ ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ Worker Node 1 │ │ Worker Node 2 │ │ Worker Node 3 │
│ │ │ │ │ │
│ ┌────────────┐ │ │ ┌────────────┐ │ │ ┌────────────┐ │
│ │ kubelet │ │ │ │ kubelet │ │ │ │ kubelet │ │
│ └────────────┘ │ │ └────────────┘ │ │ └────────────┘ │
│ ┌────────────┐ │ │ ┌────────────┐ │ │ ┌────────────┐ │
│ │ kube-proxy │ │ │ │ kube-proxy │ │ │ │ kube-proxy │ │
│ └────────────┘ │ │ └────────────┘ │ │ └────────────┘ │
│ ┌────────────┐ │ │ ┌────────────┐ │ │ ┌────────────┐ │
│ │ containerd │ │ │ │ containerd │ │ │ │ containerd │ │
│ └────────────┘ │ │ └────────────┘ │ │ └────────────┘ │
│ [Pod][Pod][Pod] │ │ [Pod][Pod][Pod] │ │ [Pod][Pod][Pod] │
└──────────────────┘  └──────────────────┘  └──────────────────┘
Core Concepts: Pods, Deployments, Services, and More
Pod — The Smallest Deployable Unit
A Pod is the atomic unit of Kubernetes — the smallest entity you can deploy. A Pod wraps one or more containers that share a network namespace (communicating over localhost) and storage volumes. In practice, most Pods run a single application container, accompanied by sidecar containers for logging, proxying (Envoy/Istio), or Secret injection (Vault Agent).
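A minimal sketch of the sidecar pattern described above — the image names, log path, and shared volume are illustrative assumptions, not taken from this guide:

```yaml
# sidecar-pod.yaml — illustrative two-container Pod
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  volumes:
    - name: app-logs
      emptyDir: {}                       # shared scratch volume, lives as long as the Pod
  containers:
    - name: app
      image: my-registry/web-app:1.0.0   # hypothetical application image
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: log-shipper                  # sidecar: ships logs the app writes
      image: busybox:latest
      command: ["/bin/sh", "-c", "touch /var/log/app/app.log && tail -f /var/log/app/app.log"]
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
```

Both containers see the same `emptyDir` volume and share one network namespace, which is what makes localhost communication and shared files work.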
Deployment — Managing Replicas and Updates
A Deployment is a higher-level abstraction that manages a ReplicaSet, which in turn manages a set of identical Pods. You tell the Deployment "I want 3 replicas of this container image," and it creates, monitors, and maintains exactly 3 running Pods. When you update the image, the Deployment orchestrates a rolling update — starting new Pods before terminating old ones — to ensure zero downtime.
Service — A Stable Network Endpoint
Pods are ephemeral — they can be created, destroyed, and rescheduled at any time, and their IP addresses change. A Service provides a stable virtual IP (ClusterIP) and DNS name that proxies traffic to a set of Pods selected by label. There are four Service types: ClusterIP (internal cluster access), NodePort (access via each node's IP), LoadBalancer (external cloud load balancer), and ExternalName (DNS alias).
ReplicaSet — Ensuring Availability
A ReplicaSet ensures that a specified number of Pod replicas are running at any given time. If a Pod crashes, the ReplicaSet controller immediately creates a replacement. ReplicaSets are rarely created directly; Deployments manage them automatically and add rolling updates and rollback history on top.
Namespace — Logical Isolation
Namespaces divide a single cluster into multiple virtual clusters and are the primary mechanism for multi-tenancy: separating dev, staging, and production environments; isolating teams; or organizing microservices. Resources within a Namespace share the same DNS domain and can be constrained with ResourceQuotas and LimitRanges to prevent one Namespace from consuming all cluster resources.
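As a hedged sketch of those constraints, a ResourceQuota plus LimitRange for a hypothetical team namespace might look like this (the `team-a` namespace and every number are illustrative):

```yaml
# quota.yaml — illustrative per-namespace limits
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-a          # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"        # sum of CPU requests across all Pods
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"               # max Pod count in the namespace
---
apiVersion: v1
kind: LimitRange
metadata:
  name: team-defaults
  namespace: team-a
spec:
  limits:
    - type: Container
      default:               # applied when a container declares no limits
        cpu: 500m
        memory: 256Mi
      defaultRequest:        # applied when a container declares no requests
        cpu: 100m
        memory: 128Mi
```

The quota caps the namespace as a whole, while the LimitRange fills in sane per-container defaults so unconfigured Pods still count against the quota predictably.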
kubectl Commands: A Complete Reference
kubectl is the command-line tool for interacting with a Kubernetes cluster. It talks to the Kubernetes API server and is your primary interface for deploying applications, inspecting resources, debugging problems, and managing the cluster.
Install kubectl and Set Up a Local Cluster
# Install kubectl (macOS)
brew install kubectl
# Install kubectl (Linux)
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install kubectl /usr/local/bin/kubectl
# Verify installation
kubectl version --client
# ── Local Cluster Options ─────────────────────────────────────────
# Option 1: minikube (recommended for beginners)
brew install minikube
minikube start --cpus=4 --memory=8192 --driver=docker
minikube dashboard # Open web UI
minikube stop
# Option 2: kind (Kubernetes in Docker)
brew install kind
kind create cluster --name dev
kind get clusters
kind delete cluster --name dev
# Option 3: Docker Desktop
# Settings → Kubernetes → Enable Kubernetes → Apply & Restart
# Check cluster connection
kubectl cluster-info
kubectl get nodes -o wide
Essential kubectl Commands
# ── Cluster & Nodes ──────────────────────────────────────────────
kubectl cluster-info # Show control plane URL
kubectl get nodes # List all nodes
kubectl get nodes -o wide # Nodes with IP, OS, runtime
kubectl top nodes # CPU & memory usage per node
kubectl describe node <node-name> # Detailed node info & events
# ── Pods ─────────────────────────────────────────────────────────
kubectl get pods # Pods in current namespace
kubectl get pods -A # Pods across all namespaces
kubectl get pods -n staging # Pods in staging namespace
kubectl get pods -o wide # Pods with node & IP
kubectl get pods --show-labels # Show pod labels
kubectl get pods -l app=web-app # Filter by label
kubectl get pods -w # Watch pod status live
kubectl describe pod <pod-name> # Detailed pod info & events
kubectl logs <pod-name> # Container stdout logs
kubectl logs <pod-name> -f # Stream logs (follow)
kubectl logs <pod-name> --previous # Logs from crashed container
kubectl logs <pod-name> -c <container> # Logs from specific container
kubectl exec -it <pod-name> -- bash # Shell into pod
kubectl exec -it <pod-name> -- /bin/sh # Shell (Alpine/BusyBox)
kubectl exec <pod-name> -- env # Print environment variables
kubectl port-forward pod/<pod-name> 8080:80 # Local port 8080 to pod port 80
kubectl port-forward svc/<svc-name> 8080:80 # Local port to service port
kubectl delete pod <pod-name> # Delete pod (Deployment recreates it)
# ── Deployments ──────────────────────────────────────────────────
kubectl get deployments
kubectl get deploy -o wide
kubectl describe deployment <name>
kubectl apply -f deployment.yaml # Create or update
kubectl delete -f deployment.yaml # Delete
kubectl diff -f deployment.yaml # Preview changes
kubectl scale deploy <name> --replicas=5 # Scale manually
kubectl set image deploy/<name> app=myimage:2.0.0 # Update image
kubectl rollout status deploy/<name> # Watch rollout progress
kubectl rollout history deploy/<name> # View revision history
kubectl rollout undo deploy/<name> # Rollback to previous
kubectl rollout undo deploy/<name> --to-revision=3 # Specific revision
# ── Services & Ingress ───────────────────────────────────────────
kubectl get services
kubectl get svc -o wide
kubectl describe svc <name>
kubectl get ingress
kubectl describe ingress <name>
# ── ConfigMaps & Secrets ─────────────────────────────────────────
kubectl get configmaps
kubectl describe configmap <name>
kubectl get secrets
kubectl get secret <name> -o jsonpath="{.data.password}" | base64 --decode
# ── Namespaces ───────────────────────────────────────────────────
kubectl get namespaces
kubectl create namespace staging
kubectl config set-context --current --namespace=staging
# ── Debugging ────────────────────────────────────────────────────
kubectl get events --sort-by=.metadata.creationTimestamp
kubectl get pod <name> -o yaml # Full YAML spec
kubectl run debug-pod --image=busybox:latest -it --rm -- sh
kubectl top pods # Resource usage
# ── All Resources ────────────────────────────────────────────────
kubectl get all # All resources in namespace
kubectl get all -n my-namespace
kubectl get all -A                      # All resources everywhere
YAML Manifests: Pod, Deployment, Service, ConfigMap, Secret
Kubernetes resources are defined as YAML manifests and applied to the cluster with kubectl apply -f. Every manifest has four top-level fields: apiVersion, kind, metadata, and spec. Below are production-ready examples for the most common resource types.
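The four top-level fields can be seen in a stripped-down skeleton (the names here are placeholders, not part of the examples that follow):

```yaml
apiVersion: v1            # API group/version the resource belongs to
kind: Pod                 # resource type
metadata:                 # identifying data: name, namespace, labels
  name: example
spec:                     # desired state — contents vary by kind
  containers:
    - name: app
      image: nginx:1.27-alpine
```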
Pod Manifest
# pod.yaml — Production-ready Pod definition
apiVersion: v1
kind: Pod
metadata:
name: my-app
namespace: production
labels:
app: my-app
version: v1.0.0
tier: backend
spec:
containers:
- name: app
image: nginx:1.27-alpine
ports:
- containerPort: 80
protocol: TCP
resources:
requests:
memory: "64Mi"
cpu: "100m"
limits:
memory: "128Mi"
cpu: "250m"
livenessProbe:
httpGet:
path: /healthz
port: 80
initialDelaySeconds: 10
periodSeconds: 15
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: 80
initialDelaySeconds: 5
periodSeconds: 10
startupProbe:
httpGet:
path: /healthz
port: 80
failureThreshold: 30
periodSeconds: 5
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 5"]
terminationGracePeriodSeconds: 30
restartPolicy: Always
Deployment Manifest (Production-Ready)
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-app
namespace: production
labels:
app: web-app
annotations:
kubernetes.io/change-cause: "Release v1.2.0 — performance improvements"
spec:
replicas: 3
selector:
matchLabels:
app: web-app
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1 # Max 1 extra Pod during update
maxUnavailable: 0 # No downtime
template:
metadata:
labels:
app: web-app
version: "1.2.0"
spec:
containers:
- name: web
image: my-registry/web-app:1.2.0
ports:
- containerPort: 3000
env:
- name: NODE_ENV
value: "production"
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: db-credentials
key: password
envFrom:
- configMapRef:
name: app-config
resources:
requests:
memory: "128Mi"
cpu: "200m"
limits:
memory: "256Mi"
cpu: "500m"
livenessProbe:
httpGet:
path: /health
port: 3000
initialDelaySeconds: 10
periodSeconds: 15
readinessProbe:
httpGet:
path: /ready
port: 3000
initialDelaySeconds: 5
periodSeconds: 5
lifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 10"]
terminationGracePeriodSeconds: 60
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
whenUnsatisfiable: DoNotSchedule
labelSelector:
matchLabels:
app: web-app
Service Manifest — All Four Types
# ClusterIP — internal cluster access only (default)
apiVersion: v1
kind: Service
metadata:
name: web-app-svc
namespace: production
spec:
type: ClusterIP
selector:
app: web-app
ports:
- protocol: TCP
port: 80
targetPort: 3000
---
# NodePort — accessible at <NodeIP>:30080 from outside cluster
apiVersion: v1
kind: Service
metadata:
name: web-app-nodeport
spec:
type: NodePort
selector:
app: web-app
ports:
- port: 80
targetPort: 3000
nodePort: 30080 # Must be 30000–32767
---
# LoadBalancer — provisions cloud LB (EKS/GKE/AKS)
apiVersion: v1
kind: Service
metadata:
name: web-app-lb
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
type: LoadBalancer
selector:
app: web-app
ports:
- port: 443
targetPort: 3000
ConfigMap and Secret
# configmap.yaml — non-sensitive configuration
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
namespace: production
data:
NODE_ENV: "production"
LOG_LEVEL: "info"
PORT: "3000"
MAX_CONNECTIONS: "100"
app.json: |
{
"features": {
"darkMode": true,
"betaFeatures": false
},
"pagination": {
"defaultPageSize": 20
}
}
---
# secret.yaml — sensitive data (base64-encoded in etcd)
apiVersion: v1
kind: Secret
metadata:
name: db-credentials
namespace: production
type: Opaque
stringData:
username: admin
password: "super-secret-password-change-me"
url: "postgres://admin:password@postgres:5432/myapp"
---
# TLS Secret — for HTTPS certificates
apiVersion: v1
kind: Secret
metadata:
name: tls-cert
namespace: production
type: kubernetes.io/tls
data:
tls.crt: <base64-encoded-cert>
tls.key: <base64-encoded-key>
Helm: The Kubernetes Package Manager
Helm is the package manager for Kubernetes, analogous to apt for Debian or npm for Node.js. A Helm Chart is a set of templated Kubernetes manifests with configurable values. Charts are shared through repositories (Artifact Hub hosts thousands of community Charts for databases, monitoring, Ingress controllers, and more). Helm manages the full release lifecycle: install, upgrade, rollback, and uninstall.
Helm v3 (the current version) removed Helm v2's server-side Tiller component. Charts are versioned, and releases can be inspected, diffed, and rolled back. For teams that customize per environment, Helm values files give a clean separation between Chart logic and deployment-specific configuration.
Helm Installation and Basic Commands
# Install Helm (macOS)
brew install helm
# Install Helm (Linux)
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Verify
helm version
# ── Repository Management ────────────────────────────────────────
helm repo add stable https://charts.helm.sh/stable
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update # Refresh repo indices
helm repo list # Show configured repos
# ── Installing Charts ────────────────────────────────────────────
# Install NGINX Ingress Controller
helm install ingress-nginx ingress-nginx/ingress-nginx \
--namespace ingress-nginx \
--create-namespace
# Install PostgreSQL with custom values
helm install my-postgres bitnami/postgresql \
--namespace production \
--set auth.postgresPassword=secret123 \
--set primary.persistence.size=20Gi \
--set primary.resources.requests.memory=256Mi
# Install with a values file (preferred for complex configs)
helm install my-release my-chart/ -f values-production.yaml
# ── Release Management ───────────────────────────────────────────
helm list # List all releases
helm list -n production # Releases in namespace
helm status my-postgres # Release status
helm history my-postgres # Release history
# Upgrade a release
helm upgrade my-postgres bitnami/postgresql \
--namespace production \
--set auth.postgresPassword=secret123 \
--set primary.persistence.size=30Gi
# Rollback to previous version
helm rollback my-postgres 1
# Uninstall a release
helm uninstall my-postgres -n production
# ── Chart Development ────────────────────────────────────────────
helm create my-app # Scaffold a new chart
helm lint my-app/ # Validate chart syntax
helm template my-app my-app/ -f values.yaml # Render templates locally
helm install my-app my-app/ --dry-run --debug
helm package my-app/                         # Package chart as .tgz
Helm Values File Example
# values-production.yaml
# Override default chart values for production environment
replicaCount: 3
image:
repository: my-registry/web-app
tag: "1.2.0"
pullPolicy: IfNotPresent
service:
type: ClusterIP
port: 80
ingress:
enabled: true
className: nginx
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
hosts:
- host: app.example.com
paths:
- path: /
pathType: Prefix
tls:
- secretName: app-tls
hosts:
- app.example.com
resources:
requests:
memory: 256Mi
cpu: 200m
limits:
memory: 512Mi
cpu: 500m
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 70
config:
NODE_ENV: production
LOG_LEVEL: info
postgresql:
enabled: true
auth:
database: myapp
existingSecret: db-credentials
primary:
persistence:
size: 20Gi
storageClass: gp3
Kubernetes Networking: ClusterIP, NodePort, LoadBalancer, Ingress
Kubernetes networking operates at three layers: Pod-to-Pod communication (handled by a CNI plugin — Flannel, Calico, Cilium), Service-based load balancing (kube-proxy), and external access (Ingress controllers). Understanding each layer is essential for designing resilient, secure application architectures.
Ingress — HTTP Routing with TLS
# ingress.yaml — Route traffic to multiple services via one entry point
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: app-ingress
namespace: production
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/proxy-body-size: "50m"
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
ingressClassName: nginx
tls:
- hosts:
- app.example.com
- api.example.com
secretName: app-tls-secret
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: frontend-svc
port:
number: 80
- host: api.example.com
http:
paths:
- path: /v1
pathType: Prefix
backend:
service:
name: api-v1-svc
port:
number: 80
- path: /v2
pathType: Prefix
backend:
service:
name: api-v2-svc
port:
number: 80
---
# ClusterIssuer — Let's Encrypt production certificates
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: admin@example.com
privateKeySecretRef:
name: letsencrypt-prod
solvers:
- http01:
ingress:
class: nginx
NetworkPolicy — Zero-Trust Pod Networking
# networkpolicy.yaml — Deny all, then allow only required paths
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: api-network-policy
namespace: production
spec:
podSelector:
matchLabels:
app: api-server
policyTypes:
- Ingress
- Egress
ingress:
- from:
- podSelector:
matchLabels:
app: frontend # Only frontend can call API
- namespaceSelector:
matchLabels:
name: monitoring # Prometheus can scrape metrics
ports:
- protocol: TCP
port: 3000
egress:
- to:
- podSelector:
matchLabels:
app: postgres # API can only reach the DB
ports:
- protocol: TCP
port: 5432
Storage: PersistentVolume, PersistentVolumeClaim, StorageClass
Container storage is ephemeral by default — data is lost when the container restarts. For stateful applications (databases, caches, file storage), Kubernetes provides the PersistentVolume (PV) subsystem. A PV is a piece of storage in the cluster, provisioned in advance by an administrator or dynamically via a StorageClass. A PersistentVolumeClaim (PVC) is a user's request for storage, specifying size and access mode. Kubernetes binds the PVC to a matching PV.
StorageClasses enable dynamic provisioning: instead of pre-creating PVs, the cluster provisions storage automatically when a PVC is created (EBS on AWS, Persistent Disk on GCP, Azure Disk). This is standard practice on managed Kubernetes. StatefulSets use volumeClaimTemplates to give each Pod instance its own stable, uniquely named PVC that persists across Pod rescheduling.
StorageClass, PVC, and StatefulSet
# storageclass.yaml — Dynamic provisioning on AWS
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Retain
parameters:
type: gp3
iops: "3000"
throughput: "125"
encrypted: "true"
---
# pvc.yaml — Request 20Gi of SSD storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgres-data
namespace: production
spec:
accessModes:
- ReadWriteOnce
storageClassName: fast-ssd
resources:
requests:
storage: 20Gi
---
# StatefulSet — PostgreSQL with per-pod stable storage
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres
namespace: production
spec:
serviceName: postgres
replicas: 1
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:16-alpine
ports:
- containerPort: 5432
env:
- name: POSTGRES_DB
value: myapp
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: db-credentials
key: username
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: db-credentials
key: password
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
volumeMounts:
- name: postgres-storage
mountPath: /var/lib/postgresql/data
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi"
cpu: "1"
livenessProbe:
exec:
command: ["pg_isready", "-U", "admin", "-d", "myapp"]
initialDelaySeconds: 30
periodSeconds: 10
volumeClaimTemplates:
- metadata:
name: postgres-storage
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: fast-ssd
resources:
requests:
storage: 20Gi
---
# Headless service for StatefulSet DNS
apiVersion: v1
kind: Service
metadata:
name: postgres
namespace: production
spec:
clusterIP: None # Headless — no virtual IP
selector:
app: postgres
ports:
- port: 5432
targetPort: 5432
Autoscaling: HPA, VPA, and the Cluster Autoscaler
Kubernetes offers three complementary autoscaling mechanisms that operate at different layers of the stack. The Horizontal Pod Autoscaler (HPA) scales the number of Pod replicas based on metrics. The Vertical Pod Autoscaler (VPA) adjusts the resource requests and limits of existing Pods. The Cluster Autoscaler adjusts the number of nodes in the cluster based on pending Pods and node utilization.
For most applications, the HPA is the primary autoscaling tool. It queries the Metrics API (CPU, memory, or custom metrics via the Prometheus Adapter) and scales the target Deployment or StatefulSet, with configurable stabilization windows that prevent thrashing.
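The HPA's core calculation is desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). A small shell sketch of that arithmetic — the observed values below are assumed for illustration, not read from a real cluster:

```shell
# Hedged sketch of the HPA scaling formula:
#   desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)
current_replicas=3
current_cpu=90    # assumed observed average CPU utilization (%)
target_cpu=70     # assumed HPA target utilization (%)

desired=$(awk -v r="$current_replicas" -v c="$current_cpu" -v t="$target_cpu" \
  'BEGIN { d = r * c / t; if (d > int(d)) d = int(d) + 1; print d }')
echo "desired replicas: $desired"   # 3 * 90 / 70 = 3.86, rounded up to 4
```

Because the result always rounds up, utilization even slightly above target triggers a scale-up, while the scaleDown stabilization window shown in the manifest below damps the reverse direction.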
Horizontal Pod Autoscaler (HPA)
# hpa.yaml — Scale web-app pods based on CPU and memory
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: web-app-hpa
namespace: production
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
minReplicas: 3
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: "1000"
behavior:
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Pods
value: 4
periodSeconds: 60
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 10
periodSeconds: 60
Vertical Pod Autoscaler (VPA)
# vpa.yaml — Automatically tune resource requests and limits
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: web-app-vpa
namespace: production
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
updatePolicy:
updateMode: "Auto"
resourcePolicy:
containerPolicies:
- containerName: web
minAllowed:
cpu: 100m
memory: 128Mi
maxAllowed:
cpu: "2"
memory: 2Gi
controlledResources: ["cpu", "memory"]
Test Autoscaling with Load Generation
# Apply HPA
kubectl apply -f hpa.yaml
# Check current HPA status
kubectl get hpa -n production
kubectl describe hpa web-app-hpa -n production
# Generate CPU load to trigger autoscaling
kubectl run load-generator \
--image=busybox:latest \
--rm -it \
--restart=Never \
-- /bin/sh -c "while true; do wget -q -O- http://web-app-svc/; done"
# In a second terminal: watch pods scale up
kubectl get pods -n production -w
# Watch HPA metrics update live
kubectl get hpa -n production -w
# Stop load generator (Ctrl+C), watch pods scale down after 5 min
# Cluster Autoscaler adds nodes when pods are Pending
kubectl get nodes -w
Kubernetes vs Docker Swarm vs HashiCorp Nomad
Kubernetes is not the only container orchestrator. Docker Swarm is simpler and built into Docker Engine. HashiCorp Nomad is a general-purpose orchestrator that handles containers, plain binaries, Java JARs, and more. Here is how they compare on the dimensions that matter most in production:
| Feature | Kubernetes | Docker Swarm | HashiCorp Nomad |
|---|---|---|---|
| Origin | Google → CNCF (2014) | Docker Inc. (2016) | HashiCorp (2015) |
| Workloads | OCI containers | Docker containers only | Containers, binaries, JARs, VMs |
| Learning Curve | High | Low | Medium |
| Setup Complexity | High (many components) | Very low (built-in to Docker) | Low (single binary) |
| Auto-scaling | HPA, VPA, Cluster Autoscaler | Basic / manual | Dynamic task placement |
| Self-healing | Full reschedule + restart | Basic restart policies | Job re-scheduling |
| Networking | CNI plugins, NetworkPolicy, full SDN | Overlay network, basic routing | Consul Connect, CNI plugins |
| Storage | PV/PVC/StorageClass ecosystem | Docker volumes | Host volumes, CSI plugins |
| Rolling Updates | Native, configurable | Built-in, limited options | Canary, blue-green, rolling |
| Service Discovery | CoreDNS + kube-proxy | Built-in DNS (VIP) | Consul (first-class) |
| RBAC | Granular, mature | Basic Swarm secrets | ACL policies via Consul/Vault |
| Ecosystem | Enormous (CNCF, Helm, Operators) | Small, Docker-centric | HashiCorp stack integration |
| Managed Cloud | EKS, GKE, AKS, DOKS, and more | Not offered by major clouds | HCP Nomad (HashiCorp Cloud) |
| Resource Overhead | High (~2GB control plane) | Minimal (embedded in Docker) | Low (~256MB single agent) |
| Best For | Large-scale microservices, cloud-native | Simple teams, small scale | Mixed workloads (containers + legacy) |
Choose Kubernetes when:
- Running 10+ microservices that need independent scaling
- Production workloads requiring zero-downtime deploys and auto-rollback
- Your team needs fine-grained RBAC and namespace-level resource quotas
- You want the richest ecosystem (Helm, Operators, service mesh, Prometheus)
- You are deploying to AWS, GCP, or Azure with managed control plane offerings
Choose Docker Swarm when:
- Small team with simple Docker-based apps running on 1-5 servers
- You want zero additional tooling beyond Docker Engine
- Quick setup is more valuable than advanced features
- Migrating from Docker Compose to basic orchestration with minimal relearning
- Limited budget for DevOps expertise
Choose HashiCorp Nomad when:
- Mixed workloads: containers AND legacy binaries, Java JARs, or VMs
- Already invested in the HashiCorp stack (Consul, Vault, Terraform)
- You need Kubernetes-grade features with significantly lower operational overhead
- Edge deployments or multi-datacenter workloads across heterogeneous infrastructure
- Compliance requires workload isolation without full Kubernetes complexity
Complete Production Application Example
The following manifest deploys a Node.js API with PostgreSQL, Ingress, and HPA in a single file. Apply with kubectl apply -f complete-app.yaml.
# complete-app.yaml
# 1. Namespace
apiVersion: v1
kind: Namespace
metadata:
name: myapp
labels:
environment: production
---
# 2. ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
namespace: myapp
data:
NODE_ENV: production
PORT: "3000"
LOG_LEVEL: info
---
# 3. Secret
apiVersion: v1
kind: Secret
metadata:
name: app-secrets
namespace: myapp
type: Opaque
stringData:
DATABASE_URL: postgres://app:secret@postgres:5432/myapp
JWT_SECRET: change-me-in-production
---
# 4. PostgreSQL StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres
namespace: myapp
spec:
serviceName: postgres
replicas: 1
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:16-alpine
env:
- { name: POSTGRES_DB, value: myapp }
- { name: POSTGRES_USER, value: app }
- { name: POSTGRES_PASSWORD, value: secret }
volumeMounts:
- { name: pg-data, mountPath: /var/lib/postgresql/data }
resources:
requests: { memory: 256Mi, cpu: 250m }
limits: { memory: 512Mi, cpu: "1" }
volumeClaimTemplates:
- metadata:
name: pg-data
spec:
accessModes: [ReadWriteOnce]
resources:
requests:
storage: 10Gi
---
# 5. Postgres Service
apiVersion: v1
kind: Service
metadata:
name: postgres
namespace: myapp
spec:
type: ClusterIP
selector:
app: postgres
ports:
- { port: 5432, targetPort: 5432 }
---
# 6. App Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-app
namespace: myapp
spec:
replicas: 3
selector:
matchLabels:
app: web-app
strategy:
type: RollingUpdate
rollingUpdate: { maxSurge: 1, maxUnavailable: 0 }
template:
metadata:
labels:
app: web-app
spec:
containers:
- name: app
image: my-registry/web-app:1.0.0
ports:
- { containerPort: 3000 }
envFrom:
- configMapRef: { name: app-config }
- secretRef: { name: app-secrets }
resources:
requests: { memory: 128Mi, cpu: 100m }
limits: { memory: 256Mi, cpu: 500m }
livenessProbe:
httpGet: { path: /health, port: 3000 }
initialDelaySeconds: 15
readinessProbe:
httpGet: { path: /ready, port: 3000 }
initialDelaySeconds: 5
---
# 7. App Service
apiVersion: v1
kind: Service
metadata:
name: web-app-svc
namespace: myapp
spec:
type: ClusterIP
selector:
app: web-app
ports:
- { port: 80, targetPort: 3000 }
---
# 8. Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: web-app-ingress
namespace: myapp
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
ingressClassName: nginx
tls:
- hosts: [myapp.example.com]
secretName: myapp-tls
rules:
- host: myapp.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: web-app-svc
port: { number: 80 }
---
# 9. HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: web-app-hpa
namespace: myapp
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
minReplicas: 3
maxReplicas: 15
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
Verify the Deployment
# Apply the entire stack at once
kubectl apply -f complete-app.yaml
# Verify all resources are created
kubectl get all -n myapp
# Watch pods reach Running state
kubectl get pods -n myapp -w
# Stream application logs
kubectl logs -n myapp -l app=web-app --tail=50 -f
# Test connectivity from inside the cluster
kubectl run test --image=curlimages/curl:latest --rm -it \
--restart=Never -n myapp \
-- curl http://web-app-svc/health
# Port-forward for local testing
kubectl port-forward -n myapp svc/web-app-svc 8080:80
# Open http://localhost:8080
# Check HPA status
kubectl get hpa -n myapp
# Check TLS certificate (cert-manager)
kubectl get certificate -n myapp
kubectl describe certificate myapp-tls -n myapp
Frequently Asked Questions
What is the difference between a Pod and a container in Kubernetes?
A container is a single process with its own filesystem and process namespace. A Pod is the Kubernetes abstraction that wraps one or more containers sharing a network namespace and storage volumes. Containers within a Pod communicate over localhost. Most Pods hold a single container, but the sidecar pattern uses several — for example, an application container plus an Envoy proxy for service mesh traffic management.
How do I debug a Pod stuck in CrashLoopBackOff?
CrashLoopBackOff means the container keeps crashing and Kubernetes keeps restarting it with exponential backoff. To debug: (1) kubectl describe pod to check the Events section for errors; (2) kubectl logs --previous to read logs from the last crashed instance; (3) kubectl exec -it to get a shell while the container is briefly up; (4) check resource limits — OOMKilled means the container exceeded its memory limit.
What is Helm, and when should I use it?
Helm is the package manager for Kubernetes. It bundles related Kubernetes manifests into reusable, versioned packages called Charts. Use Helm when: (1) deploying complex applications with 5+ interdependent resources; (2) you need per-environment configuration (dev, staging, production) without duplicating manifests; (3) you want to install community-maintained software from Artifact Hub (PostgreSQL, Redis, NGINX Ingress, Prometheus). For simple single-service apps, plain kubectl apply is usually enough.
How much does it cost to run Kubernetes?
Kubernetes itself is free. Cluster costs include: the managed control plane fee (EKS/GKE/AKS, $70-150/month), worker node compute ($50-500+/month depending on workload), load balancers ($15-20/month each), and persistent storage ($0.10-0.20 per GB per month). A minimal production cluster on EKS might run $200-400/month. K3s on a single VPS is a lightweight alternative for as little as $10-20/month.
What is the difference between a Deployment and a StatefulSet?
Deployments are for stateless applications where Pods are interchangeable. StatefulSets are for stateful applications (databases, queues) where Pods need: (1) stable, unique network identities (pod-0, pod-1, pod-2); (2) stable persistent storage that follows the Pod across rescheduling; (3) ordered, graceful deployment and scaling. Use Deployments for web servers and APIs; use StatefulSets for PostgreSQL, MongoDB, Kafka, Elasticsearch, and ZooKeeper.
How does Kubernetes achieve zero-downtime deployments?
Kubernetes Deployments use the RollingUpdate strategy by default. With maxUnavailable: 0 and maxSurge: 1, Kubernetes creates one new Pod with the new image, waits for it to become Ready (via the readiness probe), then terminates one old Pod, repeating until all replicas are updated. Combined with a preStop hook and terminationGracePeriodSeconds, in-flight requests finish before old Pods are killed — true zero-downtime deployment.
Which managed Kubernetes service should I choose: EKS, GKE, or AKS?
Google GKE is generally considered the most polished and feature-complete — Google invented Kubernetes, after all. AWS EKS integrates deeply with the AWS ecosystem (IAM, ALB, EBS) and is the best choice if you are all-in on AWS. Azure AKS is the natural fit for Microsoft-stack teams using Azure AD and Azure DevOps. All three offer Autopilot/serverless node management modes that reduce operational overhead even further.
How do I expose a Kubernetes application to the internet?
There are three options: (1) NodePort: exposes the service on a static port on every node's IP — simplest, but not production-grade; (2) LoadBalancer: provisions a cloud load balancer automatically — easiest in the cloud, but each service gets its own LB (costly); (3) Ingress: a single entry point that routes HTTP/HTTPS traffic to multiple services by host and path rules — the recommended production pattern. Use an Ingress controller (NGINX, Traefik, AWS ALB Ingress) with cert-manager for automatic TLS certificates.
Summary
Kubernetes is one of the most transformative technologies in modern infrastructure, but its complexity is real. The key is to learn incrementally: start with Pods and Deployments on a local cluster (minikube or kind), add Services and Ingress, then move on to Helm packaging, HPA autoscaling, and PersistentVolumes for stateful workloads. In production, use a managed service (EKS, GKE, AKS) so you can focus on your applications instead of cluster operations. The time you invest in learning Kubernetes will pay off for years — it is a foundational skill for any developer working in cloud-native environments.