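AWS Services Guide: EC2, S3, RDS, Lambda, ECS, CloudFront, IAM, and Cost Optimization

25 min read | By DevToolBox Team

The Complete AWS Services Guide 2026: EC2, S3, RDS, Lambda, ECS/EKS, CloudFront, IAM, VPC, and Cost Optimization

A comprehensive walkthrough of the core AWS services: compute (EC2, Lambda, ECS/EKS), storage (S3), databases (RDS, DynamoDB), networking (VPC, Route 53, CloudFront), security (IAM), messaging (SQS/SNS), monitoring (CloudWatch), and proven cost-optimization strategies.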

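TL;DR: AWS core services in 60 seconds
  • EC2 provides elastic compute capacity; Spot Instances can cut costs by up to 90%
  • S3 is designed for 99.999999999% (11 nines) durability and stores data at any scale
  • RDS suits relational data; DynamoDB suits key-value and document workloads that need single-digit-millisecond latency
  • Lambda bills per request and execution time; ECS/EKS run containerized microservices
  • VPC + security groups + IAM build a defense-in-depth security model
  • CloudFront + Route 53 can deliver sub-100 ms response times globally
  • SQS/SNS decouple microservices; CloudWatch provides full-stack observability
  • Reserved Instances + Savings Plans + auto scaling can cut the bill by 40-60%
Key takeaways
  • Choosing the right instance type and pricing model is the biggest cost lever
  • Design everything for multi-AZ high availability
  • IAM least privilege is the cornerstone of security; use roles rather than long-lived access keys
  • Serverless architectures (Lambda + DynamoDB + API Gateway) can reduce operational overhead to near zero
  • Infrastructure as code (CloudFormation / CDK / Terraform) is a must for production
  • The six pillars of the AWS Well-Architected Framework guide architecture decisions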

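1. The AWS Service Landscape

Amazon Web Services (AWS) is the world's largest cloud platform by market share, offering more than 200 services across compute, storage, databases, networking, security, analytics, machine learning, and more. Understanding the core services and how they combine is the foundation for building reliable, scalable cloud architectures.

| Category | Core services | Typical uses |
|---|---|---|
| Compute | EC2, Lambda, ECS, EKS | Web servers, APIs, batch processing, microservices |
| Storage | S3, EBS, EFS, Glacier | Object storage, block storage, file systems, archival |
| Database | RDS, DynamoDB, ElastiCache, Aurora | Relational, NoSQL, caching, high-performance OLTP |
| Networking | VPC, Route 53, CloudFront, ELB | Virtual networks, DNS, CDN, load balancing |
| Security | IAM, KMS, WAF, Shield | Identity management, encryption, web firewall, DDoS protection |
| Messaging / integration | SQS, SNS, EventBridge, Step Functions | Message queues, pub/sub, event-driven design, workflows |
| Monitoring | CloudWatch, X-Ray, CloudTrail | Metrics, logs, distributed tracing, auditing |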

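2. EC2: The Foundation of Elastic Compute

Amazon EC2 (Elastic Compute Cloud) provides on-demand, resizable compute capacity, from a single development server to thousands of high-performance compute nodes. Understanding instance types, pricing models, and auto scaling is the key to controlling cost while maintaining performance.

Choosing an instance type

| Family | Optimized for | Example instance | Use cases |
|---|---|---|---|
| T3 / T4g | Burstable performance | t3.medium | Dev environments, lightweight web |
| M6i / M7g | General purpose | m6i.xlarge | Application servers, mid-size databases |
| C6i / C7g | Compute intensive | c6i.2xlarge | Batch processing, scientific modeling, transcoding |
| R6i / R7g | Memory intensive | r6i.4xlarge | In-memory databases, real-time big data |
| P4d / G5 | GPU accelerated | p4d.24xlarge | ML training, graphics rendering, HPC |

EC2 pricing models compared

| Model | Discount | Commitment | Best for |
|---|---|---|---|
| On-Demand | 0% | None | Unpredictable, short-lived workloads |
| Reserved (RI) | Up to 72% | 1 or 3 years | Steady-state baseline workloads |
| Spot | Up to 90% | None (can be interrupted) | Fault-tolerant batch, CI/CD, big data |
| Savings Plans | Up to 72% | 1- or 3-year $/hour commitment | Flexibility across instance types and Regions |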
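
To make the discount column concrete, the sketch below compares the monthly cost of a fleet that runs around the clock under each pricing model. It is only an illustration: the hourly rate is a hypothetical placeholder rather than a quoted AWS price, and the discounts are the "up to" maxima from the table, so real savings will usually be smaller.

```python
HOURS_PER_MONTH = 730  # common billing approximation (8760 hours / 12)

# Maximum discounts from the pricing table; actual discounts vary by
# instance family, Region, term length, and payment option.
MAX_DISCOUNT = {
    "on_demand": 0.00,
    "reserved": 0.72,
    "savings_plan": 0.72,
    "spot": 0.90,
}


def monthly_cost(hourly_rate: float, instances: int, model: str) -> float:
    """Best-case monthly cost for a fleet running 24/7 under one pricing model."""
    discount = MAX_DISCOUNT[model]
    return round(hourly_rate * (1 - discount) * HOURS_PER_MONTH * instances, 2)


if __name__ == "__main__":
    rate = 0.10  # hypothetical On-Demand $/hour, for illustration only
    for model in MAX_DISCOUNT:
        print(f"{model:>12}: ${monthly_cost(rate, 4, model):,.2f}/month")
```

In practice a fleet usually mixes models: a committed baseline on Savings Plans or RIs, with Spot or On-Demand capacity for the variable part.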

Auto Scaling configuration example

# EC2 Auto Scaling Group — CloudFormation snippet
AutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    LaunchTemplate:
      LaunchTemplateId: !Ref MyLaunchTemplate
      Version: !GetAtt MyLaunchTemplate.LatestVersionNumber
    MinSize: 2
    MaxSize: 10
    DesiredCapacity: 2
    VPCZoneIdentifier:
      - !Ref PrivateSubnet1
      - !Ref PrivateSubnet2
    TargetGroupARNs:
      - !Ref MyTargetGroup
    HealthCheckType: ELB
    HealthCheckGracePeriod: 300

ScalingPolicy:
  Type: AWS::AutoScaling::ScalingPolicy
  Properties:
    AutoScalingGroupName: !Ref AutoScalingGroup
    PolicyType: TargetTrackingScaling
    TargetTrackingConfiguration:
      PredefinedMetricSpecification:
        PredefinedMetricType: ASGAverageCPUUtilization
      TargetValue: 60.0
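
As a rough mental model for the target tracking policy above, the capacity it steers toward can be approximated as proportional scaling, clamped to the group's size limits. The function below is a simplified approximation under that assumption, not the service's exact algorithm (which also involves CloudWatch alarms, cooldowns, and instance warm-up).

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Approximate the capacity a target tracking policy steers toward.

    Scales the fleet in proportion to how far the metric is from the
    target, then clamps the result to the group's MinSize/MaxSize.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    proposed = math.ceil(current * metric / target)
    return max(min_size, min(max_size, proposed))
```

With the template's values (MinSize 2, MaxSize 10, target 60), two instances averaging 90% CPU would suggest a capacity of three.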

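3. S3: Object Storage That Scales Without Limit

Amazon S3 (Simple Storage Service) offers industry-leading durability (designed for 99.999999999%), availability, and security. A bucket can hold any number of objects, each ranging from a few bytes up to 5 TB, and S3 is widely used for static website hosting, data lakes, backups, and media distribution.

S3 storage classes and cost

| Storage class | Retrieval latency | Minimum storage duration (days) | Use case |
|---|---|---|---|
| S3 Standard | Milliseconds | None | Frequently accessed data |
| S3 Intelligent-Tiering | Milliseconds | None | Unknown or changing access patterns |
| S3 Standard-IA | Milliseconds | 30 | Infrequent access that still needs fast retrieval |
| S3 Glacier Instant | Milliseconds | 90 | Archives accessed about once a quarter |
| S3 Glacier Deep Archive | 12-48 h | 180 | Long-term compliance archives |

S3 lifecycle policy example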

# AWS CLI — S3 lifecycle configuration
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-data-lake \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "ArchiveOldLogs",
        "Status": "Enabled",
        "Filter": { "Prefix": "logs/" },
        "Transitions": [
          { "Days": 30,  "StorageClass": "STANDARD_IA" },
          { "Days": 90,  "StorageClass": "GLACIER" },
          { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
        ],
        "Expiration": { "Days": 2555 }
      }
    ]
  }'
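
To sanity-check a rule like ArchiveOldLogs before applying it, it can help to model which storage class an object should be in at a given age. The sketch below mirrors the thresholds in the rule above; it assumes transitions happen exactly at the configured day, whereas S3 applies lifecycle actions asynchronously, so real transitions can lag.

```python
# (min_age_days, storage_class) pairs mirroring the ArchiveOldLogs rule
TRANSITIONS = [(30, "STANDARD_IA"), (90, "GLACIER"), (365, "DEEP_ARCHIVE")]
EXPIRATION_DAYS = 2555  # roughly 7 years


def storage_class_for_age(age_days: int) -> str:
    """Return the class an object under logs/ should be in at a given age."""
    if age_days >= EXPIRATION_DAYS:
        return "EXPIRED"
    current = "STANDARD"
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            current = storage_class
    return current
```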

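4. RDS and DynamoDB: Relational vs NoSQL

Choosing a database is one of the most consequential architecture decisions. AWS offers fully managed relational databases (RDS/Aurora) and a NoSQL database (DynamoDB), each with its own sweet spot.

RDS / Aurora overview

Amazon RDS supports MySQL, PostgreSQL, MariaDB, Oracle, and SQL Server. Aurora is AWS's own MySQL- and PostgreSQL-compatible engine; AWS states that it delivers up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL, with storage that grows automatically up to 128 TiB.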

# Create a Multi-AZ PostgreSQL RDS instance
aws rds create-db-instance \
  --db-instance-identifier myapp-db \
  --db-instance-class db.r6g.xlarge \
  --engine postgres \
  --engine-version 15.4 \
  --master-username dbadmin \
  --master-user-password "${SECURE_PASSWORD}" \
  --allocated-storage 100 \
  --storage-type gp3 \
  --multi-az \
  --vpc-security-group-ids sg-0abc1234 \
  --db-subnet-group-name myapp-db-subnets \
  --backup-retention-period 7 \
  --storage-encrypted

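DynamoDB key concepts

DynamoDB is a fully serverless key-value and document database with single-digit-millisecond response times. Its core concepts are the partition key, the sort key, global secondary indexes (GSIs), and local secondary indexes (LSIs). Capacity comes in two modes: on-demand and provisioned.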

# DynamoDB table with GSI — CloudFormation
OrdersTable:
  Type: AWS::DynamoDB::Table
  Properties:
    TableName: Orders
    BillingMode: PAY_PER_REQUEST    # On-Demand
    AttributeDefinitions:
      - AttributeName: PK
        AttributeType: S
      - AttributeName: SK
        AttributeType: S
      - AttributeName: GSI1PK
        AttributeType: S
    KeySchema:
      - AttributeName: PK
        KeyType: HASH
      - AttributeName: SK
        KeyType: RANGE
    GlobalSecondaryIndexes:
      - IndexName: GSI1
        KeySchema:
          - AttributeName: GSI1PK
            KeyType: HASH
          - AttributeName: SK
            KeyType: RANGE
        Projection:
          ProjectionType: ALL
    PointInTimeRecoverySpecification:
      PointInTimeRecoveryEnabled: true
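
The generic PK / SK / GSI1PK attribute names in this table are typical of single-table design, where the application composes key values from entity types and IDs. The sketch below shows how an order item might be keyed. The `ORDER#<id>` and `METADATA` values match the Lambda example later in this guide; the `CUSTOMER#<id>` value for GSI1PK and the `total` attribute are hypothetical and only illustrate a "lookup by customer" access pattern.

```python
def order_item(order_id: str, customer_id: str, total: str) -> dict:
    """Build an item for the Orders table using composite string keys."""
    return {
        # Partition key: groups every row that belongs to one order
        "PK": f"ORDER#{order_id}",
        # Sort key: marks this row as the order's header record
        "SK": "METADATA",
        # Hypothetical GSI key: lets GSI1 answer "orders for this customer"
        "GSI1PK": f"CUSTOMER#{customer_id}",
        "total": total,
    }
```

An item built this way could be written with `table.put_item(Item=order_item(...))` and read back with the `get_item` call shown in the Lambda handler.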

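RDS vs DynamoDB decision matrix

| Dimension | RDS / Aurora | DynamoDB |
|---|---|---|
| Data model | Relational / SQL | Key-value / document |
| Latency | Low milliseconds (depends on query complexity) | Single-digit milliseconds (consistent) |
| Scaling | Vertical (read replicas for read scaling) | Automatic horizontal scaling |
| Transactions | Full ACID | Limited transactions (up to 100 items / 4 MB per transaction) |
| Operations | Instance selection, maintenance windows | Fully serverless |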

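5. Lambda: Event-Driven Serverless Compute

AWS Lambda lets you run code without managing servers. Upload your code and Lambda allocates compute, runs it, and scales automatically. You pay only for the compute time you actually use (billed per millisecond) plus a per-request charge, and nothing while the function is idle.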
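
Because billing is driven by request count, duration, and configured memory, a back-of-the-envelope estimate is easy to compute. In the sketch below the two default prices are assumptions (commonly listed x86 rates in us-east-1); they change over time and by Region, and the free tier is ignored, so treat the numbers as placeholders and check the current pricing page.

```python
def lambda_monthly_cost(invocations: int, avg_duration_ms: float, memory_mb: int,
                        price_per_gb_second: float = 0.0000166667,
                        price_per_million_requests: float = 0.20) -> float:
    """Estimate monthly Lambda cost from request count, duration, and memory."""
    # Compute is billed in GB-seconds: duration times configured memory
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute = gb_seconds * price_per_gb_second
    requests = invocations / 1_000_000 * price_per_million_requests
    return round(compute + requests, 2)
```

For example, one million invocations averaging 100 ms at 512 MB come to roughly a dollar a month under these assumed prices, which is why Lambda tends to suit spiky or low-volume workloads.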

Lambda function best practices

# Python Lambda with best practices
import json
import boto3
import os
from aws_lambda_powertools import Logger, Tracer, Metrics
from aws_lambda_powertools.utilities.typing import LambdaContext

# Initialize outside handler (reused across invocations)
logger = Logger()
tracer = Tracer()
metrics = Metrics()
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(os.environ["TABLE_NAME"])

@logger.inject_lambda_context
@tracer.capture_lambda_handler
@metrics.log_metrics(capture_cold_start_metric=True)
def handler(event, context: LambdaContext):
    """Process API Gateway request."""
    try:
        body = json.loads(event["body"])
        order_id = body["order_id"]

        # Single-table design query
        response = table.get_item(
            Key={"PK": f"ORDER#{order_id}", "SK": "METADATA"}
        )

        if "Item" not in response:
            return {"statusCode": 404, "body": "Not found"}

        metrics.add_metric(name="OrderLookup", unit="Count", value=1)
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(response["Item"], default=str)
        }
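    except (KeyError, TypeError, json.JSONDecodeError):
        # Malformed request: missing body, invalid JSON, or no order_id
        return {"statusCode": 400, "body": "Bad request"}
    except Exception:
        logger.exception("Failed to process request")
        return {"statusCode": 500, "body": "Internal error"}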

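Key Lambda limits: a maximum execution time of 15 minutes; a deployment package of 250 MB unzipped (container images can be up to 10 GB); memory from 128 MB to 10 GB; and a default concurrency of 1,000 per Region (which can be raised on request). Cold-start latency depends on the runtime and package size. Provisioned Concurrency keeps execution environments initialized to avoid cold starts, and SnapStart (originally Java-only, now also available for some Python and .NET runtimes) reduces them.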

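6. ECS and EKS: Container Orchestration

When a workload outgrows Lambda's limits (long-running processes, large memory, custom runtimes), containers are the next option. AWS offers two orchestrators: ECS (AWS-native) and EKS (managed Kubernetes). Both support the Fargate serverless launch type.

ECS vs EKS comparison

| Dimension | ECS | EKS |
|---|---|---|
| Learning curve | Low: AWS concepts only | High: requires Kubernetes knowledge |
| Multi-cloud portability | AWS only | Yes (standard Kubernetes) |
| Control plane cost | Free | ~$73/month per cluster ($0.10/hour) |
| Ecosystem | Deep AWS service integration | CNCF / Helm / Istio and more |
| Fargate | Fully supported | Supported (with some limitations) |

ECS Fargate task definition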

# ECS Fargate Task Definition — CloudFormation
TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: myapp-api
    Cpu: "512"
    Memory: "1024"
    NetworkMode: awsvpc
    RequiresCompatibilities: [FARGATE]
    ExecutionRoleArn: !GetAtt ECSExecutionRole.Arn
    TaskRoleArn: !GetAtt ECSTaskRole.Arn
    ContainerDefinitions:
      - Name: api
        Image: !Sub "${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/myapp:latest"
        PortMappings:
          - ContainerPort: 8080
        LogConfiguration:
          LogDriver: awslogs
          Options:
            awslogs-group: /ecs/myapp-api
            awslogs-region: !Ref AWS::Region
            awslogs-stream-prefix: ecs
        Environment:
          - Name: NODE_ENV
            Value: production
        Secrets:
          - Name: DB_PASSWORD
            ValueFrom: !Ref DbPasswordSecret

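7. CloudFront and Route 53: Global Content Delivery and DNS

CloudFront is AWS's CDN. It caches content at 400+ edge locations worldwide, which can bring latency for cached content down to single-digit milliseconds for nearby users. Route 53 is a highly available DNS service that combines domain registration, DNS routing, and health checks.

Route 53 routing policies

| Policy | Description | Use case |
|---|---|---|
| Simple | Standard routing to a single resource | Single endpoint |
| Weighted | Splits traffic by weight | Blue/green and canary deployments |
| Latency-based | Routes to the lowest-latency Region | Multi-Region applications |
| Failover | Automatic failover driven by health checks | Active-passive disaster recovery |
| Geolocation | Routes by the user's geographic location | Compliance and content localization |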
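
For weighted routing, each record's share of DNS answers is its weight divided by the sum of all weights in the record set. The sketch below computes that expected split; the record names are hypothetical, and DNS caching by resolvers means observed traffic only approximates these fractions.

```python
def traffic_share(weights: dict[str, int]) -> dict[str, float]:
    """Return the expected fraction of DNS answers for each weighted record."""
    total = sum(weights.values())
    if total == 0:
        # When every weight is 0, Route 53 answers with equal probability
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}
```

A canary rollout might start at `{"blue": 90, "green": 10}` and shift weight toward green as confidence grows.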

CloudFront distribution configuration

# CloudFront with S3 origin — CloudFormation
CloudFrontDistribution:
  Type: AWS::CloudFront::Distribution
  Properties:
    DistributionConfig:
      Origins:
        - Id: S3Origin
          DomainName: !GetAtt MyBucket.RegionalDomainName
          S3OriginConfig:
            OriginAccessIdentity: ""
          OriginAccessControlId: !Ref OAC
      DefaultCacheBehavior:
        TargetOriginId: S3Origin
        ViewerProtocolPolicy: redirect-to-https
        CachePolicyId: 658327ea-f89d-4fab-a63d-7e88639e58f6  # CachingOptimized
        Compress: true
      ViewerCertificate:
        AcmCertificateArn: !Ref Certificate
        SslSupportMethod: sni-only
        MinimumProtocolVersion: TLSv1.2_2021
      Enabled: true
      HttpVersion: http2and3
      PriceClass: PriceClass_100

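8. IAM: Identity and Access Management

IAM is the foundation of AWS security. It controls who (an identity) can perform which actions on which resources. An IAM policy is a JSON document that defines an Effect (Allow/Deny), Action, Resource, and an optional Condition. Following the principle of least privilege is the first rule of secure operations.

IAM best-practices checklist

1) Enable MFA on the root account and lock it away; do not use it for day-to-day work.
2) Create a separate IAM user or role for each person and service.
3) Use IAM roles (not access keys) to grant permissions to EC2 and Lambda.
4) Attach policies to groups rather than to individual users.
5) Use AWS Organizations with SCPs for cross-account governance.
6) Audit permissions regularly with IAM Access Analyzer.
7) Manage local credentials with aws-vault or SSO.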

# Least-privilege IAM policy example
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3ReadOnly",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-app-data",
        "arn:aws:s3:::my-app-data/*"
      ]
    },
    {
      "Sid": "AllowDynamoDBCRUD",
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:DeleteItem",
        "dynamodb:Query"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/Orders",
      "Condition": {
        "ForAllValues:StringEquals": {
          "dynamodb:LeadingKeys": ["${aws:PrincipalTag/tenant_id}"]
        }
      }
    }
  ]
}
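
The way IAM combines statements can be summarized as: an explicit Deny always wins, otherwise a matching Allow grants access, and anything else is implicitly denied. The sketch below is a deliberately simplified model of that rule using shell-style wildcards. It ignores Condition blocks, NotAction/NotResource, principals, SCPs, and permission boundaries, so it illustrates the evaluation order rather than replacing the IAM policy simulator.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return value if isinstance(value, list) else [value]


def _matches(statement: dict, action: str, resource: str) -> bool:
    return (any(fnmatchcase(action, p) for p in _as_list(statement["Action"]))
            and any(fnmatchcase(resource, p) for p in _as_list(statement["Resource"])))


def is_allowed(policy: dict, action: str, resource: str) -> bool:
    """Explicit Deny beats Allow; no matching Allow means implicit deny."""
    matching = [s for s in policy["Statement"] if _matches(s, action, resource)]
    if any(s["Effect"] == "Deny" for s in matching):
        return False
    return any(s["Effect"] == "Allow" for s in matching)


# The S3 statement from the policy above, without the DynamoDB statement
POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": ["arn:aws:s3:::my-app-data", "arn:aws:s3:::my-app-data/*"],
    }],
}
```

Under this model, reading an object in my-app-data is allowed, while writing to it or reading from any other bucket falls through to the implicit deny.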

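9. VPC: Network Isolation and Architecture

A VPC (Virtual Private Cloud) is an isolated virtual network inside AWS. A production-grade VPC typically contains public and private subnets spread across multiple Availability Zones, an internet gateway, NAT gateways, security groups, and network ACLs. Sound VPC design underpins both security and availability.

Security groups vs network ACLs

| Feature | Security group | Network ACL |
|---|---|---|
| Scope | Instance / ENI level | Subnet level |
| State | Stateful | Stateless |
| Rule types | Allow rules only | Allow and deny rules |
| Evaluation | All rules evaluated together | Rules matched in numbered order |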
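
The difference in evaluation order is easiest to see side by side. The sketch below is a toy model that matches on destination port only (real rules also match protocol, CIDR ranges, and direction); the `Rule` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    number: int  # rule number; evaluation order for a network ACL
    port: int
    allow: bool


def nacl_allows(rules: list[Rule], port: int) -> bool:
    """Network ACL: the lowest-numbered matching rule wins; default is deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port:
            return rule.allow
    return False


def security_group_allows(allowed_ports: set[int], port: int) -> bool:
    """Security group: allow rules only, so any match permits the traffic."""
    return port in allowed_ports
```

In the NACL model, a deny on port 22 at rule 100 overrides an allow for the same port at rule 200, which is why rule numbering matters there and not in security groups.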

Production-grade VPC architecture

# Three-tier VPC architecture — Terraform
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.5.0"

  name = "production-vpc"
  cidr = "10.0.0.0/16"

  azs              = ["us-east-1a", "us-east-1b", "us-east-1c"]
  public_subnets   = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  private_subnets  = ["10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24"]
  database_subnets = ["10.0.21.0/24", "10.0.22.0/24", "10.0.23.0/24"]

  enable_nat_gateway     = true
  single_nat_gateway     = false  # One per AZ for HA
  one_nat_gateway_per_az = true

  enable_dns_hostnames = true
  enable_dns_support   = true

  # VPC Flow Logs
  enable_flow_log                      = true
  create_flow_log_cloudwatch_log_group = true
  create_flow_log_iam_role             = true

  tags = {
    Environment = "production"
    Terraform   = "true"
  }
}
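
Subnet plans like the one above are easy to get wrong when edited by hand. A small check with Python's standard `ipaddress` module can confirm that every subnet sits inside the VPC CIDR and that no two subnets overlap. This is a sketch for pre-flight validation; Terraform and AWS perform their own checks at apply time.

```python
import ipaddress
from itertools import combinations

VPC_CIDR = "10.0.0.0/16"
SUBNETS = [
    "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24",     # public
    "10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24",  # private
    "10.0.21.0/24", "10.0.22.0/24", "10.0.23.0/24",  # database
]


def validate_subnets(vpc_cidr: str, subnets: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the plan is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    nets = [ipaddress.ip_network(s) for s in subnets]
    problems = [f"{n} is outside {vpc}" for n in nets if not n.subnet_of(vpc)]
    problems += [f"{a} overlaps {b}"
                 for a, b in combinations(nets, 2) if a.overlaps(b)]
    return problems
```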

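10. SQS and SNS: Asynchronous Messaging and Notifications

Loose coupling is a key principle of microservice architecture. SQS (message queues) and SNS (publish/subscribe) are the two core AWS services for asynchronous communication. They are often used together: SNS fans a message out to multiple SQS queues so that consumers can process it in parallel.

SQS standard queues vs FIFO queues

| Feature | Standard | FIFO |
|---|---|---|
| Throughput | Nearly unlimited | 300 TPS (3,000 with batching) |
| Ordering | Best effort | Strict ordering |
| Delivery | At least once (duplicates possible) | Exactly-once processing |
| Use cases | High-throughput workloads | Order processing, financial transactions |

SNS + SQS fan-out pattern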

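# SNS topic fans out to multiple SQS queues
# (CloudFormation snippet)
OrderTopic:
  Type: AWS::SNS::Topic
  Properties:
    TopicName: order-events

# Dead-letter queue referenced by PaymentQueue (queue name is illustrative)
PaymentDLQ:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: payment-processing-dlq

PaymentQueue:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: payment-processing
    VisibilityTimeout: 300
    RedrivePolicy:
      deadLetterTargetArn: !GetAtt PaymentDLQ.Arn
      maxReceiveCount: 3

InventoryQueue:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: inventory-update
    VisibilityTimeout: 300

NotificationQueue:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: customer-notification

# Allow the topic to deliver to the queue; without this policy
# SNS cannot send messages to SQS
PaymentQueuePolicy:
  Type: AWS::SQS::QueuePolicy
  Properties:
    Queues:
      - !Ref PaymentQueue
    PolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            Service: sns.amazonaws.com
          Action: sqs:SendMessage
          Resource: !GetAtt PaymentQueue.Arn
          Condition:
            ArnEquals:
              aws:SourceArn: !Ref OrderTopic

# Subscribe each queue to the topic (payment shown; repeat the
# policy and subscription for the other queues)
PaymentSub:
  Type: AWS::SNS::Subscription
  Properties:
    TopicArn: !Ref OrderTopic
    Protocol: sqs
    Endpoint: !GetAtt PaymentQueue.Arn
    FilterPolicy:
      event_type: [order_placed, order_updated]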
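
The FilterPolicy on PaymentSub means the payment queue only receives messages whose `event_type` message attribute is one of the listed values. The sketch below models that matching for exact string values only. Real SNS filter policies also support prefix, numeric, and anything-but operators, which this simplified version leaves out.

```python
def matches_filter_policy(policy: dict[str, list[str]],
                          attributes: dict[str, str]) -> bool:
    """True if every policy key is present and its value is in the allowed list."""
    return all(attributes.get(key) in allowed for key, allowed in policy.items())


PAYMENT_FILTER = {"event_type": ["order_placed", "order_updated"]}
```

A message published with `event_type=order_cancelled`, or with no `event_type` attribute at all, would be skipped for this subscription while still reaching subscribers without a filter.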

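11. CloudWatch: Full-Stack Observability

CloudWatch provides metrics, logs, alarms, and dashboards, and serves as AWS's unified observability platform. Combined with X-Ray for distributed tracing and CloudTrail for API audit logging, the three form a complete observability toolkit.

Custom metrics and alarms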

# Python — publish custom CloudWatch metric
import boto3
from datetime import datetime, timezone

cloudwatch = boto3.client("cloudwatch")

def publish_order_metric(order_total: float, region: str):
    cloudwatch.put_metric_data(
        Namespace="MyApp/Orders",
        MetricData=[
            {
                "MetricName": "OrderValue",
                "Dimensions": [
                    {"Name": "Region", "Value": region},
                ],
                # Timezone-aware UTC; datetime.utcnow() is deprecated since Python 3.12
                "Timestamp": datetime.now(timezone.utc),
                "Value": order_total,
                "Unit": "None",
                "StorageResolution": 60   # standard (1-min)
            }
        ]
    )

CloudWatch Alarm configuration

# CloudWatch Alarm — CloudFormation
HighCPUAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: high-cpu-api-cluster
    AlarmDescription: "CPU > 80% for 5 minutes"
    Namespace: AWS/ECS
    MetricName: CPUUtilization
    Dimensions:
      - Name: ClusterName
        Value: !Ref ECSCluster
      - Name: ServiceName
        Value: !GetAtt ECSService.Name
    Statistic: Average
    Period: 300
    EvaluationPeriods: 1
    Threshold: 80
    ComparisonOperator: GreaterThanThreshold
    AlarmActions:
      - !Ref OpsNotificationTopic
    OKActions:
      - !Ref OpsNotificationTopic
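
The alarm above fires when the average over one 300-second period exceeds 80. The sketch below models that evaluation for a list of per-period averages. It is a simplification: it ignores the DatapointsToAlarm ("M out of N") setting and the configurable treatment of missing data.

```python
def alarm_state(datapoints: list[float], threshold: float,
                evaluation_periods: int) -> str:
    """Return ALARM when the most recent N period averages all exceed the threshold."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    # GreaterThanThreshold is a strict comparison
    return "ALARM" if all(value > threshold for value in recent) else "OK"
```

Raising EvaluationPeriods from 1 to, say, 3 makes the alarm less sensitive to short spikes at the cost of slower detection.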

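CloudWatch Logs Insights provides a purpose-built, pipe-based query language for quickly pinpointing problems in large volumes of logs. Example query:

# Find the 10 slowest API requests (set the query time range to the last hour)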
fields @timestamp, @message
| filter @message like /duration/
| parse @message "duration=* ms" as duration_ms
| sort duration_ms desc
| limit 10
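
The same parse, sort, and limit pipeline can be reproduced locally, which is handy for testing a log format before relying on the query. The sketch below assumes log lines contain `duration=<number> ms`, matching the parse pattern in the query; the sample format is hypothetical. It converts durations to numbers before sorting so that 950 ranks above 1200 only when it really is larger.

```python
import re

DURATION_RE = re.compile(r"duration=(\d+(?:\.\d+)?) ms")


def slowest_requests(lines: list[str], limit: int = 10) -> list[tuple[float, str]]:
    """Return (duration_ms, line) pairs for the slowest requests, slowest first."""
    parsed = []
    for line in lines:
        match = DURATION_RE.search(line)
        if match:  # skip lines without a duration, like the filter step
            parsed.append((float(match.group(1)), line))
    return sorted(parsed, key=lambda pair: pair[0], reverse=True)[:limit]
```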

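12. Cost Optimization: Cutting 40-60% from the Bill

Cost optimization is not a one-off task; it is an ongoing operational practice. AWS provides a range of tools and strategies to help you reduce spending substantially without sacrificing performance. The strategies below are proven in practice.

Compute cost optimization

| Strategy | Potential savings | Implementation complexity |
|---|---|---|
| Right-sizing | 10-30% | |
| Savings Plans | Up to 72% | |
| Spot Instances | Up to 90% | |
| Graviton (ARM) | 20-40% | Low to medium |
| Auto scaling | 20-50% | |
| Migrating to Lambda/Fargate | 30-70% | |

Storage and data transfer optimization

Storage optimization strategies:
1) Use S3 Intelligent-Tiering to move data with unpredictable access patterns between tiers automatically.
2) Enable S3 lifecycle policies to move old data to Glacier.
3) Replace gp2 EBS volumes with gp3 (equivalent performance at roughly 20% lower cost).
4) Regularly clean up unattached EBS volumes and expired snapshots.
5) Use CloudFront to reduce data transfer charges; serving from the edge is considerably cheaper than serving directly from S3 or EC2.

AWS cost management tools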

# Set up a monthly budget alert with AWS CLI
aws budgets create-budget \
  --account-id 123456789012 \
  --budget '{
    "BudgetName": "MonthlySpend",
    "BudgetLimit": {
      "Amount": "1000",
      "Unit": "USD"
    },
    "BudgetType": "COST",
    "TimeUnit": "MONTHLY"
  }' \
  --notifications-with-subscribers '[
    {
      "Notification": {
        "NotificationType": "ACTUAL",
        "ComparisonOperator": "GREATER_THAN",
        "Threshold": 80,
        "ThresholdType": "PERCENTAGE"
      },
      "Subscribers": [
        {
          "SubscriptionType": "EMAIL",
          "Address": "ops-team@example.com"
        }
      ]
    }
  ]'
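
The notification above triggers when ACTUAL spend is GREATER_THAN 80 percent of the budget limit. The sketch below restates that condition so the threshold semantics are unambiguous; the function name is hypothetical and the logic is a paraphrase of the configuration, not AWS Budgets' internal implementation.

```python
def budget_alert(actual_spend: float, budget_limit: float,
                 threshold_pct: float = 80.0) -> bool:
    """True when actual spend is strictly greater than the percentage threshold."""
    if budget_limit <= 0:
        raise ValueError("budget_limit must be positive")
    # Compare without dividing to avoid floating-point surprises at the boundary
    return actual_spend * 100 > threshold_pct * budget_limit
```

With the $1,000 monthly budget from the command, spending exactly $800 does not alert, while $800.01 does.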

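Cost optimization checklist

  • Use AWS Cost Explorer to analyze spending trends (by service, account, and tag)
  • Enable AWS Compute Optimizer for right-sizing recommendations
  • Run the cost optimization checks in Trusted Advisor
  • Apply cost allocation tags to every resource (team, environment, project)
  • Set AWS Budgets alerts (80% / 100% / forecasted overspend)
  • Review Savings Plans and RI coverage reports every month
  • Clean up unused Elastic IPs, EBS volumes, old AMIs, and idle load balancers
  • Weigh multi-Region against single-Region; data transfer charges are a hidden killer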

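13. AWS Well-Architected Reference Architecture

The AWS Well-Architected Framework defines six pillars: operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability (added in 2021). Below is a typical three-tier web application architecture that combines the core services covered in this guide.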

# Three-tier architecture overview
#
# Internet
#   |-- Route 53 (DNS, latency-based routing)
#   |-- CloudFront (CDN, static assets + API caching)
#   |-- WAF (rate limiting, SQL injection protection)
#   |
# VPC (10.0.0.0/16)
#   |
#   |-- Public Subnets (3 AZs)
#   |     |-- ALB (Application Load Balancer)
#   |     |-- NAT Gateways
#   |
#   |-- Private Subnets (3 AZs)
#   |     |-- ECS Fargate / EKS (app containers)
#   |     |-- Lambda (async processing)
#   |     |-- VPC endpoints for SQS, SNS, DynamoDB
#   |
#   |-- Database Subnets (3 AZs)
#         |-- Aurora PostgreSQL (Multi-AZ)
#         |-- ElastiCache Redis (cluster mode)
#
# Regional services (outside the VPC, reached via VPC endpoints):
#   SQS queues + SNS topics
#   DynamoDB (session store)
#
# Observability:
#   CloudWatch Metrics + Logs + Alarms
#   X-Ray distributed tracing
#   CloudTrail API audit logging
#
# Security:
#   IAM roles (least privilege)
#   KMS encryption at rest, TLS (ACM certificates) in transit
#   Secrets Manager (DB creds, API keys)
#   GuardDuty (threat detection)

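Quick reference for the five original pillars

| Pillar | Key practices | Core services |
|---|---|---|
| Operational excellence | IaC, CI/CD, runbooks | CloudFormation, CodePipeline, Systems Manager |
| Security | Least privilege, encryption, auditing | IAM, KMS, GuardDuty, CloudTrail |
| Reliability | Multi-AZ, automatic recovery, backups | Auto Scaling, Route 53, S3 Cross-Region Replication |
| Performance efficiency | Right instance types, caching, CDN | CloudFront, ElastiCache, Graviton |
| Cost optimization | Reserved/Spot, right-sizing, lifecycle policies | Savings Plans, Compute Optimizer, S3 Intelligent-Tiering |

Summary

The AWS ecosystem is vast, but it follows a clear logic. From compute (EC2/Lambda/ECS) to storage (S3), databases (RDS/DynamoDB), networking (VPC/CloudFront/Route 53), security (IAM), messaging (SQS/SNS), and monitoring (CloudWatch), each service has a use case it fits best. The key is to pick the combination of services that matches your workload, follow the pillars of the Well-Architected Framework, and keep optimizing cost. Start small, scale on demand, and lean on managed services: that is the core philosophy of architecting on AWS.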

Frequently Asked Questions

What is the difference between EC2 On-Demand, Reserved, and Spot Instances?

On-Demand Instances are billed per second with no commitment, which is ideal for unpredictable workloads. Reserved Instances offer a discount of up to 72% in exchange for a 1- or 3-year commitment and are best for steady-state usage. Spot Instances offer a discount of up to 90% by using spare EC2 capacity, but they can be interrupted with a 2-minute notice, so they fit fault-tolerant batch jobs, CI/CD, and data processing.

When should I use RDS vs DynamoDB?

Use RDS when you need relational data with complex joins, transactions, and SQL support (MySQL, PostgreSQL, Oracle, SQL Server). Use DynamoDB for key-value or document workloads requiring single-digit millisecond latency at any scale, such as gaming leaderboards, session stores, IoT data, and e-commerce carts. DynamoDB is fully managed and serverless, while RDS requires instance sizing and maintenance windows.

How do I choose between ECS and EKS for container orchestration?

Choose ECS if you want a simpler, AWS-native container orchestration service with deep integration into other AWS services and no additional management overhead. Choose EKS if you need Kubernetes compatibility, want to run the same workloads across AWS and other clouds, or already have Kubernetes expertise. Both support Fargate for serverless containers, eliminating the need to manage underlying EC2 instances.

What are the best practices for AWS IAM security?

Follow the principle of least privilege — grant only permissions needed. Enable MFA on all accounts, especially root. Use IAM roles instead of long-lived access keys. Implement Service Control Policies in AWS Organizations. Rotate credentials regularly. Use IAM Access Analyzer to identify unused permissions. Never embed credentials in code; use IAM roles for EC2/Lambda or AWS Secrets Manager.

How does Amazon VPC work and what are the key components?

A VPC is an isolated virtual network within AWS. Key components include subnets (public and private), route tables, Internet Gateway (for public internet access), NAT Gateway (for private subnet outbound access), security groups (stateful instance-level firewall), and NACLs (stateless subnet-level firewall). Best practice is to use multiple Availability Zones with public subnets for load balancers and private subnets for application servers and databases.

What is the difference between SQS and SNS?

SQS (Simple Queue Service) is a message queue for decoupling producers and consumers: consumers pull messages, and each message is handled by one consumer (at least once on standard queues, so handlers should be idempotent). SNS (Simple Notification Service) is a pub/sub system that pushes messages to multiple subscribers simultaneously (Lambda, SQS, HTTP, email, SMS). Use SQS for task queues and work distribution, and SNS for fan-out notifications. They are often used together: SNS publishes to multiple SQS queues for parallel processing.

How can I reduce my AWS bill by 40-60%?

Key strategies: 1) Use Reserved Instances or Savings Plans for steady workloads (up to 72% savings). 2) Leverage Spot Instances for fault-tolerant workloads (up to 90% savings). 3) Right-size instances using AWS Compute Optimizer. 4) Use S3 Intelligent-Tiering for automatic storage cost optimization. 5) Enable auto-scaling to match capacity to demand. 6) Delete unused EBS volumes, snapshots, and Elastic IPs. 7) Use AWS Cost Explorer and set billing alerts.

How do CloudFront and Route 53 work together for global applications?

Route 53 provides DNS resolution with health checks and routing policies (latency-based, geolocation, failover, weighted). CloudFront is a CDN that caches content at 400+ edge locations worldwide, which can reduce latency for cached content to single-digit milliseconds for nearby users. Typically a Route 53 alias record points your domain at the CloudFront distribution, and CloudFront then directs each user to a nearby edge location, which serves cached content or fetches it from the origin. This combination can deliver sub-100 ms response times globally, with Route 53 health checks or CloudFront origin groups providing automatic failover.
