TL;DR
Terraform turns cloud infrastructure into versioned, reviewable code using HCL. Write resources, organize them into modules, store state remotely (S3 + DynamoDB), and run terraform plan to preview changes before terraform apply commits them. Use workspaces or separate backends for environment promotion, scan with tfsec/checkov in CI, and format YAML configs with the online YAML formatter while authoring Terraform modules.
1. Terraform Core Concepts: Providers, Resources, State, and More
Terraform is a declarative Infrastructure as Code (IaC) tool: you describe the desired end state of your infrastructure, and Terraform figures out how to get there. Understanding its core building blocks is essential before writing any configuration.
Providers
A provider is a plugin that talks to a specific API: AWS, GCP, Azure, GitHub, Datadog, and so on. Each provider must be declared and configured before its resources can be used.
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
required_version = ">= 1.5"
}
provider "aws" {
region = var.aws_region
# Credentials from env: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
# Or from ~/.aws/credentials, IAM role, OIDC
}

Resources, Data Sources, and Outputs
Resources create or manage real infrastructure. Data sources read existing infrastructure without creating anything. Outputs export values for use by other modules or to display after apply.
# Resource: creates an S3 bucket
resource "aws_s3_bucket" "app_assets" {
bucket = "my-app-assets-${var.environment}"
tags = local.common_tags
}
# Data source: reads an existing VPC
data "aws_vpc" "main" {
filter {
name = "tag:Name"
values = ["main-vpc"]
}
}
# Output: exports the bucket ARN
output "bucket_arn" {
value = aws_s3_bucket.app_assets.arn
description = "ARN of the app assets S3 bucket"
}
# Variables
variable "environment" {
type = string
description = "Deployment environment (dev/staging/prod)"
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
# Locals: computed values used internally
locals {
common_tags = {
Environment = var.environment
ManagedBy = "terraform"
Project = "my-app"
}
}

The key mental model: Terraform builds a dependency graph from references between resources. When aws_s3_bucket references var.environment, Terraform knows to resolve the variable before creating the bucket. Circular dependencies cause plan errors.
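As a hedged illustration (the resource names here are invented for the example, not taken from the snippets above), the graph can be steered both implicitly through references and explicitly with depends_on:

```hcl
# Implicit dependency: the policy references the bucket's id,
# so Terraform creates the bucket first.
resource "aws_s3_bucket" "logs" {
  bucket = "my-app-logs-${var.environment}"
}

resource "aws_s3_bucket_policy" "logs" {
  bucket = aws_s3_bucket.logs.id
  policy = data.aws_iam_policy_document.logs.json # assumed defined elsewhere
}

# Explicit dependency: depends_on forces an ordering that
# Terraform cannot infer from any reference.
resource "aws_instance" "worker" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
  depends_on    = [aws_s3_bucket_policy.logs]
}
```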
2. HCL Syntax: Blocks, Expressions, For Loops, and Conditionals
HashiCorp Configuration Language (HCL) is a domain-specific language designed to be readable by both humans and machines. It supports rich expression syntax including string interpolation, for expressions, and conditional logic.
# String interpolation
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = "t3.${var.instance_size}" # e.g. t3.micro
tags = {
Name = "web-${var.environment}-${count.index}"
}
}
# For expression: transform a list to a map
variable "subnet_cidrs" {
default = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
}
locals {
subnet_map = {
for idx, cidr in var.subnet_cidrs :
"subnet-${idx}" => cidr
}
# Result: { "subnet-0" = "10.0.1.0/24", "subnet-1" = "10.0.2.0/24", ... }
# Filter with if clause
public_subnets = [
for s in var.subnets : s.cidr if s.public == true
]
}
# Conditional (ternary) expression
resource "aws_instance" "web" {
instance_type = var.environment == "prod" ? "t3.large" : "t3.micro"
monitoring = var.environment == "prod" # boolean expression; no ternary needed
}
# Dynamic blocks: generate repeated nested blocks
resource "aws_security_group" "web" {
name = "web-sg"
dynamic "ingress" {
for_each = var.allowed_ports
content {
from_port = ingress.value
to_port = ingress.value
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
}

The dynamic block is one of the most powerful HCL features: it generates repeated nested configuration blocks from a list or map. This avoids copy-pasting ingress/egress rules, lifecycle hooks, or any other repeated nested blocks.
3. Modules: Reusable Infrastructure Components
Modules are Terraform's reusability mechanism. Every Terraform configuration is technically a module (the root module). Child modules are called with module blocks and can come from the Terraform Registry, GitHub, or local directories.
# Using a registry module (Terraform Registry)
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.0"
name = "main-vpc"
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
enable_nat_gateway = true
single_nat_gateway = var.environment != "prod"
}
# Accessing module outputs
resource "aws_instance" "app" {
subnet_id = module.vpc.private_subnets[0]
vpc_security_group_ids = [aws_security_group.app.id]
}
# Local module
module "api_gateway" {
source = "./modules/api-gateway"
environment = var.environment
lambda_arn = module.lambda.function_arn
domain_name = var.api_domain
}
# Module with for_each: multiple instances
module "microservice" {
source = "./modules/ecs-service"
for_each = var.services # map of service configs
name = each.key
image = each.value.image
cpu = each.value.cpu
memory = each.value.memory
environment = var.environment
}

Module Structure Best Practice
modules/
ecs-service/
main.tf # core resources
variables.tf # input variables with descriptions + validation
outputs.tf # exported values
versions.tf # required providers + version constraints
README.md # auto-generated by terraform-docs
environments/
dev/
main.tf # calls modules with dev settings
terraform.tfvars # dev-specific values
backend.tf # dev state bucket
prod/
main.tf
terraform.tfvars
backend.tf

4. State Management: Remote State, Locking, and Import
Terraform's state file is the source of truth mapping your HCL configuration to real cloud resources. For any team environment, local state is insufficient â use a remote backend with locking.
S3 Backend with DynamoDB Locking
# backend.tf
terraform {
backend "s3" {
bucket = "my-company-terraform-state"
key = "prod/networking/terraform.tfstate"
region = "us-east-1"
encrypt = true # SSE-S3 encryption
dynamodb_table = "terraform-state-lock" # prevents concurrent applies
}
}
# Create the DynamoDB table for locking (one-time setup)
resource "aws_dynamodb_table" "terraform_locks" {
name = "terraform-state-lock"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
}

Importing Existing Resources
# Import an existing S3 bucket into state
terraform import aws_s3_bucket.legacy my-existing-bucket-name
# Terraform 1.5+ â import block in HCL (no CLI command needed)
import {
to = aws_s3_bucket.legacy
id = "my-existing-bucket-name"
}
# Generate config automatically (Terraform 1.5+)
terraform plan -generate-config-out=generated.tf
# State management commands
terraform state list # list all resources in state
terraform state show aws_instance.web # show resource details
terraform state mv aws_instance.old aws_instance.new # rename without destroy
terraform state rm aws_s3_bucket.orphan # remove from state (doesn't delete resource)
# Refresh state to match real world (use cautiously)
terraform apply -refresh-only

5. AWS Provider: EC2, VPC, S3, Security Groups, and IAM
The AWS provider is the most widely used Terraform provider. Here are the most common resource patterns for building a production application infrastructure.
# VPC + Subnets
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags = { Name = "main" }
}
resource "aws_subnet" "private" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
availability_zone = var.availability_zones[count.index]
}
# Security Group
resource "aws_security_group" "app" {
name = "app-${var.environment}"
vpc_id = aws_vpc.main.id
ingress {
from_port = 8080
to_port = 8080
protocol = "tcp"
security_groups = [aws_security_group.alb.id]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
# IAM Role for EC2 with instance profile
resource "aws_iam_role" "ec2_role" {
name = "ec2-app-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Principal = { Service = "ec2.amazonaws.com" }
Action = "sts:AssumeRole"
}]
})
}
resource "aws_iam_role_policy_attachment" "ssm" {
role = aws_iam_role.ec2_role.name
policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
resource "aws_iam_instance_profile" "app" {
name = "app-instance-profile"
role = aws_iam_role.ec2_role.name
}
# EC2 Instance
resource "aws_instance" "app" {
ami = data.aws_ami.ubuntu.id
instance_type = var.instance_type
subnet_id = aws_subnet.private[0].id
vpc_security_group_ids = [aws_security_group.app.id]
iam_instance_profile = aws_iam_instance_profile.app.name
user_data = base64encode(templatefile("userdata.sh.tpl", {
environment = var.environment
app_version = var.app_version
}))
root_block_device {
volume_size = 20
volume_type = "gp3"
encrypted = true
delete_on_termination = true
}
tags = merge(local.common_tags, { Name = "app-${var.environment}" })
}

6. Variables and Workspaces: Types, Validation, tfvars, Environment Promotion
Terraform variables parameterize your configurations. Combined with workspaces or separate backends, they enable environment-specific deployments from a single code base.
# variables.tf: typed variables with validation
variable "instance_type" {
type = string
description = "EC2 instance type"
default = "t3.micro"
validation {
condition = can(regex("^t[23]\\.", var.instance_type)) # note: backslash must be doubled in HCL strings
error_message = "Only t2 or t3 instance types are allowed."
}
}
variable "allowed_ips" {
type = list(string)
description = "CIDR blocks allowed SSH access"
sensitive = false
}
variable "db_password" {
type = string
sensitive = true # redacted in plan output and logs
}
variable "tags" {
type = map(string)
default = {}
}
variable "feature_flags" {
type = object({
enable_cdn = bool
enable_waf = bool
replica_count = number
})
default = {
enable_cdn = false
enable_waf = false
replica_count = 1
}
}
# terraform.tfvars (do NOT commit if it has secrets)
instance_type = "t3.small"
allowed_ips = ["10.0.0.0/8", "172.16.0.0/12"]
# Environment-specific: dev.tfvars, prod.tfvars
# terraform apply -var-file=prod.tfvars
# Workspace commands
terraform workspace list # show all workspaces
terraform workspace new staging # create staging workspace
terraform workspace select prod # switch to prod
terraform workspace show # current workspace name
# Use workspace in config
resource "aws_instance" "web" {
instance_type = terraform.workspace == "prod" ? "t3.large" : "t3.micro"
}

7. Provisioners and null_resource: Local-exec, Remote-exec, Destroy-time
Provisioners are a last resort in Terraform: use them only when there is no native resource or provider to accomplish the task. HashiCorp recommends Packer for image building and cloud-init for instance bootstrapping instead.
# local-exec: runs on the machine running terraform
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = "t3.micro"
provisioner "local-exec" {
command = "echo ${self.public_ip} >> inventory.txt"
}
# Destroy-time provisioner
provisioner "local-exec" {
when = destroy
command = "echo 'Destroying ${self.id}' >> destroy.log"
}
}
# remote-exec: runs on the remote instance over SSH
resource "aws_instance" "app" {
# (ami, instance_type, key_name, etc. omitted for brevity)
provisioner "remote-exec" {
connection {
type = "ssh"
user = "ubuntu"
private_key = file("~/.ssh/id_rsa")
host = self.public_ip
}
inline = [
"sudo apt-get update -y",
"sudo apt-get install -y nginx",
"sudo systemctl enable nginx",
]
}
}
# null_resource: run provisioners without a real resource
# Useful for running scripts when certain inputs change
resource "null_resource" "db_migration" {
triggers = {
schema_hash = filemd5("migrations/schema.sql")
}
provisioner "local-exec" {
command = "psql ${var.db_url} -f migrations/schema.sql"
}
}
# terraform_data (Terraform 1.4+ replacement for null_resource)
resource "terraform_data" "bootstrap" {
triggers_replace = [aws_instance.web.id]
provisioner "local-exec" {
command = "ansible-playbook -i '${self.input},' playbook.yml" # trailing comma = Ansible inline inventory
}
input = aws_instance.web.public_ip
}

8. Terraform Cloud and CI/CD: GitHub Actions, OIDC Auth, Atlantis
Automating Terraform in CI/CD ensures all infrastructure changes go through code review, are auditable, and are applied consistently. Here are two common approaches.
GitHub Actions with OIDC (No Long-lived AWS Credentials)
# .github/workflows/terraform.yml
name: Terraform
on:
push:
branches: [main]
pull_request:
branches: [main]
permissions:
id-token: write # Required for OIDC
contents: read
pull-requests: write
jobs:
terraform:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Configure AWS credentials via OIDC
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: arn:aws:iam::123456789:role/GitHubActionsRole
aws-region: us-east-1
- uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.7.0
- name: Terraform Init
run: terraform init
- name: Terraform Validate
run: terraform validate
- name: Terraform Plan
id: plan
run: terraform plan -no-color -out=tfplan
continue-on-error: true
- name: Comment Plan on PR
uses: actions/github-script@v7
if: github.event_name == 'pull_request'
with:
script: |
github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: `## Terraform Plan
\`\`\`
${{ steps.plan.outputs.stdout }}
\`\`\``
})
- name: Terraform Apply
if: github.ref == 'refs/heads/main' && github.event_name == 'push'
run: terraform apply -auto-approve tfplan

Terraform Cloud Workspace
# Configure TFC as backend
terraform {
cloud {
organization = "my-org"
workspaces {
name = "prod-infrastructure"
}
}
}
# Authenticate: terraform login
# Then: terraform init (migrates state to TFC)
# TFC Variables (set in UI or via API):
# - AWS_ACCESS_KEY_ID (env var, sensitive)
# - AWS_SECRET_ACCESS_KEY (env var, sensitive)
# - TF_VAR_db_password (terraform var, sensitive)

9. Testing Terraform: Validate, tfsec, checkov, Terratest
A complete Terraform testing strategy has multiple layers: syntax validation, security scanning, policy enforcement, and integration tests against real infrastructure.
# Layer 1: Built-in validation
terraform fmt -check -recursive # formatting check
terraform validate # syntax + type checking
# Layer 2: Security scanning
# tfsec
brew install tfsec
tfsec . --minimum-severity MEDIUM
# checkov
pip install checkov
checkov -d . --framework terraform
# terrascan
terrascan scan -t aws -d .
# Layer 3: Terratest (Go integration tests)
# tests/vpc_test.go
package test
import (
"testing"
"github.com/gruntwork-io/terratest/modules/terraform"
"github.com/stretchr/testify/assert"
)
func TestVPCModule(t *testing.T) {
t.Parallel()
terraformOptions := &terraform.Options{
TerraformDir: "../modules/vpc",
Vars: map[string]interface{}{
"environment": "test",
"cidr_block": "10.99.0.0/16",
},
}
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
vpcID := terraform.Output(t, terraformOptions, "vpc_id")
assert.NotEmpty(t, vpcID)
}
# Layer 4: terraform test (built-in, Terraform 1.6+)
# tests/main.tftest.hcl
run "creates_vpc" {
command = plan
assert {
condition = aws_vpc.main.enable_dns_hostnames == true
error_message = "DNS hostnames must be enabled"
}
}

10. Best Practices: Directory Structure, Tagging, IAM, Drift Detection
Production Terraform setups follow patterns that maximize reusability, security, and maintainability. Here are the most impactful best practices from real-world usage.
Recommended Directory Structure
infrastructure/
├── modules/              # reusable modules (versioned separately)
│   ├── vpc/
│   ├── ecs-cluster/
│   ├── rds-postgres/
│   └── lambda-function/
├── environments/
│   ├── dev/
│   │   ├── backend.tf    # dev state config
│   │   ├── main.tf       # module calls
│   │   ├── variables.tf
│   │   └── dev.tfvars
│   ├── staging/
│   └── prod/
└── global/               # account-level resources (IAM, Route53)
    ├── iam/
    └── dns/

Tagging Strategy and Drift Detection
# Enforce consistent tagging via a local
locals {
required_tags = {
Environment = var.environment
Team = var.team
CostCenter = var.cost_center
ManagedBy = "terraform"
Repository = "github.com/my-org/infrastructure"
CreatedAt = timestamp() # use carefully: causes drift on every plan
}
}
# Tag all resources by merging
resource "aws_instance" "web" {
tags = merge(local.required_tags, {
Name = "web-${var.environment}"
Role = "application-server"
})
}
# Drift detection: run periodically in CI
# Detects manual changes made outside Terraform
terraform plan -detailed-exitcode
# Exit code 0: no changes
# Exit code 1: error
# Exit code 2: changes detected (drift)
# Schedule in GitHub Actions (cron)
on:
schedule:
- cron: '0 6 * * *' # daily at 6am
workflow_dispatch: # manual trigger
# Module versioning: pin to specific versions
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "= 5.1.2" # exact pin for prod; "~> 5.0" for dev
}| Practice | Why |
|---|---|
| Remote state with locking | Prevents concurrent applies corrupting state |
| Pin module and provider versions | Reproducible builds; prevents surprise upgrades |
Mark secrets as sensitive = true | Redacts values from plan output and logs |
| Use OIDC instead of IAM access keys in CI | No long-lived credentials to rotate or leak |
| Scan with tfsec/checkov in every PR | Catches open security groups, unencrypted volumes early |
| Daily drift detection in CI | Alerts when manual console changes diverge from IaC |
Quick Tool
Authoring Terraform modules often involves YAML configuration files (Kubernetes manifests, GitHub Actions workflows, Helm values). Use the online YAML formatter to validate and pretty-print YAML without leaving your browser.
Key Takeaways
- Declarative model: describe desired state; Terraform computes the diff against state and plans changes.
- Remote state is mandatory for teams: use S3 + DynamoDB or Terraform Cloud to avoid state conflicts.
- Prefer `for_each` over `count` for resources with stable identifiers to avoid destructive reindexing.
- Modules + Registry: use community modules (`terraform-aws-modules`) as a starting point; pin versions for stability.
- Never store secrets in git: use environment variables, AWS Secrets Manager, or Terraform Cloud sensitive variables.
- Scan every PR: tfsec and checkov catch security misconfigurations before they reach production.
- OIDC in CI/CD: eliminates long-lived AWS access keys; GitHub Actions and GitLab CI both support AWS OIDC natively.
- Drift detection: run `terraform plan -detailed-exitcode` daily to catch out-of-band changes.
Frequently Asked Questions
What is Terraform and how does it differ from CloudFormation?
Terraform is an IaC tool by HashiCorp (open source until its 2023 move to the Business Source License) that uses HCL and supports thousands of providers across all major clouds. CloudFormation is AWS-only and uses JSON/YAML. Terraform's plan/apply workflow, multi-cloud support, and richer module ecosystem make it the most popular choice for multi-cloud or cloud-agnostic infrastructure.
What is Terraform state and why is it important?
terraform.tfstate maps your HCL to real-world resource IDs. Without it, Terraform cannot determine what exists and would attempt to recreate everything. Always use remote state with locking for team environments. Never commit state files to git; they may contain sensitive values.
What is the difference between count and for_each?
count creates indexed resources (0, 1, 2, ...): removing the middle element shifts indices and triggers destroy/recreate. for_each keys resources by a stable map key, so removing one key only affects that resource. Prefer for_each in almost all cases.
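A minimal sketch of the difference (bucket names are illustrative):

```hcl
variable "buckets" {
  type    = set(string)
  default = ["assets", "logs", "backups"]
}

# Keyed by name: removing "logs" destroys only aws_s3_bucket.this["logs"]
resource "aws_s3_bucket" "this" {
  for_each = var.buckets
  bucket   = "my-app-${each.key}"
}

# With count over a list, deleting the middle element re-indexes
# everything after it, forcing destroy/recreate of those buckets.
```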
How do I manage Terraform secrets securely?
Use TF_VAR_name environment variables, AWS Secrets Manager data sources, HashiCorp Vault, or Terraform Cloud sensitive variables. Mark variable declarations as sensitive = true to redact them from plan output. Never put secrets in .tf files or commit *.tfvars with secret values.
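For example, a hedged sketch that reads a database password from AWS Secrets Manager (the secret name is illustrative). Note that values read this way are still written to the state file, which is another reason to encrypt remote state:

```hcl
data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/db-password" # assumed to already exist
}

resource "aws_db_instance" "main" {
  # engine, instance_class, etc. omitted for brevity
  password = data.aws_secretsmanager_secret_version.db.secret_string
}
```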
What is the difference between terraform import and terraform state mv?
terraform import brings an existing cloud resource under Terraform management. You still need to write matching HCL (or use -generate-config-out in TF 1.5+). terraform state mv renames a resource address in state without touching real infrastructure, which is useful when refactoring module structure.
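Since Terraform 1.1 there is also a declarative alternative to terraform state mv: a moved block records the rename in code, so it goes through review and every collaborator's state is updated on their next apply:

```hcl
# Equivalent to: terraform state mv aws_instance.old aws_instance.new
moved {
  from = aws_instance.old
  to   = aws_instance.new
}
```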
How do Terraform workspaces work?
With the local backend, workspaces store separate state files in terraform.tfstate.d/<workspace>/; remote backends such as S3 store them under a workspace prefix instead. Either way, you get dev/staging/prod from one configuration directory, and the current workspace is available via terraform.workspace. For strong isolation (separate AWS accounts), use separate backend configurations instead of workspaces.
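With the S3 backend, for instance, non-default workspace state lands under a configurable prefix (a sketch; bucket and key are illustrative):

```hcl
terraform {
  backend "s3" {
    bucket               = "my-company-terraform-state"
    key                  = "app/terraform.tfstate"
    region               = "us-east-1"
    workspace_key_prefix = "env:" # default; "staging" state is stored at env:/staging/app/terraform.tfstate
  }
}
```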
What Terraform security scanning tools should I use?
tfsec for fast static analysis, checkov for broad policy-as-code coverage (1000+ checks), terrascan for OPA-based policies, and Sentinel (Terraform Cloud) for enterprise policy enforcement. Integrate all into CI so every PR is scanned before plan runs.
What is Atlantis and when should I use it?
Atlantis is an open-source Terraform PR automation server. It auto-runs terraform plan on PRs and posts results as comments, then applies on merge after approval. It's ideal for self-hosted teams who want GitOps Terraform workflows without a Terraform Cloud subscription. Pair it with tfsec/checkov for security scanning.
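A hedged sketch of a repo-level atlantis.yaml (the project name and directory are illustrative):

```yaml
version: 3
projects:
  - name: prod-networking
    dir: environments/prod
    autoplan:
      when_modified: ["*.tf", "*.tfvars"]
    apply_requirements: [approved, mergeable]
```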