The Complete Linux Guide for Developers: Filesystem, Permissions, Networking, Shell Scripting, and More
Master the full range of Linux development skills: the filesystem hierarchy, permissions and ownership, process management, networking commands, systemd services, cron jobs, shell scripting, package management, SSH, iptables/firewalls, disk management, and system monitoring.
TL;DR
Linux is a foundational skill for every developer. Mastering the filesystem hierarchy (/etc, /var, /home), the permission model (chmod/chown), process management (ps/top/kill), systemd services, cron jobs, shell scripting, SSH hardening, iptables firewalls, disk management (df/du/LVM), and system monitoring tools will make you effective at server operations and DevOps.
Key Takeaways
- The FHS standard defines the purpose of every Linux directory: /bin, /etc, /var, /home, and so on
- Permissions combine an owner/group/others triplet with rwx bits; setuid/setgid/sticky add special controls
- Manage systemd services with systemctl and read their logs with journalctl
- cron's five-field syntax schedules jobs: minute hour day-of-month month day-of-week
- In shell scripts, always double-quote variables and use set -euo pipefail for robustness
- For SSH, use ed25519 keys, disable password authentication, and forbid root login
- iptables filters packets through chains (INPUT/OUTPUT/FORWARD); ufw is a friendlier frontend
- Combine top/htop, vmstat, iotop, and nethogs for comprehensive system monitoring
1. The Linux Filesystem Hierarchy
Linux follows the Filesystem Hierarchy Standard (FHS). Understanding the purpose of each top-level directory is the foundation of effective system administration.
| Directory | Purpose |
|---|---|
| / | Root directory, the starting point of all paths |
| /bin | Essential user commands (ls, cp, cat) |
| /sbin | System administration commands (fdisk, iptables, reboot) |
| /etc | System configuration files |
| /home | User home directories |
| /var | Variable data (logs, caches, mail) |
| /tmp | Temporary files (may be cleared on reboot) |
| /usr | User programs and libraries |
| /opt | Optional third-party software |
| /proc | Virtual filesystem exposing process and kernel information |
| /dev | Device file nodes |
| /mnt, /media | Mount points (temporary mounts and removable media) |
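A quick way to confirm this layout on your own machine (a minimal sketch; output varies by distribution, and on merged-/usr distros /bin and /sbin are symlinks into /usr):

```shell
# List the top-level directories and their modes/owners
ls -ld /bin /etc /home /opt /tmp /usr /var

# On merged-/usr systems, /bin resolves into /usr/bin
readlink -f /bin

# The full hierarchy is documented in the hier(7) man page:
# man hier
```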
Common File Operation Commands
# Navigate and list
cd /var/log # change directory
ls -lah # list all files with sizes (human-readable)
pwd # print working directory
# Create, copy, move, remove
mkdir -p /opt/myapp/config # create nested directories
cp -r src/ dest/ # recursive copy
mv oldname newname # move / rename
rm -rf /tmp/build-* # force recursive remove (use with caution!)
# Search for files
find / -name "*.conf" -type f # find all .conf files
find /var/log -mtime -1 -name "*.log" # logs modified in last 24h
locate nginx.conf # fast search (needs updatedb)
# View file contents
cat /etc/hostname # print entire file
head -n 20 /var/log/syslog # first 20 lines
tail -f /var/log/syslog # follow new lines in real time
less /etc/passwd # paginated viewer
# Disk usage of a directory
du -sh /var/log/* # summarize each sub-item
du -d 1 -h /home # depth-1 summary
2. File Permissions and Ownership
The Linux permission model is based on three entities (owner, group, others), each of which can hold read (r), write (w), and execute (x) permissions.
# Understanding ls -l output
# -rwxr-xr-- 1 deploy www-data 4096 Feb 10 08:30 app.sh
# │└┬┘└┬┘└┬┘ owner group size date name
# │ │ │ └── others: r-- (4) = read only
# │ │ └────── group: r-x (5) = read + execute
# │ └────────── owner: rwx (7) = read + write + execute
# └──────────── file type: - = regular, d = directory, l = symlink
# Octal notation
chmod 755 deploy.sh # owner=rwx, group=r-x, others=r-x
chmod 644 config.yml # owner=rw-, group=r--, others=r--
chmod 700 ~/.ssh # owner=rwx, no access for others
# Symbolic notation
chmod u+x script.sh # add execute for owner
chmod g-w secret.key # remove write from group
chmod o= shared/ # remove all permissions for others
chmod a+r readme.txt # add read for all (a = all)
# Ownership
chown deploy:www-data /var/www/html # change owner and group
chown -R deploy: /opt/myapp # recursive; trailing ":" sets group to deploy's login group
chgrp docker /var/run/docker.sock # change group only
# Special permissions
chmod 4755 /usr/bin/passwd # setuid — runs as file owner
chmod 2755 /shared/project # setgid — new files inherit group
chmod 1777 /tmp # sticky bit — only owner can delete
# Default permissions for new files
umask 022 # new files: 644, new dirs: 755
umask 077 # new files: 600, new dirs: 700 (private)
Extended Permissions with ACLs
# When standard owner/group/others is not enough
# Install: apt install acl
# Grant user "ci" read+execute on /opt/deploy
setfacl -m u:ci:rx /opt/deploy
# Grant group "devs" full access recursively
setfacl -R -m g:devs:rwx /srv/project
# View ACLs
getfacl /opt/deploy
# Remove specific ACL entry
setfacl -x u:ci /opt/deploy
3. Process Management
Every running program in Linux is a process, identified by a PID (process ID). Process management covers inspecting, controlling, and terminating processes.
# List all processes
ps aux # all processes, full format
ps aux | grep nginx # filter by name
ps -ef --forest # tree view of process hierarchy
# Real-time monitoring
top # classic process monitor
htop # interactive, colorized (install: apt install htop)
# Signals
kill -15 1234 # SIGTERM — graceful shutdown (default)
kill -9 1234 # SIGKILL — force kill (cannot be caught)
kill -1 1234 # SIGHUP — reload configuration
killall nginx # kill all processes by name
pkill -f "node server.js" # kill by command pattern
# Background and foreground jobs
./build.sh & # run in background
jobs # list background jobs
fg %1 # bring job 1 to foreground
bg %1 # resume job 1 in background
Ctrl+Z # suspend current process
# Keep process alive after logout
nohup ./server.sh & # nohup + background
disown %1 # detach job from shell
# Process resource limits
ulimit -a # show all limits
ulimit -n 65536 # set max open files for this session
# Which process uses a port?
lsof -i :8080 # list processes on port 8080
ss -tlnp | grep 8080 # socket statistics (faster)
fuser -k 8080/tcp # kill whatever is on port 8080
Process Priority and nice
# Nice values: -20 (highest priority) to 19 (lowest)
nice -n 10 ./heavy-task.sh # start with lower priority
renice -5 -p 1234 # change running process priority
# Real-time scheduling (use with caution)
chrt -f 99 ./realtime-app # FIFO real-time priority 99
4. Network Management
Linux ships a powerful networking toolkit for configuring interfaces, diagnosing connectivity, transferring files, and debugging network problems.
# Interface configuration (modern: ip, legacy: ifconfig)
ip addr show # list all interfaces and IPs
ip addr add 10.0.0.5/24 dev eth0 # assign IP
ip link set eth0 up # bring interface up
ip route show # show routing table
ip route add default via 10.0.0.1 # add default gateway
# DNS resolution
cat /etc/resolv.conf # current DNS servers
dig google.com # detailed DNS lookup
nslookup google.com # simpler DNS lookup
host google.com # quick A record lookup
# Connectivity testing
ping -c 4 8.8.8.8 # send 4 ICMP packets
traceroute google.com # trace packet route
mtr google.com # real-time traceroute (ping + trace)
# Port scanning and connectivity
nc -zv host 22 # check if port is open
nc -l 9999 # listen on port 9999
telnet host 80 # test TCP connection
# Download and transfer
curl -O https://example.com/file.tar.gz # download file
curl -X POST -H "Content-Type: application/json" \
-d '{"key":"val"}' https://api.example.com/data
wget -r --no-parent https://example.com/docs/ # recursive download
scp file.txt user@host:/remote/path # secure copy
rsync -avz /local/ user@host:/remote/ # incremental sync
# Socket statistics
ss -tlnp # listening TCP sockets with process info
ss -s # summary statistics
netstat -tuln # legacy tool; prefer ss on modern systems
Network Diagnostics Walkthrough
# Full diagnostic workflow for a connectivity issue:
# 1. Check if interface is up
ip link show eth0
# 2. Check IP assignment
ip addr show eth0
# 3. Check gateway reachability
ping -c 2 $(ip route | awk '/default/{print $3}')
# 4. Check DNS resolution
dig +short google.com
# 5. Check specific port on remote host
nc -w 3 -zv api.example.com 443
# 6. Capture packets for deep debugging
tcpdump -i eth0 -n port 443 -c 50 -w capture.pcap
5. systemd Service Management
systemd is the init system and service manager on modern Linux distributions. It starts, stops, and restarts services, handles dependencies, and provides centralized logging through journalctl.
# Service management
systemctl start nginx # start a service
systemctl stop nginx # stop a service
systemctl restart nginx # restart
systemctl reload nginx # reload config without restart
systemctl status nginx # check status and recent logs
systemctl enable nginx # start on boot
systemctl disable nginx # do not start on boot
systemctl is-active nginx # check if running (for scripts)
systemctl list-units --type=service --state=running # all running
# Logs with journalctl
journalctl -u nginx # all logs for nginx
journalctl -u nginx --since today # today only
journalctl -u nginx -f # follow (like tail -f)
journalctl -p err # only errors and above
journalctl --disk-usage # how much disk logs use
journalctl --vacuum-size=500M # shrink logs to 500MB
Creating a Custom systemd Service
# /etc/systemd/system/myapp.service
[Unit]
Description=My Node.js Application
After=network.target
Wants=network-online.target
[Service]
Type=simple
User=deploy
Group=deploy
WorkingDirectory=/opt/myapp
ExecStart=/usr/bin/node /opt/myapp/server.js
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
Environment=NODE_ENV=production
Environment=PORT=3000
# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/opt/myapp/data
[Install]
WantedBy=multi-user.target
# Activate the service
# sudo systemctl daemon-reload
# sudo systemctl enable --now myapp
# sudo systemctl status myapp
systemd Timers (a cron Alternative)
# /etc/systemd/system/backup.timer
[Unit]
Description=Run backup every day at 2 AM
[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true
[Install]
WantedBy=timers.target
# /etc/systemd/system/backup.service
[Unit]
Description=Backup script
[Service]
Type=oneshot
ExecStart=/opt/scripts/backup.sh
# Enable: systemctl enable --now backup.timer
# Check: systemctl list-timers
6. Cron Jobs
cron is the classic Linux job scheduler. Through crontab files, you can run commands and scripts automatically on a precise schedule.
# Crontab syntax:
# ┌───────── minute (0-59)
# │ ┌─────── hour (0-23)
# │ │ ┌───── day of month (1-31)
# │ │ │ ┌─── month (1-12)
# │ │ │ │ ┌─ day of week (0-7, 0 and 7 = Sunday)
# │ │ │ │ │
# * * * * * command
# Edit your crontab
crontab -e
# List current crontab
crontab -l
# Examples
0 2 * * * /opt/scripts/backup.sh # daily at 2:00 AM
*/5 * * * * /opt/scripts/health-check.sh # every 5 minutes
0 0 * * 0 /opt/scripts/weekly-report.sh # Sunday midnight
30 8 1 * * /opt/scripts/monthly-audit.sh # 1st of month, 8:30 AM
0 */6 * * * /opt/scripts/sync.sh # every 6 hours
# Special strings
@reboot /opt/scripts/on-startup.sh # run once at boot
@hourly /opt/scripts/heartbeat.sh # every hour (0 * * * *)
@daily /opt/scripts/cleanup.sh # once per day
@weekly /opt/scripts/digest.sh # once per week
@monthly /opt/scripts/invoice.sh # once per month
# Redirect output for debugging
0 3 * * * /opt/scripts/etl.sh >> /var/log/etl.log 2>&1
# Use a MAILTO to get notified on errors
MAILTO=admin@example.com
0 4 * * * /opt/scripts/critical-job.sh
# System-wide cron directories
# /etc/cron.d/ custom cron files
# /etc/cron.daily/ scripts run daily by anacron
# /etc/cron.hourly/ scripts run hourly
# /etc/cron.weekly/ scripts run weekly
# /etc/cron.monthly/ scripts run monthly
7. Shell Scripting
Shell scripting is the foundation of Linux automation. Bash is the most widely used shell; understanding variables, conditionals, loops, functions, and best practices will greatly boost your productivity.
Script Basics and Best Practices
#!/usr/bin/env bash
# Always start with a robust preamble:
set -euo pipefail
# -e exit on error
# -u treat unset variables as errors
# -o pipefail catch errors in piped commands
# Variables (no spaces around =)
APP_NAME="myapp"
VERSION="1.2.3"
DEPLOY_DIR="/opt/${APP_NAME}"
# Always quote variables to handle spaces/special chars
echo "Deploying ${APP_NAME} v${VERSION} to ${DEPLOY_DIR}"
# Command substitution
CURRENT_DATE="$(date +%Y-%m-%d)"
GIT_HASH="$(git rev-parse --short HEAD)"
FILE_COUNT="$(find /var/log -name '*.log' | wc -l)"
# Arithmetic
COUNT=5
NEXT=$((COUNT + 1))
echo "Next: ${NEXT}"
Conditionals
#!/usr/bin/env bash
set -euo pipefail
# String comparison
if [[ "${ENV}" == "production" ]]; then
    echo "Running in production mode"
elif [[ "${ENV}" == "staging" ]]; then
    echo "Running in staging mode"
else
    echo "Running in development mode"
fi
# File tests
if [[ -f "/etc/nginx/nginx.conf" ]]; then
    echo "Nginx config exists"
fi
if [[ -d "/opt/myapp" ]]; then
    echo "Directory exists"
fi
if [[ ! -x "/usr/bin/docker" ]]; then
    echo "Docker not installed or not executable"
    exit 1
fi
# Numeric comparison
DISK_USAGE=85
if (( DISK_USAGE > 80 )); then
    echo "WARNING: Disk usage at ${DISK_USAGE}%"
fi
# Logical operators
if [[ -f "app.js" ]] && [[ -f "package.json" ]]; then
    echo "Node.js project detected"
fi
Loops and Functions
#!/usr/bin/env bash
set -euo pipefail
# For loop
for server in web-01 web-02 web-03; do
    echo "Deploying to ${server}..."
    ssh "deploy@${server}" "bash /opt/deploy.sh"
done
# C-style for loop
for ((i = 1; i <= 5; i++)); do
    echo "Attempt ${i}"
done
# While loop (read lines from file)
while IFS= read -r line; do
    echo "Processing: ${line}"
done < "servers.txt"
# While loop with counter
RETRIES=0
MAX_RETRIES=5
while (( RETRIES < MAX_RETRIES )); do
    if curl -sf http://localhost:3000/health; then
        echo "Service is up!"
        break
    fi
    RETRIES=$((RETRIES + 1))
    echo "Retry ${RETRIES}/${MAX_RETRIES}..."
    sleep 2
done
# Functions
log() {
    local level="$1"
    local message="$2"
    echo "[$(date +'%Y-%m-%d %H:%M:%S')] [${level}] ${message}"
}
deploy() {
    local target="$1"
    log "INFO" "Starting deployment to ${target}"
    if ! ssh "deploy@${target}" "bash /opt/deploy.sh"; then
        log "ERROR" "Deployment to ${target} failed"
        return 1
    fi
    log "INFO" "Deployment to ${target} succeeded"
}
# Use the function
deploy "web-01" || log "WARN" "Continuing despite failure"
Practical Script Patterns
#!/usr/bin/env bash
set -euo pipefail
# Trap for cleanup on exit
TMPDIR="$(mktemp -d)"
trap 'rm -rf "${TMPDIR}"' EXIT
# Argument parsing
usage() {
    echo "Usage: $0 [-e environment] [-v version] [-h]"
    exit 1
}
ENV="production"
VER="latest"
while getopts "e:v:h" opt; do
    case "${opt}" in
        e) ENV="${OPTARG}" ;;
        v) VER="${OPTARG}" ;;
        h) usage ;;
        *) usage ;;
    esac
done
echo "Deploying version ${VER} to ${ENV}"
# Here document (heredoc)
cat > "${TMPDIR}/config.yml" <<EOF
environment: ${ENV}
version: ${VER}
timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)
EOF
# Arrays
SERVERS=("web-01" "web-02" "web-03")
echo "Server count: ${#SERVERS[@]}"
for s in "${SERVERS[@]}"; do
    echo " -> ${s}"
done
# String manipulation
FILENAME="backup-2026-02-27.tar.gz"
echo "${FILENAME%.tar.gz}" # remove suffix: backup-2026-02-27
echo "${FILENAME##*-}" # remove prefix: 27.tar.gz
echo "${FILENAME/backup/archive}" # replace: archive-2026-02-27.tar.gz
8. Package Management
Linux distributions use different package management systems: Debian/Ubuntu use APT, RHEL/CentOS/Fedora use DNF/YUM, and Arch uses pacman.
| Operation | APT (Debian/Ubuntu) | DNF (RHEL/Fedora) | pacman (Arch) |
|---|---|---|---|
| Update package index | apt update | dnf check-update | pacman -Sy |
| Upgrade all packages | apt upgrade | dnf upgrade | pacman -Syu |
| Install a package | apt install nginx | dnf install nginx | pacman -S nginx |
| Remove a package | apt remove nginx | dnf remove nginx | pacman -R nginx |
| Search for a package | apt search keyword | dnf search keyword | pacman -Ss keyword |
| Show package info | apt show nginx | dnf info nginx | pacman -Si nginx |
| Clean up | apt autoremove | dnf autoremove | pacman -Sc |
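The table maps one-to-one onto a small distro-detection helper. The sketch below is our own illustration (the `pkg_install_cmd` function is not a standard tool); it reads the `ID` field from /etc/os-release, which systemd-based distros provide, and prints the matching install command instead of running it:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the install command for the current distro (sketch; does not run it)
pkg_install_cmd() {
    local pkg="$1" id
    id="$(. /etc/os-release && echo "${ID}")"
    case "${id}" in
        debian|ubuntu)      echo "sudo apt install -y ${pkg}" ;;
        fedora|rhel|centos) echo "sudo dnf install -y ${pkg}" ;;
        arch)               echo "sudo pacman -S --noconfirm ${pkg}" ;;
        *)                  echo "install ${pkg} manually (unknown distro: ${id})" ;;
    esac
}

pkg_install_cmd curl
```

Printing rather than executing keeps the helper testable without root; swap `echo` for `eval`-free direct invocation once you trust the mapping.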
General Package Management Tips
# Check which package provides a file
dpkg -S /usr/bin/curl # Debian/Ubuntu
rpm -qf /usr/bin/curl # RHEL/Fedora
# List installed packages
dpkg -l | grep nginx # Debian/Ubuntu
rpm -qa | grep nginx # RHEL/Fedora
# Hold a package (prevent upgrades)
apt-mark hold linux-image-generic # Debian/Ubuntu
dnf versionlock add kernel # RHEL (needs plugin)
# Snap and Flatpak (cross-distro)
snap install code --classic # VS Code via Snap
flatpak install flathub org.gimp.GIMP # GIMP via Flatpak
9. Secure Remote Access with SSH
SSH (Secure Shell) is the standard protocol for managing Linux servers remotely. Configuring SSH correctly is the first line of server security.
# Generate SSH key pair (Ed25519 recommended)
ssh-keygen -t ed25519 -C "dev@example.com"
# Generates:
# ~/.ssh/id_ed25519 (private key — NEVER share)
# ~/.ssh/id_ed25519.pub (public key — copy to servers)
# Copy public key to server
ssh-copy-id -i ~/.ssh/id_ed25519.pub user@192.168.1.100
# Connect
ssh user@192.168.1.100
ssh -p 2222 user@host # custom port
ssh -i ~/.ssh/mykey user@host # specific key
# SSH agent (avoid typing passphrase repeatedly)
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
ssh-add -l # list loaded keys
The SSH Config File
# ~/.ssh/config — host aliases for convenience
Host prod
HostName 10.0.1.50
User deploy
Port 2222
IdentityFile ~/.ssh/id_ed25519
ForwardAgent yes
Host staging
HostName 10.0.1.51
User deploy
Port 2222
IdentityFile ~/.ssh/id_ed25519
Host bastion
HostName bastion.example.com
User admin
IdentityFile ~/.ssh/bastion_key
# Jump through bastion to reach internal servers
Host internal-*
ProxyJump bastion
User deploy
# Usage: ssh prod (instead of ssh -p 2222 deploy@10.0.1.50)
Hardening the SSH Server
# /etc/ssh/sshd_config — security hardening
Port 2222 # change from default 22
PermitRootLogin no # disable root SSH login
PasswordAuthentication no # keys only, no passwords
PubkeyAuthentication yes # enable key-based auth
MaxAuthTries 3 # limit failed attempts
ClientAliveInterval 300 # timeout idle connections
ClientAliveCountMax 2 # disconnect after 2 missed keepalives
AllowUsers deploy admin # whitelist specific users
X11Forwarding no # disable X11 forwarding
AllowTcpForwarding no # disable TCP forwarding
# Apply changes
# sudo sshd -t # test config syntax
# sudo systemctl restart sshd
# Additional: fail2ban for brute-force protection
# apt install fail2ban
# systemctl enable --now fail2ban
SSH Tunnels and Port Forwarding
# Local port forwarding: access remote service on local port
# Forwards localhost:5432 → remote-db-host:5432 through ssh-server
ssh -L 5432:remote-db-host:5432 user@ssh-server
# Remote port forwarding: expose local service to remote
# Makes local:3000 available as remote:8080
ssh -R 8080:localhost:3000 user@remote-server
# Dynamic SOCKS proxy
ssh -D 1080 user@proxy-server
# Then configure browser to use SOCKS5 proxy at localhost:1080
# Persistent tunnel with autossh
autossh -M 20000 -f -N -L 5432:db:5432 user@bastion
10. Firewall Configuration (iptables / ufw / firewalld)
The Linux firewall is central to server security. iptables is the low-level tool; ufw (Ubuntu) and firewalld (RHEL) are friendlier frontends.
iptables
# View current rules
iptables -L -n -v # list all rules (numeric output, verbose)
iptables -L -n --line-numbers # with line numbers for deletion
# Default policy: drop everything, allow only what we specify
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT ACCEPT
# Allow loopback
iptables -A INPUT -i lo -j ACCEPT
# Allow established connections
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# Allow SSH (port 2222)
iptables -A INPUT -p tcp --dport 2222 -j ACCEPT
# Allow HTTP and HTTPS
iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 443 -j ACCEPT
# Rate limit SSH to prevent brute force
iptables -A INPUT -p tcp --dport 2222 -m limit --limit 3/min \
--limit-burst 5 -j ACCEPT
# Block a specific IP
iptables -A INPUT -s 203.0.113.42 -j DROP
# Delete a rule by line number
iptables -D INPUT 3
# Save rules (persist across reboot)
iptables-save > /etc/iptables/rules.v4
# Restore
iptables-restore < /etc/iptables/rules.v4
# Or install: apt install iptables-persistent
ufw (Ubuntu)
# Enable/disable
ufw enable
ufw disable
ufw status verbose
# Default policies
ufw default deny incoming
ufw default allow outgoing
# Allow services
ufw allow ssh # port 22
ufw allow 2222/tcp # custom SSH port
ufw allow http # port 80
ufw allow https # port 443
ufw allow from 10.0.0.0/24 # allow entire subnet
# Allow port range
ufw allow 3000:3010/tcp
# Remove a rule
ufw delete allow http
ufw status numbered # show rule numbers
ufw delete 3 # delete by number
# Application profiles
ufw app list # list known apps
ufw allow "Nginx Full" # allow both HTTP and HTTPS
firewalld (RHEL/CentOS)
# Check status
firewall-cmd --state
firewall-cmd --list-all
# Add services permanently
firewall-cmd --permanent --add-service=http
firewall-cmd --permanent --add-service=https
firewall-cmd --permanent --add-port=3000/tcp
# Remove a service
firewall-cmd --permanent --remove-service=ftp
# Reload to apply
firewall-cmd --reload
# Zone management
firewall-cmd --get-active-zones
firewall-cmd --zone=public --list-all
11. Disk Management
Disk management covers partitioning, filesystem creation, mounting, and Logical Volume Management (LVM). Understanding these is essential for server administration.
# View disk usage
df -h # filesystem usage (human-readable)
df -ih # inode usage
du -sh /var/log # total size of a directory
du -h --max-depth=1 / # size of each top-level directory
# Block devices and partitions
lsblk # tree view of block devices
lsblk -f # show filesystem types
fdisk -l # list all disks and partitions
blkid # show UUIDs and labels
# Partition a new disk (GPT)
gdisk /dev/sdb # interactive GPT partitioning
# Or use parted for scriptable partitioning:
parted /dev/sdb mklabel gpt
parted /dev/sdb mkpart primary ext4 0% 100%
# Create filesystem
mkfs.ext4 /dev/sdb1 # ext4 filesystem
mkfs.xfs /dev/sdb1 # XFS filesystem
# Mount
mount /dev/sdb1 /mnt/data
umount /mnt/data
# Persistent mount via /etc/fstab
# UUID=xxxx-xxxx /mnt/data ext4 defaults,noatime 0 2
# Always test: mount -a (mounts all fstab entries)
Logical Volume Management (LVM)
# LVM layers: Physical Volume (PV) → Volume Group (VG) → Logical Volume (LV)
# Create physical volume
pvcreate /dev/sdb1 /dev/sdc1
# Create volume group
vgcreate data-vg /dev/sdb1 /dev/sdc1
# Create logical volume (50GB)
lvcreate -L 50G -n app-lv data-vg
# Create filesystem and mount
mkfs.ext4 /dev/data-vg/app-lv
mount /dev/data-vg/app-lv /opt/app
# Extend logical volume (add 20GB)
lvextend -L +20G /dev/data-vg/app-lv
resize2fs /dev/data-vg/app-lv # resize ext4
# xfs_growfs /opt/app # resize XFS
# View LVM status
pvs # physical volumes
vgs # volume groups
lvs # logical volumes
lsblk # verify layout
Swap Management
# Check current swap
swapon --show
free -h
# Create a swap file (4GB)
fallocate -l 4G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
# Persist in /etc/fstab
# /swapfile none swap sw 0 0
# Tune swappiness (0-100, lower = prefer RAM)
cat /proc/sys/vm/swappiness # check current
sysctl vm.swappiness=10 # set temporarily
# Persist: echo "vm.swappiness=10" >> /etc/sysctl.conf
12. System Monitoring
Effective monitoring helps you spot problems before they turn into incidents. Linux offers a rich set of built-in and third-party monitoring tools.
CPU and Memory Monitoring
# CPU info
nproc # number of CPUs
lscpu # detailed CPU info
cat /proc/cpuinfo # raw CPU details
# Memory
free -h # memory usage (human-readable)
cat /proc/meminfo # detailed memory info
# Real-time: top / htop
top -bn1 | head -20 # batch mode, one snapshot
htop # interactive (install: apt install htop)
# vmstat: CPU, memory, IO at intervals
vmstat 2 5 # every 2 seconds, 5 iterations
# Output columns:
# r = runnable processes
# b = blocked processes
# si = swap in, so = swap out (non-zero = memory pressure)
# us = user CPU, sy = system CPU, id = idle, wa = IO wait
# mpstat: per-CPU usage
mpstat -P ALL 2 # all CPUs every 2 seconds
# sar: historical performance data
sar -u 2 10 # CPU usage every 2s, 10 times
sar -r 2 10 # memory usage
sar -b 2 10 # IO activity
Disk I/O Monitoring
# iostat: disk throughput and latency
iostat -xz 2 # extended stats every 2 seconds
# Key columns:
# %util = how busy the device is (>80% = bottleneck)
# await = avg IO wait time in ms
# r/s,w/s = reads/writes per second
# iotop: per-process IO usage (needs root)
iotop -o # only show processes doing IO
# Check for disk errors
dmesg | grep -i error
smartctl -a /dev/sda # SMART data (install: smartmontools)
Network Monitoring
# Per-process network usage
nethogs # like top for network (install: apt install nethogs)
# Per-interface bandwidth
iftop -i eth0 # real-time bandwidth by connection
nload # simple bandwidth graph per interface
# Connection statistics
ss -s # summary: total, TCP, UDP connections
ss -tnp state established # all established TCP with process info
# Packet capture
tcpdump -i eth0 -n port 80 -c 100 # capture 100 HTTP packets
tcpdump -i any -w trace.pcap # write to file for Wireshark
Log Analysis
# System logs
journalctl -p err --since "1 hour ago" # errors in last hour
journalctl -u nginx --since today -f # follow nginx logs today
dmesg -T --level=err,warn # kernel errors/warnings
# Traditional log files
tail -f /var/log/syslog # follow system log
tail -f /var/log/auth.log # authentication events
tail -f /var/log/nginx/error.log # web server errors
# Useful log analysis one-liners
# Top 10 IPs hitting the server
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head
# Failed SSH login attempts
grep "Failed password" /var/log/auth.log | awk '{print $11}' | sort | uniq -c | sort -rn
# Log rotation config
# /etc/logrotate.d/myapp
# /var/log/myapp/*.log {
# daily
# rotate 14
# compress
# delaycompress
# missingok
# notifempty
# create 0640 deploy deploy
# postrotate
# systemctl reload myapp
# endscript
# }
An All-in-One Monitoring Script
#!/usr/bin/env bash
# Quick server health check
set -euo pipefail
echo "=== System Info ==="
hostname
uptime
uname -r
echo ""
echo "=== CPU Load ==="
cat /proc/loadavg
echo "CPUs: $(nproc)"
echo ""
echo "=== Memory ==="
free -h | head -2
echo ""
echo "=== Disk Usage (>80%) ==="
df -h | awk 'NR==1 || int($5)>80'
echo ""
echo "=== Top 5 CPU Consumers ==="
ps aux --sort=-%cpu | head -6
echo ""
echo "=== Top 5 Memory Consumers ==="
ps aux --sort=-%mem | head -6
echo ""
echo "=== Failed Services ==="
systemctl --failed --no-pager
echo ""
echo "=== Recent Errors (last 30 min) ==="
journalctl -p err --since "30 minutes ago" --no-pager | tail -10
13. Practical Tips and Quick Reference
The Text-Processing Trio: grep, sed, awk
# grep — search text
grep -rn "TODO" src/ # recursive search with line numbers
grep -E "error|fail" app.log # extended regex (OR)
grep -c "200" access.log # count matches
grep -v "^#" config.conf # exclude comment lines
# sed — stream editor
sed -i 's/old/new/g' file.txt # in-place global replace
sed -n '10,20p' file.txt # print lines 10-20
sed '/^$/d' file.txt # delete empty lines
# awk — columnar data processing
awk '{print $1, $3}' data.txt # print columns 1 and 3
awk -F: '{print $1}' /etc/passwd # colon delimiter, usernames
awk '$3 > 1000' data.txt # filter rows where col3 > 1000
awk '{sum += $2} END {print sum}' data # sum column 2
I/O Redirection and Pipes
# Standard streams: stdin(0), stdout(1), stderr(2)
command > file.txt # redirect stdout (overwrite)
command >> file.txt # redirect stdout (append)
command 2> error.log # redirect stderr
command &> all.log # redirect both stdout and stderr
command 2>&1 # merge stderr into stdout
# Pipes: connect stdout of one command to stdin of next
cat access.log | grep "POST" | awk '{print $1}' | sort -u | wc -l
# Process substitution
diff <(sort file1.txt) <(sort file2.txt)
# tee: write to file AND stdout simultaneously
command | tee output.log # display and save
command | tee -a output.log # display and append
Environment Variables and Shell Configuration
# View and set environment variables
env # list all env vars
echo "${PATH}" # print specific variable
export MY_VAR="hello" # set for this session + child processes
# Persist across sessions
# ~/.bashrc — interactive non-login shells
# ~/.bash_profile — login shells (SSH, new terminal)
# ~/.profile — POSIX, read by many shells
# /etc/environment — system-wide variables
# Common PATH modification
export PATH="${HOME}/.local/bin:${PATH}"
# Useful aliases (add to ~/.bashrc)
alias ll="ls -lah"
alias gs="git status"
alias dc="docker compose"
alias k="kubectl"
# Reload after editing
source ~/.bashrc
Conclusion
Linux is an indispensable part of every developer's toolbox. From the filesystem hierarchy to permissions, from process control to systemd services, from shell scripting to firewall configuration, mastering these core skills pays off daily in development, deployment, and operations.
Start with the basic commands and work through each topic in turn. Hands-on practice on a real server is the best way to learn: spin up a VM or cloud instance, run every command in this guide yourself, and you will quickly build solid Linux administration skills.
Frequently Asked Questions
What is the Linux file system hierarchy and why does it matter?
The Linux Filesystem Hierarchy Standard (FHS) organizes files into directories like /bin (essential binaries), /etc (configuration files), /home (user directories), /var (variable data like logs), /tmp (temporary files), and /usr (user programs). Understanding this layout is essential because every tool, service, and config follows these conventions, making system administration predictable.
How do Linux file permissions work with chmod and chown?
Linux permissions are defined for three entities: owner, group, and others. Each can have read (r=4), write (w=2), and execute (x=1) permissions. chmod changes permissions using octal (chmod 755 file) or symbolic (chmod u+x file) notation. chown changes ownership (chown user:group file). Special permissions include setuid (4xxx), setgid (2xxx), and sticky bit (1xxx).
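To check the octal arithmetic on a real file, GNU stat can print both notations side by side (a minimal sketch using a throwaway temp file):

```shell
# rwx r-x r-- = (4+2+1)(4+0+1)(4+0+0) = 754
f="$(mktemp)"
chmod 754 "$f"
stat -c '%A %a' "$f"   # symbolic and octal views of the same mode
rm -f "$f"
```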
How do I manage processes in Linux with ps, top, and kill?
Use ps aux to list all running processes, top or htop for real-time monitoring, and kill to send signals. kill -15 PID sends SIGTERM for graceful shutdown, kill -9 PID sends SIGKILL for forced termination. Use bg/fg to manage background jobs, and nohup or disown to keep processes running after logout.
What is systemd and how do I create a service?
systemd is the init system and service manager for most modern Linux distributions. It manages services (daemons), handles boot order, and provides logging via journalctl. Create a .service unit file in /etc/systemd/system/, define ExecStart, then run systemctl daemon-reload, systemctl enable, and systemctl start to activate it.
How do cron jobs work in Linux?
Cron is a time-based job scheduler. Edit your crontab with crontab -e. The syntax is: minute hour day-of-month month day-of-week command. For example, 0 2 * * * /scripts/backup.sh runs at 2:00 AM daily. Use @reboot for startup tasks, @hourly, @daily, @weekly for convenience. Check logs at /var/log/cron or via journalctl -u cron.
How do I configure SSH for secure remote access?
Generate an SSH key pair with ssh-keygen -t ed25519. Copy the public key to the server with ssh-copy-id user@host. Harden security by editing /etc/ssh/sshd_config: disable root login (PermitRootLogin no), disable password auth (PasswordAuthentication no), and change the default port. Use ssh-agent for key management and ~/.ssh/config for host aliases.
How do I use iptables or firewalld to configure a Linux firewall?
iptables uses chains (INPUT, OUTPUT, FORWARD) and rules to filter packets. Example: iptables -A INPUT -p tcp --dport 22 -j ACCEPT allows SSH. For persistence, use iptables-save/iptables-restore or install iptables-persistent. firewalld (on RHEL/CentOS) uses zones and provides firewall-cmd for dynamic management. ufw (on Ubuntu) offers a simpler frontend.
What are the best Linux system monitoring tools?
For real-time monitoring: top/htop for processes, vmstat/sar for CPU/memory/IO, iotop for disk IO, nethogs/iftop for network. For disk usage: df -h for filesystem usage, du -sh for directory sizes, lsblk for block devices. For logs: journalctl for systemd logs, dmesg for kernel messages. Production setups often use Prometheus + Grafana, Netdata, or Zabbix.