Exploring the Performance Limits of AlmaLinux: A Practical Tuning Guide from Kernel Parameters to the Application Layer

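Introduction

As a successor to CentOS, AlmaLinux inherits the stability and enterprise features of RHEL and is widely used on servers. Its default configuration, however, targets general-purpose use and often leaves hardware performance unused. This guide walks through tuning at every level, from kernel parameters to the application layer, to help you get the most out of an AlmaLinux system.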

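1. Performance Tuning Basics: Monitoring and Benchmarking

1.1 Common Monitoring Tools

Establish a performance baseline before changing anything. The following monitoring tools are commonly used on AlmaLinux: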

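# Install the sysstat package
sudo dnf install sysstat -y

# Start the sysstat service and enable it at boot
sudo systemctl enable --now sysstat

# CPU utilization (sar)
sar -u 1 5  # sample once per second, 5 samples

# Memory usage
sar -r 1 5

# Disk I/O
sar -d 1 5

# Interactive resource monitor (htop is packaged in EPEL, so enable EPEL first)
sudo dnf install epel-release -y
sudo dnf install htop -y
htop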
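
A baseline is easier to compare against when it is stored as numbers rather than read off a screen. The sketch below records one such number, the percentage of memory in use, from a meminfo-style file. The function name `mem_used_pct` and the choice of MemAvailable as the "free" figure are assumptions made for illustration; neither comes from sysstat.

```shell
#!/bin/bash
# mem_used_pct [FILE]: percentage of memory in use, computed from a
# /proc/meminfo-style file as (MemTotal - MemAvailable) / MemTotal.
# Hypothetical helper for recording a baseline; not part of sysstat.
mem_used_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' "${1:-/proc/meminfo}"
}

# Example: append a timestamped sample to a baseline file in the current directory
# echo "$(date +%F_%T) mem_used_pct=$(mem_used_pct)" >> baseline.txt
```

Collecting the same sample before and after each tuning step gives a simple before/after comparison.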

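1.2 Benchmarking Tools

# CPU benchmark (install stress-ng)
sudo dnf install epel-release -y
sudo dnf install stress-ng -y

# CPU stress test (4 workers, 60 seconds)
stress-ng --cpu 4 --timeout 60s --metrics-brief

# Disk benchmark (install fio)
sudo dnf install fio -y

# Sequential 4 KiB write test against a 1 GB file in the current directory.
# --direct=1 bypasses the page cache so the result reflects the device rather than RAM.
fio --name=write_test --ioengine=libaio --direct=1 --rw=write --bs=4k --size=1G --numjobs=1 --runtime=60 --group_reporting

# Network benchmark (install iperf3)
sudo dnf install iperf3 -y

# Server side
iperf3 -s

# Client side (replace <server_ip> with the server's address)
iperf3 -c <server_ip> -t 30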

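2. Kernel Parameter Tuning

2.1 Virtual Memory Management

Virtual memory behavior has a large effect on system performance. The parameters below are starting points that should be adapted to the workload: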

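# Show the current values
sysctl vm.swappiness
sysctl vm.vfs_cache_pressure

# Temporary change (lost on reboot)
sudo sysctl -w vm.swappiness=10
sudo sysctl -w vm.vfs_cache_pressure=50

# Persistent change (append to /etc/sysctl.conf).
# sysctl.conf only treats lines that start with # or ; as comments,
# so every comment sits on its own line.
sudo tee -a /etc/sysctl.conf << EOF
# Virtual memory tuning
# Swap less eagerly and prefer keeping pages in RAM
vm.swappiness = 10
# Hold on to dentry and inode caches longer
vm.vfs_cache_pressure = 50
# Writers are throttled and must flush once dirty pages reach 15% of memory
vm.dirty_ratio = 15
# Background writeback starts once dirty pages reach 5% of memory
vm.dirty_background_ratio = 5
# Dirty pages older than 30 seconds become eligible for writeback
vm.dirty_expire_centisecs = 3000
# The flusher threads wake up every 5 seconds
vm.dirty_writeback_centisecs = 500
EOF

# Apply the configuration
sudo sysctl -p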

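Tuning notes:

vm.swappiness=10: on database servers, a lower value keeps the working set in RAM and avoids swap-induced latency
vm.dirty_ratio=15: trades write latency against the size of the write bursts the system has to absorb
vm.dirty_expire_centisecs=3000: limits how long dirty pages can sit in memory before they are written back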
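
To make the percentages concrete, it helps to convert them into megabytes for a given machine. The following sketch does that arithmetic. `dirty_threshold_mib` is a hypothetical helper, and it uses total RAM as the base for simplicity; the kernel applies the ratios to free plus reclaimable memory, so the real thresholds are somewhat lower.

```shell
#!/bin/bash
# dirty_threshold_mib TOTAL_KB RATIO: approximate amount of dirty data (MiB)
# that corresponds to a vm.dirty_*_ratio percentage of TOTAL_KB.
# Hypothetical helper for illustration.
dirty_threshold_mib() {
    local total_kb=$1 ratio=$2
    echo $(( total_kb * ratio / 100 / 1024 ))
}

# On this machine, using the ratios from the section above:
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "background writeback starts near $(dirty_threshold_mib "$total_kb" 5) MiB"
echo "writers are throttled near $(dirty_threshold_mib "$total_kb" 15) MiB"
```

On hosts with a lot of RAM these figures can reach several gigabytes, which is the usual argument for lowering the ratios on large-memory systems.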

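2.2 Network Performance Tuning

# Show the current network parameters
sysctl net.core.rmem_max
sysctl net.core.wmem_max

# Network tuning configuration (comments on their own lines; sysctl.conf has no inline comments)
sudo tee -a /etc/sysctl.conf << EOF
# Socket buffer tuning
# Maximum receive / send buffer an application may request (128 MB)
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
# Default receive / send buffer for non-TCP sockets (64 MB; lower this on memory-constrained hosts)
net.core.rmem_default = 67108864
net.core.wmem_default = 67108864
# TCP receive / send buffer in bytes (min, default, max)
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
# net.ipv4.tcp_mem is measured in pages, not bytes, and the kernel sizes it from RAM at boot.
# It is left at its default here: 134217728 pages would mean 512 GB with 4 KiB pages.
# Backlog of received packets waiting for the network stack
net.core.netdev_max_backlog = 300000
# Upper bound on the accept queue of a listening socket
net.core.somaxconn = 65535
# Length of the SYN (half-open connection) queue
net.ipv4.tcp_max_syn_backlog = 65535
# Protect against SYN floods
net.ipv4.tcp_syncookies = 1
# Allow TIME_WAIT sockets to be reused for new outgoing connections
net.ipv4.tcp_tw_reuse = 1
# Seconds an orphaned connection stays in FIN_WAIT_2
net.ipv4.tcp_fin_timeout = 30
# Idle seconds before the first keepalive probe
net.ipv4.tcp_keepalive_time = 600
# Unanswered probes before the connection is dropped
net.ipv4.tcp_keepalive_probes = 3
# Seconds between keepalive probes
net.ipv4.tcp_keepalive_intvl = 10
EOF

# Apply the configuration
sudo sysctl -p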

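Tuning notes:

For high-concurrency web servers, increase buffer sizes and queue lengths
net.ipv4.tcp_tw_reuse=1: with many short-lived outgoing connections (for example a proxy talking to its backends), this reduces the number of sockets held in TIME_WAIT
net.ipv4.tcp_keepalive_*: probe idle long-lived connections so that dead peers are detected and their sockets are released instead of leaking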
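
The queue-length settings interact with what the application asks for: the backlog an application passes to listen() is silently capped at net.core.somaxconn, so raising only one of the two has no effect. A small sketch of that rule follows; `effective_backlog` is a hypothetical helper, not a system tool.

```shell
#!/bin/bash
# effective_backlog REQUESTED SOMAXCONN: accept-queue length the kernel will
# actually use, i.e. the smaller of the application's listen() backlog and
# net.core.somaxconn. Hypothetical helper for illustration.
effective_backlog() {
    local requested=$1 somaxconn=$2
    if [ "$requested" -lt "$somaxconn" ]; then
        echo "$requested"
    else
        echo "$somaxconn"
    fi
}

# Example: an application asking for 65535 on a host still at somaxconn=4096
effective_backlog 65535 4096
```

On a live system, `sysctl -n net.core.somaxconn` gives the second argument, and for listening sockets the Send-Q column of `ss -ltn` shows the backlog that is actually in effect.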

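2.3 File System Tuning

# Show the current file handle parameters
sysctl fs.file-max
sysctl fs.nr_open

# File handle and inotify tuning (comments on their own lines; sysctl.conf has no inline comments)
sudo tee -a /etc/sysctl.conf << EOF
# File system tuning
# System-wide maximum number of open file handles
fs.file-max = 2097152
# Ceiling for the per-process open file limit (nofile)
fs.nr_open = 2097152
# Maximum inotify watches per user
fs.inotify.max_user_watches = 524288
# Maximum inotify instances per user
fs.inotify.max_user_instances = 1024
# System-wide maximum number of concurrent asynchronous I/O requests
fs.aio-max-nr = 1048576
EOF

# Apply the configuration
sudo sysctl -p

# Raise the per-process limits for login sessions
sudo tee -a /etc/security/limits.conf << EOF
* soft nofile 1048576
* hard nofile 1048576
* soft nproc 65535
* hard nproc 65535
EOF

# Takes effect at the next login. limits.conf is applied by PAM, so services started by
# systemd need LimitNOFILE= / LimitNPROC= in their unit files instead.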
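
Because limits.conf only affects new login sessions, it is worth checking what a running process actually received. The sketch below reads the soft open-file limit from /proc; the function name `nofile_soft_limit` is made up for this example.

```shell
#!/bin/bash
# nofile_soft_limit PID: soft "Max open files" limit of a running process,
# read from /proc/PID/limits. Hypothetical helper for illustration.
nofile_soft_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/${1:-self}/limits"
}

# Limit of the current shell; this should match `ulimit -Sn`
nofile_soft_limit $$
```

Passing the PID of a service instead, for example `nofile_soft_limit "$(pgrep -o nginx)"`, shows whether the service picked up the intended limit.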

2.4 CPU Governor and tuned Profiles

# Show the current CPU frequency governor
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# Install tuned (a profile-based system tuning daemon)
sudo dnf install tuned -y
sudo systemctl enable --now tuned

# List the available profiles
tuned-adm list

# Apply ONE profile that matches the workload
sudo tuned-adm profile throughput-performance  # high throughput
# or
sudo tuned-adm profile latency-performance     # low latency
# or
sudo tuned-adm profile virtual-guest           # virtual machine guest

# Custom profile. Kernel parameters belong in the [sysctl] section;
# the [vm] section only controls transparent huge pages.
sudo mkdir -p /etc/tuned/custom-profile
sudo tee /etc/tuned/custom-profile/tuned.conf << EOF
[main]
include=throughput-performance

[cpu]
governor=performance
min_perf_pct=100

[sysctl]
vm.swappiness=10
EOF

# Apply the custom profile
sudo tuned-adm profile custom-profile

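3. Storage Tuning

3.1 Choosing an I/O Scheduler

# Show the available schedulers; the active one is shown in brackets
cat /sys/block/sda/queue/scheduler

# List all block devices
lsblk

# Set the scheduler temporarily (pick ONE). AlmaLinux 8 and 9 use the multi-queue block layer,
# so the legacy noop and deadline names no longer exist; the choices are none, mq-deadline, kyber and bfq.
echo none | sudo tee /sys/block/sda/queue/scheduler         # recommended for SSDs
echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler  # recommended for rotational disks
echo bfq | sudo tee /sys/block/sda/queue/scheduler          # general-purpose option

# Persistent setting (udev rule)
sudo tee /etc/udev/rules.d/60-ioscheduler.rules << EOF
# Non-rotational devices (SSD): none
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"
# Rotational devices (HDD): mq-deadline
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="mq-deadline"
EOF

# Reload the udev rules
sudo udevadm control --reload-rules
sudo udevadm trigger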
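
The scheduler file lists every available scheduler and marks the active one with square brackets, which is awkward to use in scripts. One possible way to extract it is sketched below, with `active_scheduler` as an illustrative name.

```shell
#!/bin/bash
# active_scheduler LINE: print the scheduler shown in [brackets] in the
# contents of /sys/block/<dev>/queue/scheduler. Hypothetical helper.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report the active scheduler of every block device that exposes one
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    echo "${dev%%/*}: $(active_scheduler "$(cat "$f")")"
done
```

Running the loop after `udevadm trigger` is a quick way to confirm that the udev rule was applied to each disk.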

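3.2 File System Mount Options

# Show the current mount options
mount | grep -E "ext4|xfs"

# XFS tuning (recommended for highly concurrent workloads)
sudo tee -a /etc/fstab << EOF
/dev/sdb1 /data xfs defaults,noatime,nodiratime,logbufs=8,logbsize=256k,allocsize=64k 0 0
EOF

# ext4 alternative (use it instead of the XFS entry; two entries cannot share /data).
# data=writeback combined with barrier=0 can corrupt data on power loss, so only use it
# on storage with a battery-backed write cache.
sudo tee -a /etc/fstab << EOF
/dev/sdc1 /data ext4 defaults,noatime,nodiratime,data=writeback,barrier=0 0 0
EOF

# Remount to pick up noatime. ext4 refuses to change the data= journaling mode
# on a remount, so that option needs a full umount and mount.
sudo mount -o remount /data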

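Tuning notes:

noatime,nodiratime: skip access-time updates to save I/O (noatime already implies nodiratime, so the second option is redundant but harmless)
data=writeback: ext4 writeback mode journals metadata only, which speeds up writes at the cost of weaker data consistency after a crash
logbufs=8,logbsize=256k: larger in-memory XFS log buffers, which helps metadata-heavy workloads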
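
An fstab entry only states intent; what matters is the option list of the live mount. The sketch below checks a comma-separated option string for an exact option name. `has_mount_option` is a hypothetical helper; `findmnt` in the usage comment is part of util-linux.

```shell
#!/bin/bash
# has_mount_option OPTIONS NAME: succeed if NAME is one of the entries in a
# comma-separated mount option string. Hypothetical helper for illustration.
has_mount_option() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example: check the live options of /data
# opts=$(findmnt -no OPTIONS /data)
# has_mount_option "$opts" noatime && echo "noatime is active"
```

Wrapping the string in commas before matching prevents partial matches, so `relatime` is never mistaken for `noatime`.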

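3.3 LVM Tuning

# Show the LVM configuration
sudo lvdisplay
sudo vgdisplay

# Set the read-ahead of a logical volume (in 512-byte sectors; mainly useful for rotational disks)
sudo lvchange --readahead 256 /dev/vg0/lv_data

# Create a logical volume striped across 4 physical volumes (for SSD arrays)
sudo lvcreate -L 1T -n lv_data -i 4 vg0 /dev/sdb /dev/sdc /dev/sdd /dev/sde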

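4. Network Tuning in Practice

4.1 TCP/IP Stack Tuning

# Create the tuning script
sudo tee /usr/local/bin/network-tuning.sh << 'EOF'
#!/bin/bash
# Network performance tuning script

IFACE="${1:-eth0}"  # interface name; pass yours as the first argument

echo "Applying network tuning parameters..."

# TCP tuning
sysctl -w net.ipv4.tcp_sack=1
sysctl -w net.ipv4.tcp_window_scaling=1
sysctl -w net.ipv4.tcp_timestamps=1
sysctl -w net.ipv4.tcp_congestion_control=cubic
sysctl -w net.ipv4.tcp_slow_start_after_idle=0

# Size the TCP buffers for the bandwidth-delay product (BDP), assuming a 100 ms RTT.
# The speed file is in Mbit/s: bytes/s = speed * 10^6 / 8, and 100 ms is 1/10 s.
SPEED=$(cat "/sys/class/net/$IFACE/speed" 2>/dev/null)
if [ "${SPEED:-0}" -gt 0 ]; then
    BDP=$((SPEED * 1000 * 1000 / 8 / 10))
    sysctl -w net.ipv4.tcp_rmem="4096 87380 $BDP"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 $BDP"
else
    echo "Link speed of $IFACE is unknown; leaving tcp_rmem/tcp_wmem unchanged"
fi

# UDP tuning (video streaming and similar)
sysctl -w net.core.rmem_max=26214400
sysctl -w net.core.wmem_max=26214400

echo "Network tuning complete"
EOF

sudo chmod +x /usr/local/bin/network-tuning.sh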
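
The bandwidth-delay product is the amount of data that can be in flight on a link, and it sets the buffer size a single TCP connection needs to fill that link. The arithmetic is worked through below as a sketch; `bdp_bytes` is a hypothetical helper, and the RTT has to be measured (for example with ping) rather than assumed.

```shell
#!/bin/bash
# bdp_bytes SPEED_MBIT RTT_MS: bandwidth-delay product in bytes.
#   bytes/s = SPEED_MBIT * 10^6 / 8   = SPEED_MBIT * 125000
#   BDP     = bytes/s * RTT_MS / 1000 = SPEED_MBIT * 125 * RTT_MS
# Hypothetical helper for illustration.
bdp_bytes() {
    local speed_mbit=$1 rtt_ms=$2
    echo $(( speed_mbit * 125 * rtt_ms ))
}

bdp_bytes 1000 100    # 1 Gbit/s at 100 ms RTT
bdp_bytes 10000 20    # 10 Gbit/s at 20 ms RTT
```

The two calls print 12500000 and 25000000, that is 12.5 MB and 25 MB, which shows why the default maximum buffers are too small for fast long-distance links.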

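4.2 High-Concurrency Web Server Tuning

# Example Nginx tuning configuration (this overwrites /etc/nginx/nginx.conf; back it up first)
sudo tee /etc/nginx/nginx.conf << 'EOF'
user nginx;
worker_processes auto;  # one worker per CPU core
worker_rlimit_nofile 1048576;  # open file limit for worker processes

events {
    worker_connections 65535;  # maximum connections per worker
    use epoll;  # efficient Linux event notification
    multi_accept on;  # accept all pending connections on each wakeup
}

http {
    # Basic settings
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    keepalive_requests 1000;

    # Buffer settings
    client_body_buffer_size 128k;
    client_max_body_size 10m;
    client_header_buffer_size 1k;
    large_client_header_buffers 4 8k;

    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types
        text/plain
        text/css
        text/xml
        text/javascript
        application/javascript
        application/xml+rss
        application/json;

    # File descriptor cache
    open_file_cache max=10000 inactive=30s;
    open_file_cache_valid 60s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;

    # Backend pool used by /api/ below (placeholder address; point it at your application)
    upstream backend {
        server 127.0.0.1:8080;
        keepalive 64;
    }

    # Virtual host
    server {
        listen 80 backlog=65535;  # larger accept queue (capped by net.core.somaxconn)
        server_name _;

        # Static asset caching
        location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
            expires 1y;
            add_header Cache-Control "public, immutable";
        }

        # API endpoints
        location /api/ {
            proxy_pass http://backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            proxy_buffering off;  # stream API responses without buffering
            proxy_read_timeout 300s;
        }
    }
}
EOF

# Per-user limits for nginx. These are not applied when nginx runs under systemd;
# worker_rlimit_nofile above is what raises the workers' limit in that case.
sudo tee -a /etc/security/limits.conf << EOF
nginx soft nofile 1048576
nginx hard nofile 1048576
EOF

# Validate the configuration, then restart Nginx
sudo nginx -t && sudo systemctl restart nginx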
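
The worker settings translate into a rough ceiling on simultaneous clients. The sketch below works that out as worker_processes multiplied by worker_connections, halved when Nginx proxies to a backend because each client then also occupies an upstream connection. `nginx_max_clients` is a hypothetical helper and the result is a rule of thumb, not a figure Nginx reports.

```shell
#!/bin/bash
# nginx_max_clients WORKERS CONNECTIONS [proxy]: rough upper bound on
# simultaneous clients. In proxy mode each client also consumes an upstream
# connection, so the bound is halved.
# Hypothetical helper; a rule of thumb, not an nginx feature.
nginx_max_clients() {
    local workers=$1 connections=$2 mode=${3:-static}
    local total=$(( workers * connections ))
    if [ "$mode" = "proxy" ]; then
        total=$(( total / 2 ))
    fi
    echo "$total"
}

nginx_max_clients 8 65535          # static content on 8 cores
nginx_max_clients 8 65535 proxy    # reverse proxy on 8 cores
```

In practice memory, file descriptors and the backend usually become the limit long before this ceiling is reached.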

4.3 Database Server Tuning (MySQL Example)

# MySQL 8.0 tuning configuration
sudo tee /etc/my.cnf.d/server.cnf << 'EOF'
[mysqld]
# Basic settings
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysql/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

# Memory
innodb_buffer_pool_size = 5G  # about 70% of RAM on a dedicated 8 GB host is 5.6 GB; must be a literal size, rounded here to a whole GB
innodb_buffer_pool_instances = 8  # adjust to the number of CPU cores
innodb_log_file_size = 2G
innodb_log_buffer_size = 64M
innodb_flush_log_at_trx_commit = 2  # faster commits; up to about one second of transactions can be lost in an OS crash
innodb_flush_method = O_DIRECT  # bypass the OS page cache for data files

# Connections
max_connections = 2000
max_connect_errors = 100000
thread_cache_size = 100
table_open_cache = 4000

# Query cache (removed in MySQL 8.0; use another caching layer)
# query_cache_type = 0

# Slow query log (adds some overhead; turn it off in production if nobody analyzes it)
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 2

# Temporary tables
tmp_table_size = 256M
max_heap_table_size = 256M

# Tablespaces and flushing
innodb_file_per_table = 1
innodb_flush_neighbors = 0  # SSD optimization

# Concurrency
innodb_thread_concurrency = 0  # 0 means no limit
innodb_read_io_threads = 8
innodb_write_io_threads = 8

# Replication (if replication is used)
server-id = 1
log_bin = mysql-bin
binlog_format = ROW
binlog_expire_logs_seconds = 604800  # 7 days; replaces expire_logs_days, which is deprecated in 8.0
EOF

# System parameters for MySQL. These keys repeat earlier entries in /etc/sysctl.conf;
# the last occurrence wins, so vm.swappiness ends up as 1.
sudo tee -a /etc/sysctl.conf << EOF
# System parameters for MySQL
vm.swappiness = 1
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
fs.file-max = 2097152
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
EOF

sudo sysctl -p

# Restart MySQL
sudo systemctl restart mysqld
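
MySQL does not accept a percentage for innodb_buffer_pool_size, so the 70% guideline has to be turned into a concrete size. The sketch below does the conversion; `buffer_pool_mib` is a hypothetical helper, and the 70% figure assumes a host dedicated to the database.

```shell
#!/bin/bash
# buffer_pool_mib TOTAL_KB PCT: PCT percent of TOTAL_KB in whole MiB, suitable
# for writing as innodb_buffer_pool_size = <n>M. Hypothetical helper.
buffer_pool_mib() {
    local total_kb=$1 pct=$2
    echo $(( total_kb * pct / 100 / 1024 ))
}

# 70% of this machine's RAM
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
echo "innodb_buffer_pool_size = $(buffer_pool_mib "$total_kb" 70)M"
```

For exactly 8 GiB of RAM the helper yields 5734, matching the 5.6 GB example. MySQL rounds the configured value up to a multiple of innodb_buffer_pool_chunk_size times innodb_buffer_pool_instances, so the effective size can be slightly larger than the number written in the file.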

5. Application-Layer Tuning

5.1 Java Application Tuning

# Example JVM options for a server with 8 GB of RAM (JDK 11 or later).
# The heap is fixed with -Xms/-Xmx, so the RAM-percentage flags are omitted (an explicit -Xmx overrides them).
# -XX:+UseCGroupMemoryLimitForHeap and the -XX:+PrintGC* / -Xloggc flags belong to JDK 8;
# current JVMs detect container limits by default and log GC activity through -Xlog.
sudo tee /etc/default/tomcat << 'EOF'
JAVA_OPTS="-Xms4G -Xmx4G -XX:+UseG1GC -XX:MaxGCPauseMillis=200
-XX:MaxMetaspaceSize=256M -XX:MetaspaceSize=128M
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/tomcat/heapdump.hprof
-Xlog:gc*:file=/var/log/tomcat/gc.log:time,uptime:filecount=10,filesize=100M
-XX:+UseStringDeduplication -XX:+OptimizeStringConcat
-XX:+UseCompressedOops -XX:+UseCompressedClassPointers
-XX:ReservedCodeCacheSize=256M -XX:InitialCodeCacheSize=128M"
EOF

# High-concurrency web application (Spring Boot, JDK 11 or later).
# -XX:+DisableExplicitGC is dropped: it turns System.gc() into a no-op,
# which would cancel out -XX:+ExplicitGCInvokesConcurrent.
sudo tee /etc/default/springboot << 'EOF'
JAVA_OPTS="
-Xms2G -Xmx2G
-XX:+UseG1GC
-XX:MaxGCPauseMillis=100
-XX:InitiatingHeapOccupancyPercent=45
-XX:ConcGCThreads=4
-XX:ParallelGCThreads=4
-XX:MaxTenuringThreshold=15
-XX:+UseStringDeduplication
-XX:+UseCompressedOops
-XX:+UseCompressedClassPointers
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/springboot/heapdump.hprof
-Xlog:gc*:file=/var/log/springboot/gc.log:time,uptime:filecount=10,filesize=100M"
EOF
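
JVM option strings are often copied between JDK generations, and several GC logging and container flags from JDK 8 were removed in later releases, where the JVM rejects or ignores them. The sketch below scans a list of options for a few of these. `legacy_jvm_flags` is a hypothetical helper and its list is a hand-picked subset, not an exhaustive catalogue.

```shell
#!/bin/bash
# legacy_jvm_flags OPTION...: print the options that belong to JDK 8 and were
# removed in later JDK releases. The list is a hand-picked subset.
# Hypothetical helper for illustration.
legacy_jvm_flags() {
    local opt
    for opt in "$@"; do
        case "$opt" in
            -XX:+PrintGCDateStamps|-XX:+PrintGCTimeStamps) printf '%s\n' "$opt" ;;
            -XX:+UseGCLogFileRotation|-XX:NumberOfGCLogFiles=*|-XX:GCLogFileSize=*) printf '%s\n' "$opt" ;;
            -XX:+UseCGroupMemoryLimitForHeap) printf '%s\n' "$opt" ;;
        esac
    done
}

# Example: pass the options as separate words
legacy_jvm_flags -Xms2G -XX:+UseG1GC -XX:+PrintGCDateStamps -XX:GCLogFileSize=100M
```

The example prints the two JDK 8 flags, one per line. On JDK 9 and later, GC logging is configured with `-Xlog:gc*` instead.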

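5.2 Python Application Tuning

# Gunicorn configuration (for Django/Flask applications)
sudo mkdir -p /etc/gunicorn /var/log/gunicorn
sudo tee /etc/gunicorn/config.py << 'EOF'
# Gunicorn configuration file
import multiprocessing

# Basic settings
bind = "0.0.0.0:8000"
workers = multiprocessing.cpu_count() * 2 + 1  # commonly recommended formula
worker_class = "gevent"  # cooperative async workers (requires the gevent package)
worker_connections = 1000  # maximum concurrent connections per worker
timeout = 30
keepalive = 2

# Performance
preload_app = True  # load the app before forking so workers share memory pages
max_requests = 1000  # recycle a worker after 1000 requests
max_requests_jitter = 50  # random jitter so workers do not all restart at once

# Logging
accesslog = "/var/log/gunicorn/access.log"
errorlog = "/var/log/gunicorn/error.log"
loglevel = "info"

# Process
daemon = False
pidfile = "/var/run/gunicorn.pid"
umask = 0o007
# www-data is a Debian account and does not exist on AlmaLinux by default;
# nginx is used here as an example of an existing service account.
user = "nginx"
group = "nginx"

# Miscellaneous
reuse_port = True  # set SO_REUSEPORT on the listening socket
limit_request_line = 4096
limit_request_fields = 100
limit_request_field_size = 8190
EOF

# Start command
gunicorn -c /etc/gunicorn/config.py myapp:app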
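
The 2n+1 formula only looks at CPU cores, yet on small hosts memory often runs out first. The sketch below computes the formula and optionally caps the result by a memory budget. The cap and the name `gunicorn_workers` are assumptions added for illustration; Gunicorn itself only suggests the 2n+1 guideline.

```shell
#!/bin/bash
# gunicorn_workers CORES [MEM_MIB PER_WORKER_MIB]: the 2n+1 rule, optionally
# capped by how many workers fit into a memory budget. The memory cap is an
# assumption added for illustration.
gunicorn_workers() {
    local cores=$1 mem_mib=${2:-0} per_worker_mib=${3:-0}
    local workers=$(( cores * 2 + 1 ))
    if [ "$mem_mib" -gt 0 ] && [ "$per_worker_mib" -gt 0 ]; then
        local fit=$(( mem_mib / per_worker_mib ))
        if [ "$fit" -lt "$workers" ]; then
            workers=$fit
        fi
    fi
    # Never go below a single worker
    if [ "$workers" -lt 1 ]; then
        workers=1
    fi
    echo "$workers"
}

gunicorn_workers 4              # CPU-based rule only
gunicorn_workers 4 2048 300     # 2 GiB budget, about 300 MiB per worker
```

The per-worker memory figure has to come from measuring the actual application, for example with the monitoring tools from section 1.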

5.3 Node.js Application Tuning

# PM2 configuration
sudo mkdir -p /etc/pm2
sudo tee /etc/pm2/ecosystem.config.js << 'EOF'
module.exports = {
  apps: [{
    name: 'node-api',
    script: 'app.js',
    instances: 'max',  // one instance per CPU core
    exec_mode: 'cluster',  // cluster mode
    node_args: '--max-old-space-size=4096 --max-semi-space-size=128',
    env: {
      NODE_ENV: 'production',
      NODE_OPTIONS: '--max-old-space-size=4096'
    },
    env_production: {
      NODE_ENV: 'production',
      NODE_OPTIONS: '--max-old-space-size=4096'
    },
    error_file: './logs/err.log',
    out_file: './logs/out.log',
    log_file: './logs/combined.log',
    time: true,
    watch: false,
    max_memory_restart: '1G',  // restarts the process at 1 GB, well before the 4 GB heap limit above
    min_uptime: '10s',
    max_restarts: 10,
    restart_delay: 4000,
    kill_timeout: 5000,
    wait_ready: true,  // the app must call process.send('ready') once it is listening
    listen_timeout: 5000
  }]
}
EOF

# Start the application
pm2 start /etc/pm2/ecosystem.config.js --env production
pm2 save
pm2 startup

6. Tuning Containerized Environments

6.1 Docker Container Tuning

# Docker daemon configuration. JSON allows neither comments nor duplicate keys, so
# "default-ulimits" appears only once, and the deprecated overlay2.override_kernel_check
# option and keys that were left empty are omitted.
# Remove the "runtimes" block unless nvidia-container-runtime is installed.
sudo tee /etc/docker/daemon.json << 'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "storage-driver": "overlay2",
  "default-ulimits": {
    "nofile": {
      "Name": "nofile",
      "Hard": 65536,
      "Soft": 65536
    }
  },
  "default-runtime": "runc",
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "live-restore": true,
  "max-concurrent-downloads": 3,
  "max-concurrent-uploads": 5,
  "debug": false,
  "experimental": false
}
EOF

# Restart Docker
sudo systemctl restart docker

# Run a container with explicit resource limits
sudo docker run -d \
  --name optimized-app \
  --cpus="2.0" \
  --memory="2g" \
  --memory-swap="2g" \
  --ulimit nofile=65536:65536 \
  --ulimit nproc=65535:65535 \
  --restart unless-stopped \
  -p 8080:8080 \
  myapp:latest

6.2 Kubernetes Node Tuning

# Kubelet configuration
sudo tee /etc/kubernetes/kubelet-config.yaml << 'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
address: "0.0.0.0"
port: 10250
readOnlyPort: 10255
cgroupDriver: systemd
clusterDNS:
  - "10.96.0.10"
clusterDomain: "cluster.local"
resolvConf: "/etc/resolv.conf"
hairpinMode: "promiscuous-bridge"
maxPods: 110
podPidsLimit: 4096
kubeAPIQPS: 50
kubeAPIBurst: 100
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
evictionSoft:
  memory.available: "200Mi"
  nodefs.available: "15%"
  nodefs.inodesFree: "10%"
  imagefs.available: "20%"
evictionSoftGracePeriod:
  memory.available: "1m"
  nodefs.available: "1m"
  nodefs.inodesFree: "1m"
  imagefs.available: "1m"
evictionMaxPodGracePeriod: 30
evictionPressureTransitionPeriod: "5m"
kubeReserved:
  cpu: "200m"
  memory: "256Mi"
  ephemeral-storage: "1Gi"
systemReserved:
  cpu: "100m"
  memory: "128Mi"
  ephemeral-storage: "1Gi"
kubeReservedCgroup: "/kube.slice"
systemReservedCgroup: "/system.slice"
EOF

# System parameters for Kubernetes. The net.bridge.* keys only exist once
# the br_netfilter module is loaded, so load it first.
sudo modprobe br_netfilter
sudo tee -a /etc/sysctl.conf << EOF
# System parameters required by Kubernetes
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 1
vm.overcommit_memory = 1
vm.panic_on_oom = 0
vm.oom_kill_allocating_task = 1
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 1024
fs.file-max = 2097152
fs.nr_open = 2097152
EOF

sudo sysctl -p
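
The reservation and eviction settings in the kubelet configuration determine how much of a node is left for pods: allocatable equals capacity minus kubeReserved, systemReserved and the hard eviction threshold. The sketch below applies that rule to memory; `node_allocatable_mib` is a hypothetical helper.

```shell
#!/bin/bash
# node_allocatable_mib CAPACITY KUBE_RESERVED SYSTEM_RESERVED EVICTION_HARD:
# memory (MiB) left for pods, following the Kubernetes rule
#   allocatable = capacity - kubeReserved - systemReserved - evictionHard
# Hypothetical helper for illustration.
node_allocatable_mib() {
    echo $(( $1 - $2 - $3 - $4 ))
}

# A 16 GiB node with the reservations from the kubelet configuration above
# (kubeReserved 256Mi, systemReserved 128Mi, memory.available 100Mi)
node_allocatable_mib 16384 256 128 100
```

The result can be compared with the Allocatable section of `kubectl describe node`.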

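7. Performance Monitoring and Automated Tuning

7.1 Monitoring with Prometheus + Grafana

# Prometheus configuration (Prometheus itself must already be installed)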
sudo tee /etc/prometheus/prometheus.yml << 'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "rules/*.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - localhost:9093

scrape_configs:
  # One job per exporter: scraping localhost:9100 under two job names would store every sample twice
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
    scrape_interval: 10s
    scrape_timeout: 5s
      
  - job_name: 'mysql'
    static_configs:
      - targets: ['localhost:9104']
      
  - job_name: 'nginx'
    static_configs:
      - targets: ['localhost:9113']
EOF

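# Install Node Exporter (the package and service names depend on the repository
# that provides it; check with `dnf search node_exporter`)
sudo dnf install node_exporter -y
sudo systemctl enable --now node_exporter

# Install Grafana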
sudo tee /etc/yum.repos.d/grafana.repo << 'EOF'
[grafana]
name=Grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
gpgcheck=1
enabled=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF

sudo dnf install grafana -y
sudo systemctl enable --now grafana-server

7.2 Automated Tuning Script

# Create the automated tuning script
sudo tee /usr/local/bin/auto-tune.sh << 'EOF'
#!/bin/bash
# Automated performance tuning script

LOG_FILE="/var/log/performance-tuning.log"

# Timestamp each message when it is written, not once at startup
log() {
    echo "[$(date +"%Y-%m-%d %H:%M:%S")] $1" | tee -a "$LOG_FILE"
}

# Check the system load
check_load() {
    local load1 load5 load15 cpu_cores
    read -r load1 load5 load15 _ < /proc/loadavg
    cpu_cores=$(nproc)

    log "Current load: 1min=$load1, 5min=$load5, 15min=$load15, CPU cores=$cpu_cores"

    # Trigger tuning when the 1-minute load exceeds twice the core count.
    # awk does the floating-point comparison, so bc is not required.
    if awk -v l="$load1" -v c="$cpu_cores" 'BEGIN { exit !(l > c * 2) }'; then
        log "Load is too high, triggering tuning"
        return 1
    fi
    return 0
}

# Adjust kernel parameters
adjust_kernel_params() {
    log "Adjusting kernel parameters..."

    # Choose vm parameters by memory size (GiB, rounded to the nearest integer)
    local total_mem
    total_mem=$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024 + 0.5)}' /proc/meminfo)
    if [ "$total_mem" -ge 64 ]; then
        # Large-memory system
        sysctl -w vm.swappiness=5
        sysctl -w vm.vfs_cache_pressure=25
        sysctl -w vm.dirty_ratio=10
        sysctl -w vm.dirty_background_ratio=3
    elif [ "$total_mem" -ge 16 ]; then
        # Medium-memory system
        sysctl -w vm.swappiness=10
        sysctl -w vm.vfs_cache_pressure=50
        sysctl -w vm.dirty_ratio=15
        sysctl -w vm.dirty_background_ratio=5
    else
        # Small-memory system
        sysctl -w vm.swappiness=30
        sysctl -w vm.vfs_cache_pressure=100
        sysctl -w vm.dirty_ratio=20
        sysctl -w vm.dirty_background_ratio=10
    fi

    # Adjust network parameters
    sysctl -w net.core.rmem_max=134217728
    sysctl -w net.core.wmem_max=134217728
    sysctl -w net.ipv4.tcp_rmem="4096 87380 134217728"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"

    log "Kernel parameters adjusted"
}

# Adjust process limits
adjust_limits() {
    log "Adjusting process limits..."

    # Check the current limit
    local current_limit target_limit=1048576
    current_limit=$(ulimit -n)

    if [ "$current_limit" != "unlimited" ] && [ "$current_limit" -lt "$target_limit" ]; then
        # Update limits.conf once
        if ! grep -q "soft nofile $target_limit" /etc/security/limits.conf; then
            echo "* soft nofile $target_limit" >> /etc/security/limits.conf
            echo "* hard nofile $target_limit" >> /etc/security/limits.conf
            log "limits.conf updated; log in again for it to take effect"
        fi
    fi

    log "Process limits adjusted"
}

# Optimize file systems
optimize_filesystem() {
    log "Optimizing file systems..."

    # Inspect the mount options of every XFS/ext4 mount point
    local mount_point
    for mount_point in $(findmnt -rn -o TARGET -t xfs,ext4); do
        if [ "$mount_point" != "/" ] && [ "$mount_point" != "/boot" ]; then
            # Skip mount points that already use noatime
            if ! findmnt -no OPTIONS "$mount_point" | grep -qw noatime; then
                log "Mount point without noatime: $mount_point"
                # Be careful in production; the remount is left commented out as an example
                # mount -o remount,noatime,nodiratime "$mount_point"
            fi
        fi
    done

    log "File system check complete"
}

# Main function
main() {
    log "Starting automated performance tuning..."

    if check_load; then
        log "System load is normal, no urgent tuning needed"
    else
        log "System load is too high, running tuning"
        adjust_kernel_params
        adjust_limits
        optimize_filesystem
    fi

    log "Automated performance tuning finished"
}

# Run the main function
main
EOF

sudo chmod +x /usr/local/bin/auto-tune.sh

# Schedule the script (check every 5 minutes)
sudo tee /etc/cron.d/performance-tuning << 'EOF'
*/5 * * * * root /usr/local/bin/auto-tune.sh
EOF

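8. Performance Tuning Best Practices

8.1 Tuning Principles

Measure first: establish a performance baseline before tuning and verify the effect afterwards
Change gradually: adjust one parameter at a time and observe the result
Understand the workload: choose a tuning strategy that fits the application type (web, database, compute and so on)
Monitor continuously: keep monitoring in place so that performance problems are noticed early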

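8.2 Troubleshooting Common Performance Problems

# High CPU usage
sudo perf top  # live view of where CPU time is spent
sudo pidstat -u 1 5  # per-process CPU usage

# Abnormal memory usage
sudo smem -s swap  # memory and swap usage per process
sudo cat /proc/meminfo  # detailed memory information

# Disk I/O bottlenecks
sudo iotop -o  # live I/O monitoring
sudo iostat -x 1 5  # detailed disk statistics

# Network problems
sudo ss -s  # socket summary
sudo netstat -s  # protocol statistics
sudo tcpdump -i any -c 1000  # packet capture for analysis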

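8.3 Performance Tuning Checklist

[ ] System monitoring tools installed and configured
[ ] Kernel parameters adjusted for the hardware and workload
[ ] File system mount options optimized
[ ] Application server configuration tuned
[ ] Database parameters optimized
[ ] Network parameters adjusted
[ ] Monitoring system deployed
[ ] Automated tuning script deployed
[ ] Performance baseline recorded
[ ] Tuning documentation updated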

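9. Summary

Tuning AlmaLinux is a system-wide effort that spans kernel parameters, storage, network configuration and the application layer. With the steps in this guide you can:

Establish a performance baseline: use monitoring tools to understand the current state of the system
Tune with a target: choose a strategy that fits the type of workload
Keep optimizing: use automation and monitoring to improve performance over time
Prevent problems: find and remove potential bottlenecks early

Performance tuning is not a one-time task but an ongoing process. As the business grows and hardware is upgraded, the tuning strategy needs to be re-evaluated and adjusted. A full performance review once per quarter is a reasonable cadence for keeping the system in good shape.

A final reminder: before tuning anything in production, validate the change in a test environment and prepare a rollback. The goal of tuning is to balance performance, stability and resource utilization, not to push a single metric to its extreme.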
