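Introduction
AlmaLinux, an open-source alternative to RHEL, plays an increasingly important role in enterprise server environments. Installing the system, however, does not by itself guarantee good performance. This article walks through AlmaLinux performance tuning from low-level system tuning to application-layer practice, with tuning strategies and hands-on techniques for each layer.
1. System Baseline Performance Assessment
1.1 Overview of Performance Monitoring Tools
Before tuning anything, establish a performance baseline. AlmaLinux offers a solid set of monitoring tools: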
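# Install common performance monitoring tools
# (htop is typically provided by the EPEL repository; enable EPEL first if the package is not found)
sudo dnf install sysstat htop iotop perf
# Enable sysstat to collect system performance data
sudo systemctl enable --now sysstat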
1.2 Key Performance Metrics
- CPU usage: top, htop, mpstat
- Memory usage: free -h, vmstat
- Disk I/O: iostat, iotop
- Network traffic: iftop, nload
- Per-process monitoring: pidstat
1.3 Establishing a Performance Baseline
# Generate a system performance report
sudo sar -A > system_baseline_$(date +%Y%m%d).txt
# Sample CPU usage continuously (every 5 seconds, 10 samples)
mpstat -P ALL 5 10
# Watch the memory usage trend
vmstat 1 10
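A baseline is only useful if later measurements can be compared against it. As a minimal sketch, the Python snippet below derives a memory-usage percentage from `/proc/meminfo`-style text. The function names and the sample values are illustrative assumptions, not part of sysstat.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_used_percent(meminfo):
    """Used memory as a percentage, based on MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return (total - available) * 100 / total


# Hypothetical sample; on a real host read /proc/meminfo instead.
SAMPLE = """MemTotal:       16000000 kB
MemFree:         2000000 kB
MemAvailable:    4000000 kB
"""

if __name__ == "__main__":
    print(memory_used_percent(parse_meminfo(SAMPLE)))
```

Storing this figure together with the `sar` report gives a single number to compare after each tuning change.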
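2. Kernel-Level Performance Tuning
2.1 Kernel Parameter Tuning
2.1.1 Virtual Memory Management
# Show the current virtual memory parameters
sysctl vm.swappiness
sysctl vm.vfs_cache_pressure
# Tune virtual memory settings (suited to database servers)
sudo tee /etc/sysctl.d/99-vm-optimization.conf << EOF
# Lower the tendency to swap so that physical memory is preferred
vm.swappiness = 10
# Control how aggressively the kernel reclaims dentry and inode caches
vm.vfs_cache_pressure = 50
# Raise the maximum number of memory map areas per process
vm.max_map_count = 262144
# Memory overcommit policy: 1 always allows overcommit.
# vm.overcommit_ratio only takes effect when vm.overcommit_memory = 2.
vm.overcommit_memory = 1
vm.overcommit_ratio = 80
EOF
# Apply the configuration
sudo sysctl -p /etc/sysctl.d/99-vm-optimization.conf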
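After applying a drop-in file it helps to confirm which keys actually differ from the running values. The following is a small sketch, assuming the desired settings come from the file text and the current values are collected separately (for example from `sysctl -a`). Both helper names are hypothetical.

```python
def parse_sysctl_conf(text):
    """Parse sysctl.d-style text into a {key: value} dict of strings."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def diff_settings(desired, current):
    """Return {key: (current_value, desired_value)} for keys that differ."""
    return {
        key: (current.get(key), value)
        for key, value in desired.items()
        if current.get(key) != value
    }


DESIRED_TEXT = """# example drop-in
vm.swappiness = 10
vm.vfs_cache_pressure = 50
"""

if __name__ == "__main__":
    # Hypothetical running values.
    running = {"vm.swappiness": "60", "vm.vfs_cache_pressure": "50"}
    print(diff_settings(parse_sysctl_conf(DESIRED_TEXT), running))
```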
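2.1.2 File System Tuning
# Show the current file system mount options
mount | grep -E "(ext4|xfs)"
# Tune an ext4 file system (suited to web servers)
# data=writeback journals metadata only: faster, but recently written data can be lost after a crash
sudo tune2fs -o journal_data_writeback /dev/sda1
# Inspect and label an XFS file system (suited to database servers)
# -u prints the UUID and -L sets the label; neither changes performance
sudo xfs_admin -u /dev/sda1
sudo xfs_admin -L "database" /dev/sda1
# Append tuned mount options to /etc/fstab
# (tee -a appends; plain tee would overwrite the whole file)
# noatime already implies nodiratime
sudo tee -a /etc/fstab << EOF
/dev/sda1 /data ext4 defaults,noatime,nodiratime,data=writeback 0 2
/dev/sdb1 /var/lib/mysql xfs defaults,noatime,nodiratime,logbufs=8,logbsize=256k 0 2
EOF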
2.2 I/O Scheduler Tuning
# Show the current I/O scheduler (the active one is shown in brackets)
cat /sys/block/sda/queue/scheduler
# Pick a scheduler that fits each device
# SSDs: none or mq-deadline
echo none | sudo tee /sys/block/sda/queue/scheduler
# Rotational disks: mq-deadline (the name of the deadline scheduler on multi-queue kernels)
echo mq-deadline | sudo tee /sys/block/sdb/queue/scheduler
# Persist the setting across reboots
sudo tee /etc/udev/rules.d/60-io-scheduler.rules << EOF
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="mq-deadline"
EOF
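The scheduler file lists every available scheduler and marks the active one with brackets. As a rough sketch, the helpers below parse that format and apply the same policy as the udev rules above: no scheduler for non-rotational devices and a deadline scheduler for rotational ones, listed as `mq-deadline` on multi-queue kernels. The helper names are assumptions for illustration.

```python
import re


def active_scheduler(text):
    """Return the scheduler shown in brackets, or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def available_schedulers(text):
    """Return all scheduler names listed in the file."""
    return [token.strip("[]") for token in text.split()]


def recommended_scheduler(rotational, available):
    """Pick 'mq-deadline' for rotational disks and 'none' otherwise."""
    preferred = "mq-deadline" if rotational else "none"
    return preferred if preferred in available else None


if __name__ == "__main__":
    # Hypothetical content of /sys/block/sda/queue/scheduler.
    sample = "[mq-deadline] kyber bfq none\n"
    print(active_scheduler(sample), available_schedulers(sample))
```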
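2.3 Network Stack Tuning
# Network parameter tuning (suited to high-concurrency web servers)
sudo tee /etc/sysctl.d/99-network-optimization.conf << EOF
# Enlarge the TCP connection queues
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
# Tune TCP buffers (min, default, max in bytes)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# TCP congestion control algorithm (requires the tcp_bbr module; commonly paired with the fq qdisc)
net.ipv4.tcp_congestion_control = bbr
# Reuse TIME_WAIT sockets for new outbound connections
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
# Connection tracking (these keys exist only while the nf_conntrack module is loaded)
net.netfilter.nf_conntrack_max = 2000000
net.netfilter.nf_conntrack_tcp_timeout_established = 7200
EOF
# Apply the configuration
sudo sysctl -p /etc/sysctl.d/99-network-optimization.conf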
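One way to check that a 16 MiB maximum buffer is adequate is to compare it with the bandwidth-delay product (BDP) of the path, the amount of data that can be in flight at once. This is a worked sketch under assumed link speeds and round-trip times; the numbers are examples, not measurements.

```python
def bdp_bytes(bandwidth_bits_per_sec, rtt_ms):
    """Bandwidth-delay product in bytes (integer arithmetic)."""
    return bandwidth_bits_per_sec * rtt_ms // 8000


def buffer_covers_path(buffer_bytes, bandwidth_bits_per_sec, rtt_ms):
    """True if a single connection's buffer can hold one full BDP."""
    return buffer_bytes >= bdp_bytes(bandwidth_bits_per_sec, rtt_ms)


MAX_BUFFER = 16777216  # value used for rmem_max / wmem_max above

if __name__ == "__main__":
    # Assumed paths: 1 Gbit/s and 10 Gbit/s at 100 ms RTT.
    print(bdp_bytes(1_000_000_000, 100))
    print(buffer_covers_path(MAX_BUFFER, 1_000_000_000, 100))
    print(buffer_covers_path(MAX_BUFFER, 10_000_000_000, 100))
```

On a low-latency LAN the BDP is far smaller, so the maximum mainly matters for long-distance or high-bandwidth paths.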
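3. Storage System Tuning
3.1 Disk I/O Tuning
3.1.1 RAID Configuration
# Check RAID status
cat /proc/mdstat
# Create a RAID 10 array (suited to databases) with a 512 KiB chunk size
sudo mdadm --create /dev/md0 --level=10 --chunk=512 --raid-devices=4 /dev/sd[b-e]
# Add an internal write-intent bitmap to speed up resync after an unclean shutdown
sudo mdadm --grow /dev/md0 --bitmap=internal
# Persist the RAID configuration (the append must also run as root, hence tee)
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf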
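3.1.2 LVM Tuning
# Create the LVM layout
sudo pvcreate /dev/sdb
sudo vgcreate -s 16M vg_data /dev/sdb
sudo lvcreate -L 100G -n lv_mysql vg_data
# Tune LVM parameters
# Note: --discards and --zero are thin pool settings, so they are only meaningful when the target LV is a thin pool
sudo lvchange --discards passdown /dev/vg_data/lv_mysql
sudo lvchange --zero y /dev/vg_data/lv_mysql
# Add an LVM cache (suited to frequently accessed small files)
# Create a cache pool first, then attach it to the origin LV.
# The cache only helps when the cache pool sits on a faster PV (for example an SSD) than the origin.
sudo lvcreate --type cache-pool --size 1G --name lv_cache vg_data /dev/sdb
sudo lvconvert --type cache --cachepool vg_data/lv_cache vg_data/lv_mysql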
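3.2 File System Tuning in Practice
3.2.1 XFS Tuning
# Create an XFS file system with tuned parameters
# (su = stripe unit, sw = number of data disks; match both to the underlying RAID geometry)
sudo mkfs.xfs -f -d su=128k,sw=4 -l size=128m /dev/vg_data/lv_mysql
# Show the UUID and set a label
sudo xfs_admin -u /dev/vg_data/lv_mysql
sudo xfs_admin -L "database" /dev/vg_data/lv_mysql
# Mount with tuned options
sudo mount -o noatime,nodiratime,logbufs=8,logbsize=256k /dev/vg_data/lv_mysql /var/lib/mysql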
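The `su` and `sw` values passed to `mkfs.xfs` should follow the array underneath: `su` is the RAID chunk size and `sw` is the number of disks holding distinct data in one stripe. The sketch below derives them for a few common RAID levels. It assumes RAID 10 keeps two copies of each block, and the helper names are illustrative.

```python
def data_disks(raid_level, total_disks, copies=2):
    """Number of disks that hold distinct data in one stripe."""
    if raid_level == "raid0":
        return total_disks
    if raid_level == "raid5":
        return total_disks - 1
    if raid_level == "raid6":
        return total_disks - 2
    if raid_level == "raid10":
        # Assumption: each block is stored `copies` times.
        return total_disks // copies
    raise ValueError(f"unsupported RAID level: {raid_level}")


def xfs_stripe_options(chunk_kib, raid_level, total_disks):
    """Build the value for mkfs.xfs -d from the array geometry."""
    return f"su={chunk_kib}k,sw={data_disks(raid_level, total_disks)}"


if __name__ == "__main__":
    print(xfs_stripe_options(128, "raid0", 4))
    print(xfs_stripe_options(512, "raid10", 4))
```

For a four-disk RAID 10 array with a 512 KiB chunk, this yields `sw=2` rather than `sw=4`.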
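3.2.2 ext4 Tuning
# Create a tuned ext4 file system (inode tables and journal are initialized up front instead of lazily)
# The journal is kept because the data=writeback mode used below requires one
sudo mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/vg_data/lv_web
# Adjust ext4 parameters
sudo tune2fs -o journal_data_writeback /dev/vg_data/lv_web
# Disable time-based and mount-count-based fsck
sudo tune2fs -i 0 -c 0 /dev/vg_data/lv_web
# Mount with tuned options
sudo mount -o noatime,nodiratime,data=writeback /dev/vg_data/lv_web /var/www/html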
4. Application-Layer Performance Tuning
4.1 Web Server Tuning (Nginx)
4.1.1 Nginx Configuration
# /etc/nginx/nginx.conf tuned configuration
user nginx;
worker_processes auto; # one worker per CPU core
worker_rlimit_nofile 65535;
events {
worker_connections 65535;
use epoll; # high-performance event model on Linux
multi_accept on;
}
http {
# Basic tuning
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
keepalive_requests 1000;
# Buffer tuning
client_body_buffer_size 128k;
client_max_body_size 10m;
client_header_buffer_size 1k;
large_client_header_buffers 4 8k;
# Gzip compression
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_proxied any;
gzip_comp_level 6;
gzip_types
text/plain
text/css
text/xml
text/javascript
application/javascript
application/rss+xml
application/json;
# Open file cache
open_file_cache max=10000 inactive=30s;
open_file_cache_valid 60s;
open_file_cache_min_uses 2;
open_file_cache_errors on;
# Logging
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'rt=$request_time uct="$upstream_connect_time" '
'uht="$upstream_header_time" urt="$upstream_response_time"';
access_log /var/log/nginx/access.log main buffer=64k flush=5m;
# Virtual host configuration
include /etc/nginx/conf.d/*.conf;
}
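The settings `worker_processes`, `worker_connections` and `worker_rlimit_nofile` together bound how many clients Nginx can serve. As a back-of-the-envelope sketch: `worker_connections` counts all connections of a worker, including those to upstream servers, so a reverse proxy uses roughly two per client. The function names and the four-worker example are assumptions.

```python
def max_clients(worker_processes, worker_connections, reverse_proxy=False):
    """Rough upper bound on simultaneous clients."""
    per_client = 2 if reverse_proxy else 1
    return worker_processes * worker_connections // per_client


def fd_limit_sufficient(worker_connections, worker_rlimit_nofile):
    """Each connection needs at least one file descriptor per worker."""
    return worker_rlimit_nofile >= worker_connections


if __name__ == "__main__":
    # Hypothetical host with 4 cores (worker_processes auto -> 4).
    print(max_clients(4, 65535))
    print(max_clients(4, 65535, reverse_proxy=True))
    print(fd_limit_sufficient(65535, 65535))
```

Serving static files or proxying opens further descriptors per connection, so in practice `worker_rlimit_nofile` is often set higher than `worker_connections`.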
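4.1.2 Nginx Process Management
# Adjust the Nginx service through a systemd drop-in
# (the quoted 'EOF' stops the shell from expanding $MAINPID)
sudo mkdir -p /etc/systemd/system/nginx.service.d
sudo tee /etc/systemd/system/nginx.service.d/override.conf << 'EOF'
[Service]
LimitNOFILE=65535
LimitNPROC=65535
ExecStartPre=/usr/sbin/nginx -t
# An empty ExecStart= clears the packaged command; such a service may define only one ExecStart
ExecStart=
ExecStart=/usr/sbin/nginx
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGQUIT
TimeoutStopSec=5
PrivateTmp=true
EOF
# Reload the systemd configuration
sudo systemctl daemon-reload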
4.2 Database Tuning (MySQL/MariaDB)
4.2.1 MySQL Configuration
# /etc/my.cnf.d/server.cnf tuned configuration
[mysqld]
# Basic settings
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mariadb/mariadb.log
pid-file=/run/mariadb/mariadb.pid
# Memory
innodb_buffer_pool_size = 4G # typically 50-70% of system memory
innodb_buffer_pool_instances = 8 # match to the number of CPU cores
innodb_log_file_size = 512M
innodb_log_buffer_size = 64M
# I/O
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 2 # trades durability for speed: about one second of transactions can be lost on an OS crash
innodb_file_per_table = 1
innodb_io_capacity = 2000 # 2000-4000 for SSDs
innodb_io_capacity_max = 4000
# Connections
max_connections = 500
thread_cache_size = 50
table_open_cache = 2000
table_definition_cache = 1400
# Query cache (removed in MySQL 8.0, where these settings prevent startup; still available in MariaDB)
query_cache_type = 1
query_cache_size = 128M
query_cache_limit = 2M
# Logging
slow_query_log = 1
slow_query_log_file = /var/log/mariadb/slow.log
long_query_time = 2
log_queries_not_using_indexes = 1
# Replication (if primary/replica replication is used)
server_id = 1
log_bin = /var/log/mariadb/mariadb-bin
binlog_format = ROW
expire_logs_days = 7
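The 50-70% guideline for `innodb_buffer_pool_size` can be turned into a concrete number. The sketch below assumes the default 128 MiB buffer pool chunk size and rounds down to a multiple of chunk size times instance count, which is the granularity MySQL uses. The function name and percentages are illustrative.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def buffer_pool_bytes(total_ram_bytes, percent, instances, chunk_bytes=128 * MIB):
    """Buffer pool size rounded down to a multiple of chunk * instances."""
    target = total_ram_bytes * percent // 100
    unit = chunk_bytes * instances
    return target // unit * unit


if __name__ == "__main__":
    # Assumed hosts with 8 GiB and 16 GiB of RAM, 8 buffer pool instances.
    print(buffer_pool_bytes(8 * GIB, 50, 8) // GIB)
    print(buffer_pool_bytes(16 * GIB, 60, 8) // GIB)
```

With 8 instances the unit is 1 GiB, so 60% of 16 GiB (9.6 GiB) rounds down to 9 GiB.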
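4.2.2 MySQL Performance Monitoring and Tuning
# Install MySQL performance analysis tools (percona-toolkit usually comes from the Percona or EPEL repository)
sudo dnf install percona-toolkit
# Analyze the slow query log
sudo pt-query-digest /var/log/mariadb/slow.log > slow_query_report.txt
# Summarize MySQL status and configuration (prompts for the password)
sudo pt-mysql-summary --user=root --ask-pass
# Capture InnoDB status
mysql -e "SHOW ENGINE INNODB STATUS\G" > innodb_status.txt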
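4.3 Application Server Tuning (Java/Python)
4.3.1 Java Application Tuning
# JVM options (suited to Spring Boot applications)
# -Xms4G -Xmx4G                 fixed heap size, avoids dynamic resizing
# -XX:+UseG1GC                  use the G1 garbage collector
# -XX:MaxGCPauseMillis=200      target maximum GC pause
# -XX:+AlwaysPreTouch           touch heap pages at startup
# -XX:+UseStringDeduplication   deduplicate strings
# -XX:MaxMetaspaceSize=256m     cap the metaspace
# -XX:+UseCGroupMemoryLimitForHeap is omitted: it was removed in JDK 11, and container support is on by default since JDK 10
# Comments cannot live inside the quoted string, so the options are kept on one line
JAVA_OPTS="-Xms4G -Xmx4G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:+AlwaysPreTouch -XX:+UseStringDeduplication -XX:MaxMetaspaceSize=256m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/java/heapdump.hprof"
# Manage the Java application with systemd
# (the shell expands $JAVA_OPTS while writing the file; \$JAVA_OPTS is left for systemd to expand)
sudo tee /etc/systemd/system/myapp.service << EOF
[Unit]
Description=My Java Application
After=network.target
[Service]
Type=simple
User=myapp
WorkingDirectory=/opt/myapp
Environment="JAVA_OPTS=$JAVA_OPTS"
ExecStart=/usr/bin/java \$JAVA_OPTS -jar /opt/myapp/app.jar
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=myapp
[Install]
WantedBy=multi-user.target
EOF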
4.3.2 Python Application Tuning
# Gunicorn configuration (suited to Django/Flask applications)
# gunicorn_config.py
import multiprocessing
# Basic settings
bind = "0.0.0.0:8000"
workers = multiprocessing.cpu_count() * 2 + 1 # commonly recommended formula
worker_class = "gevent" # asynchronous worker (requires the gevent package)
worker_connections = 1000 # maximum simultaneous connections per worker
timeout = 30
keepalive = 2
# Performance
preload_app = True # load the application before forking workers
max_requests = 1000 # restart each worker after 1000 requests
max_requests_jitter = 50 # random jitter so workers do not restart together
# Request size limits
limit_request_line = 4096
limit_request_fields = 100
limit_request_field_size = 8190
# Logging
accesslog = "/var/log/gunicorn/access.log"
errorlog = "/var/log/gunicorn/error.log"
loglevel = "info"
# Process management
daemon = False
pidfile = "/var/run/gunicorn.pid"
umask = 0o007
user = "myapp"
group = "myapp"
# Monitoring (statsd_host takes host:port; Gunicorn has no separate statsd_port setting)
statsd_host = "localhost:8125"
statsd_prefix = "gunicorn"
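To see what the worker formula means for capacity, the sketch below computes the worker count and the theoretical connection ceiling with an asynchronous worker class. The core count is passed in explicitly so the result is reproducible. The function names are hypothetical, and real throughput depends on the application.

```python
def worker_count(cpu_cores):
    """Worker formula used in the config above: 2 * cores + 1."""
    return cpu_cores * 2 + 1


def connection_ceiling(workers, worker_connections):
    """Upper bound on simultaneous connections across async workers."""
    return workers * worker_connections


if __name__ == "__main__":
    # Assumed 4-core host with worker_connections = 1000.
    workers = worker_count(4)
    print(workers, connection_ceiling(workers, 1000))
```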
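5. Containerized Application Tuning
5.1 Docker Container Tuning
5.1.1 Docker Daemon Tuning
# /etc/docker/daemon.json tuned configuration
# (the overlay2.override_kernel_check storage option is omitted: it is deprecated and no longer supported by recent Docker releases)
{
"data-root": "/var/lib/docker",
"storage-driver": "overlay2",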
"log-driver": "json-file",
"log-opts": {
"max-size": "10m",
"max-file": "3"
},
"default-ulimits": {
"nofile": {
"Name": "nofile",
"Hard": 65535,
"Soft": 65535
}
},
"exec-opts": ["native.cgroupdriver=systemd"],
"live-restore": true,
"max-concurrent-downloads": 10,
"max-concurrent-uploads": 5,
"registry-mirrors": ["https://mirror.gcr.io"]
}
# Restart the Docker service
sudo systemctl restart docker
5.1.2 Container Resource Limits
# docker-compose.yml tuned configuration
version: '3.8'
services:
  web:
    image: nginx:alpine
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
5.2 Kubernetes Tuning
5.2.1 Node Configuration
# kubelet configuration
# Note: enforcing "system-reserved" and "kube-reserved" also requires systemReservedCgroup and kubeReservedCgroup to be set; otherwise enforce only "pods"
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubelet-config
  namespace: kube-system
data:
  kubelet.config: |
    {
      "kind": "KubeletConfiguration",
      "apiVersion": "kubelet.config.k8s.io/v1beta1",
      "address": "0.0.0.0",
      "port": 10250,
      "readOnlyPort": 0,
      "cgroupDriver": "systemd",
      "clusterDNS": ["10.96.0.10"],
      "clusterDomain": "cluster.local",
      "resolvConf": "/etc/resolv.conf",
      "maxPods": 110,
      "podsPerCore": 10,
      "kubeAPIQPS": 50,
      "kubeAPIBurst": 100,
      "evictionHard": {
        "memory.available": "100Mi",
        "nodefs.available": "10%",
        "nodefs.inodesFree": "5%",
        "imagefs.available": "15%",
        "imagefs.inodesFree": "5%"
      },
      "evictionSoft": {
        "memory.available": "200Mi",
        "nodefs.available": "15%",
        "nodefs.inodesFree": "10%",
        "imagefs.available": "20%",
        "imagefs.inodesFree": "10%"
      },
      "evictionSoftGracePeriod": {
        "memory.available": "2m",
        "nodefs.available": "2m",
        "nodefs.inodesFree": "2m",
        "imagefs.available": "2m",
        "imagefs.inodesFree": "2m"
      },
      "evictionMaxPodGracePeriod": 120,
      "evictionPressureTransitionPeriod": "5m",
      "kubeReserved": {
        "cpu": "200m",
        "memory": "256Mi"
      },
      "systemReserved": {
        "cpu": "100m",
        "memory": "128Mi"
      },
      "enforceNodeAllocatable": ["pods", "system-reserved", "kube-reserved"],
      "featureGates": {
        "RotateKubeletServerCertificate": true
      }
    }
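The reservations and the hard eviction threshold determine how much of a node is left for pods: allocatable = capacity - kubeReserved - systemReserved - hard eviction threshold. The following worked sketch uses the values from the configuration above. The 16 GiB, 4-core node is an assumed example.

```python
def allocatable_memory_mib(capacity_mib, kube_reserved_mib,
                           system_reserved_mib, eviction_hard_mib):
    """Memory left for pods after reservations and the eviction threshold."""
    return capacity_mib - kube_reserved_mib - system_reserved_mib - eviction_hard_mib


def allocatable_cpu_millicores(capacity_millicores, kube_reserved, system_reserved):
    """CPU left for pods; eviction thresholds do not apply to CPU."""
    return capacity_millicores - kube_reserved - system_reserved


if __name__ == "__main__":
    # Assumed node: 16 GiB of memory and 4 cores.
    print(allocatable_memory_mib(16 * 1024, 256, 128, 100))
    print(allocatable_cpu_millicores(4000, 200, 100))
```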
5.2.2 Pod Resource Tuning
# Tuned Deployment configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:alpine
        resources:
          requests:
            cpu: "250m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "echo 'Container started'"]
          preStop:
            exec:
              command: ["/bin/sh", "-c", "sleep 10"]
      nodeSelector:
        node-type: web
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - web
              topologyKey: kubernetes.io/hostname
6. Monitoring and Alerting
6.1 Prometheus + Grafana Monitoring Stack
6.1.1 Prometheus Configuration
# prometheus.yml tuned configuration
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    cluster: 'alma-cluster'
    environment: 'production'
rule_files:
  - "rules/*.yml"
scrape_configs:
  - job_name: 'alma-linux'
    static_configs:
      - targets: ['localhost:9100']
    scrape_interval: 30s
    scrape_timeout: 10s
    metrics_path: /metrics
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        regex: '([^:]+)(?::\d+)?'
        replacement: '${1}'
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node1:9100', 'node2:9100', 'node3:9100']
    scrape_interval: 15s
    scrape_timeout: 5s
    metrics_path: /metrics
  - job_name: 'mysql'
    static_configs:
      - targets: ['mysql-server:9104']
    scrape_interval: 30s
    scrape_timeout: 10s
    metrics_path: /metrics
  - job_name: 'nginx'
    static_configs:
      - targets: ['nginx-server:9113']
    scrape_interval: 30s
    scrape_timeout: 10s
    metrics_path: /metrics
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093
# Remote storage (for long-term retention)
remote_write:
  - url: "http://remote-storage:9201/api/v1/write"
    queue_config:
      capacity: 10000
      max_samples_per_send: 1000
      batch_send_deadline: 5s
      max_shards: 200
      min_shards: 1
      max_backoff: 100ms
      min_backoff: 50ms
      retry_on_http_429: true
6.1.2 Grafana Dashboard
{
"dashboard": {
"title": "AlmaLinux Performance Dashboard",
"panels": [
{
"title": "CPU Usage",
"type": "graph",
"targets": [
{
"expr": "100 - (avg by (instance) (rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
"legendFormat": "{{instance}}"
}
],
"thresholds": [
{
"value": 80,
"color": "red",
"op": "gt"
}
],
"alert": {
"conditions": [
{
"evaluator": {
"params": [80],
"type": "gt"
},
"operator": {
"type": "and"
},
"query": {
"params": ["A", "5m", "now"]
},
"reducer": {
"params": [],
"type": "avg"
},
"type": "query"
}
],
"executionErrorState": "alerting",
"frequency": "1m",
"handler": 1,
"name": "High CPU Usage",
"noDataState": "no_data",
"notifications": []
}
}
]
}
}
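6.2 Log Tuning
6.2.1 Log Collection
# Install and configure Fluentd as the log collector
# (Fluentd is usually installed from the vendor's packages and its plugins with fluent-gem; the package names below may not exist in your repositories)
sudo dnf install fluentd fluent-plugin-elasticsearch
# Configure Fluentd (the quoted 'EOF' keeps the shell from touching the regular expressions)
sudo tee /etc/fluentd/fluent.conf << 'EOF'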
<source>
@type tail
path /var/log/nginx/access.log
pos_file /var/log/fluentd/nginx.access.pos
tag nginx.access
format none
</source>
<source>
@type tail
path /var/log/mariadb/mariadb.log
pos_file /var/log/fluentd/mariadb.pos
tag mariadb
format multiline
format_firstline /^\d{4}-\d{2}-\d{2}/
format1 /^(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?<level>\w+) (?<message>.*)/
time_format %Y-%m-%d %H:%M:%S
</source>
<filter nginx.access>
@type parser
key_name message
reserve_data true
<parse>
@type regexp
expression /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?: (?<request_time>[^ ]*))?)?$/
time_format %d/%b/%Y:%H:%M:%S %z
</parse>
</filter>
<match **>
@type elasticsearch
host elasticsearch
port 9200
logstash_format true
logstash_prefix fluentd
logstash_dateformat %Y%m%d
include_tag_key true
tag_key @log_name
flush_interval 1s
request_timeout 30s
reload_connections true
reconnect_on_error true
reload_on_failure true
sniffer_class_name Fluent::Plugin::ElasticsearchSimpleSniffer
enable_ilm true
ilm_policy_id fluentd-policy
ilm_policy_overwrite true
</match>
EOF
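7. Case Study: Tuning a High-Concurrency Web Application
7.1 Scenario
Assume an e-commerce site running on AlmaLinux with the following profile:
- Daily page views (PV): 5 million
- Peak QPS: 1000
- Database: MySQL 8.0
- Web server: Nginx + PHP-FPM
- Cache: Redis cluster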
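These figures can be sanity-checked with simple arithmetic: 5 million page views spread evenly over a day is a much lower rate than the stated peak, and the peak is what the system has to be sized for. A quick sketch follows; it treats one page view as one request, which is a simplifying assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(daily_page_views):
    """Average page views per second over a full day."""
    return daily_page_views / SECONDS_PER_DAY


def peak_to_average(peak_qps, daily_page_views):
    """How many times higher the peak is than the daily average."""
    return peak_qps / average_rate(daily_page_views)


if __name__ == "__main__":
    print(round(average_rate(5_000_000), 1))
    print(round(peak_to_average(1000, 5_000_000), 1))
```

The peak is roughly 17 times the daily average, so the tuning that follows targets the peak.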
7.2 Tuning Steps
7.2.1 System Layer
# 1. Kernel parameters
sudo tee /etc/sysctl.d/99-web-optimization.conf << EOF
# Network
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
# Memory
vm.swappiness = 10
vm.vfs_cache_pressure = 50
vm.max_map_count = 262144
# File descriptors
fs.file-max = 2097152
fs.nr_open = 2097152
EOF
sudo sysctl -p /etc/sysctl.d/99-web-optimization.conf
# 2. Per-user resource limits
sudo tee /etc/security/limits.d/99-web-limits.conf << EOF
* soft nofile 65535
* hard nofile 65535
* soft nproc 65535
* hard nproc 65535
webuser soft nofile 65535
webuser hard nofile 65535
EOF
7.2.2 Nginx Tuning
# /etc/nginx/nginx.conf
user webuser;
worker_processes auto;
worker_rlimit_nofile 65535;
events {
worker_connections 65535;
use epoll;
multi_accept on;
}
http {
# Basic tuning
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
keepalive_requests 10000;
# Buffer tuning
client_body_buffer_size 128k;
client_max_body_size 10m;
client_header_buffer_size 1k;
large_client_header_buffers 4 8k;
# Gzip compression
gzip on;
gzip_vary on;
gzip_min_length 1024;
gzip_proxied any;
gzip_comp_level 6;
gzip_types
text/plain
text/css
text/xml
text/javascript
application/javascript
application/rss+xml
application/json;
# Open file cache
open_file_cache max=10000 inactive=30s;
open_file_cache_valid 60s;
open_file_cache_min_uses 2;
open_file_cache_errors on;
# Logging
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for" '
'rt=$request_time uct="$upstream_connect_time" '
'uht="$upstream_header_time" urt="$upstream_response_time"';
access_log /var/log/nginx/access.log main buffer=64k flush=5m;
# Virtual host configuration
include /etc/nginx/conf.d/*.conf;
}
7.2.3 PHP-FPM Tuning
; /etc/php-fpm.d/www.conf
[www]
user = webuser
group = webuser
; Process management
pm = dynamic
pm.max_children = 200
pm.start_servers = 50
pm.min_spare_servers = 30
pm.max_spare_servers = 100
pm.max_requests = 1000
; Memory limits
php_admin_value[memory_limit] = 256M
php_admin_value[post_max_size] = 10M
php_admin_value[upload_max_filesize] = 10M
; Performance
php_admin_value[max_execution_time] = 30
php_admin_value[max_input_time] = 60
php_admin_value[realpath_cache_size] = 4096K
php_admin_value[realpath_cache_ttl] = 600
; Error log
php_admin_value[error_log] = /var/log/php-fpm/error.log
php_admin_value[log_errors] = on
; Status and health checks (pm.process_idle_timeout only applies when pm = ondemand)
pm.process_idle_timeout = 10s
pm.status_path = /status
ping.path = /ping
ping.response = pong
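A common way to choose `pm.max_children` is to divide the memory set aside for PHP by the average size of one worker process. The sketch below also shows the worst case if every child reached `memory_limit`. The 8 GiB budget and 40 MB average are assumed example figures; measure the real values on your own host.

```python
def max_children(memory_for_php_mb, avg_process_mb):
    """Children that fit in the memory budget at the average process size."""
    return memory_for_php_mb // avg_process_mb


def worst_case_memory_mb(children, memory_limit_mb):
    """Memory needed if every child hit memory_limit at the same time."""
    return children * memory_limit_mb


if __name__ == "__main__":
    # Assumed: 8192 MB reserved for PHP-FPM, 40 MB average per worker.
    print(max_children(8192, 40))
    # Values from the pool above: 200 children, memory_limit 256M.
    print(worst_case_memory_mb(200, 256))
```

The worst case for the pool above is 51200 MB (50 GiB), so `memory_limit` acts as a per-request cap and should not be read as the expected footprint.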
7.2.4 MySQL Tuning
# /etc/my.cnf.d/server.cnf
[mysqld]
# Basic settings
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mariadb/mariadb.log
pid-file=/run/mariadb/mariadb.pid
# Memory
innodb_buffer_pool_size = 8G # 80% of 10GB RAM
innodb_buffer_pool_instances = 8
innodb_log_file_size = 2G
innodb_log_buffer_size = 256M
# I/O
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 2
innodb_file_per_table = 1
innodb_io_capacity = 4000
innodb_io_capacity_max = 8000
# Connections
max_connections = 500
thread_cache_size = 100
table_open_cache = 4000
table_definition_cache = 2000
# Query tuning
# query_cache_type is omitted: the query cache was removed in MySQL 8.0 and the variable is rejected at startup
join_buffer_size = 256K
sort_buffer_size = 256K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
# Logging
slow_query_log = 1
slow_query_log_file = /var/log/mariadb/slow.log
long_query_time = 1
log_queries_not_using_indexes = 1
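# Replication (if primary/replica replication is used)
server_id = 1
log_bin = /var/log/mariadb/mariadb-bin
binlog_format = ROW
binlog_expire_logs_seconds = 604800 # 7 days; replaces the deprecated expire_logs_days in MySQL 8.0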
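7.2.5 Redis Tuning
# /etc/redis.conf tuned configuration
# Binding to 0.0.0.0 exposes Redis on every interface; restrict access with a firewall and authentication
bind 0.0.0.0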
port 6379
tcp-backlog 511
timeout 0
tcp-keepalive 300
# Memory
maxmemory 4gb
maxmemory-policy allkeys-lru
maxmemory-samples 5
# RDB persistence
save 900 1
save 300 10
save 60 10000
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir /var/lib/redis
# AOF persistence
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# Performance
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-size -2
list-compress-depth 0
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
7.2.6 Monitoring and Alerting
# Install monitoring components
# (these packages may require additional repositories, and package names can differ)
sudo dnf install node_exporter prometheus grafana
# Configure Prometheus alerting rules
# (the quoted 'EOF' keeps the shell from expanding $labels and $value)
sudo mkdir -p /etc/prometheus/rules
sudo tee /etc/prometheus/rules/web-app.yml << 'EOF'
groups:
  - name: web-app
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is {{ $value }}% for more than 5 minutes"
      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"
          description: "Memory usage is {{ $value }}% for more than 5 minutes"
      - alert: HighDiskIO
        expr: rate(node_disk_io_time_seconds_total[5m]) * 100 > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High disk I/O on {{ $labels.instance }}"
          description: "Disk I/O is {{ $value }}% for more than 5 minutes"
      - alert: HighNetworkTraffic
        expr: rate(node_network_receive_bytes_total[5m]) + rate(node_network_transmit_bytes_total[5m]) > 100000000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High network traffic on {{ $labels.instance }}"
          description: "Network traffic is {{ $value }} bytes/s for more than 5 minutes"
      - alert: MySQLSlowQueries
        expr: rate(mysql_global_status_slow_queries[5m]) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High number of slow queries in MySQL"
          description: "Slow queries rate is {{ $value }}/s for more than 5 minutes"
      # Assumes the exporter exposes a status label on nginx_http_requests_total.
      # Both sides are summed so the ratio is taken over all requests, not per status code.
      - alert: NginxHighErrorRate
        expr: sum(rate(nginx_http_requests_total{status=~"5.."}[5m])) / sum(rate(nginx_http_requests_total[5m])) * 100 > 5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate in Nginx"
          description: "Error rate is {{ $value }}% for more than 5 minutes"
EOF
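The HighCPUUsage expression computes "100 minus the idle share" over a time window. The same calculation can be reproduced from two samples of the aggregate `cpu` line in `/proc/stat`, which is useful for checking what the alert measures. This is a simplified sketch with made-up sample lines; the helper names are illustrative.

```python
def cpu_times(stat_line):
    """Return (idle, total) jiffies from an aggregate 'cpu' line.

    Only the first eight counters are summed (user, nice, system, idle,
    iowait, irq, softirq, steal); guest time is already part of user time.
    """
    values = [int(v) for v in stat_line.split()[1:9]]
    return values[3], sum(values)


def cpu_usage_percent(first_line, second_line):
    """Non-idle share of CPU time between two samples, in percent."""
    idle_1, total_1 = cpu_times(first_line)
    idle_2, total_2 = cpu_times(second_line)
    total_delta = total_2 - total_1
    busy_delta = total_delta - (idle_2 - idle_1)
    return busy_delta * 100 / total_delta


if __name__ == "__main__":
    # Hypothetical samples taken some seconds apart.
    before = "cpu 100 0 50 800 50 0 0 0 0 0"
    after = "cpu 300 0 150 1400 150 0 0 0 0 0"
    print(cpu_usage_percent(before, after))
```

As in the PromQL expression, only the `idle` mode counts as idle here, so time spent in `iowait` is reported as busy.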
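8. Performance Tuning Best Practices
8.1 Principles
- Measure first: establish a baseline before tuning and verify the effect afterwards
- Change gradually: adjust one parameter at a time and observe the result
- Understand the workload: choose tuning strategies that fit the business characteristics
- Monitor continuously: build a complete monitoring system so problems surface early
- Document everything: record every change and its effect to support rollback and review
8.2 Troubleshooting Workflow for Common Performance Problems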
#!/bin/bash
# 1. Quick diagnosis script
echo "=== Quick System Performance Diagnosis ==="
echo "Time: $(date)"
echo ""
echo "1. CPU usage:"
mpstat -P ALL 1 1 | tail -n +4
echo ""
echo "2. Memory usage:"
free -h
echo ""
echo "3. Disk I/O:"
iostat -x 1 3 | tail -n +4
echo ""
echo "4. Network connections:"
ss -s
echo ""
echo "5. Top 10 processes (CPU):"
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu | head -11
echo ""
echo "6. Top 10 processes (memory):"
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head -11
echo ""
echo "7. System load:"
uptime
echo ""
echo "8. File system usage:"
df -h
echo ""
echo "9. Network traffic:"
# sar comes with sysstat and reports every interface, so no interface name has to be hard-coded
sar -n DEV 1 3
echo ""
echo "10. Recent errors in the system log:"
journalctl -p err -n 20 --no-pager
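8.3 Performance Tuning Checklist
- [ ] Kernel parameters tuned for the workload
- [ ] File system mount options tuned
- [ ] I/O scheduler set according to storage type
- [ ] Network stack parameters tuned
- [ ] Application server configuration tuned
- [ ] Database configuration tuned
- [ ] Cache layer tuned
- [ ] Monitoring in place
- [ ] Alert rules configured
- [ ] Logging tuned
- [ ] Backup strategy defined
- [ ] Performance tests executed
- [ ] Tuning changes documented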
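9. Performance Tuning Toolbox
9.1 System-Level Tools
# Several packages below come from EPEL or vendor repositories, not from the default AlmaLinux repositories
# Performance analysis tools
sudo dnf install perf strace ltrace systemtap
# Network analysis tools (wireshark-cli provides tshark)
sudo dnf install tcpdump wireshark-cli nload iftop
# Disk analysis tools
sudo dnf install iotop ioping hdparm
# Memory analysis tools (pmap ships with procps-ng)
sudo dnf install smem procps-ng valgrind
# Process analysis tools
sudo dnf install htop atop glances
9.2 Application-Level Tools
# Web server analysis (ngxtop is distributed through pip)
sudo dnf install goaccess
pip install ngxtop
# Database analysis
sudo dnf install percona-toolkit mytop
# Application profiling
pip install py-spy # for Python
sudo dnf install java-17-openjdk-devel # for Java: jstack and jmap ship with the JDK; pick the package that matches your JDK version
9.3 Monitoring Tools
# Monitoring stack (usually installed from the upstream projects' repositories or release archives; package names vary)
sudo dnf install prometheus grafana alertmanager
sudo dnf install node_exporter mysql_exporter nginx_exporter
# Log analysis
sudo dnf install loki promtail
sudo dnf install elasticsearch logstash kibana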
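10. Summary
AlmaLinux performance tuning is a systems engineering effort that spans the kernel, the operating system and the applications. With the strategies and techniques covered in this article, you can:
- Establish a baseline: use monitoring tools to record reference performance
- Tune the system: adjust kernel parameters, file systems and the network stack
- Tune the applications: optimize the web server, database and application servers
- Tune containers: optimize Docker and Kubernetes configuration
- Monitor and alert: build a complete monitoring system
- Improve continuously: use monitoring data to keep refining performance
Performance tuning is an ongoing process, not a one-off task. Re-evaluate performance regularly and adjust the tuning strategy as the workload changes.
A final reminder: validate every change in a test environment first, and have backups and a rollback plan ready. Apply changes to production carefully, because poorly chosen tuning can destabilize the system.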
