In today's Internet applications, high-concurrency workloads put serious pressure on database systems. MySQL, the most popular open-source relational database, can hit performance bottlenecks or even fall over under high concurrency. This article examines MySQL optimization strategies for high-concurrency scenarios from multiple angles and offers a practical guide to building a stable, efficient database tier.

1. Analyzing MySQL Performance Bottlenecks Under High Concurrency

1.1 Common Performance Bottleneck Types

Under high concurrency, MySQL typically runs into the following bottlenecks:

Connection limit: once concurrent connections exceed max_connections, new connections are refused and the application starts throwing errors.

CPU: complex queries, heavy sorting, and temporary-table operations burn large amounts of CPU.

I/O: frequent disk reads and writes, especially random I/O, become the bottleneck.

Memory: an undersized InnoDB buffer pool forces constant disk I/O.

Lock contention: contention on row and table locks leaves transactions waiting too long.

1.2 Performance Monitoring and Diagnostics

Before optimizing anything, put a solid monitoring setup in place:

-- Current connection count
SHOW STATUS LIKE 'Threads_connected';

-- Slow query count
SHOW STATUS LIKE 'Slow_queries';

-- InnoDB buffer pool status
SHOW ENGINE INNODB STATUS\G

-- Lock waits (MySQL 5.7 views; removed in 8.0)
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM information_schema.INNODB_LOCK_WAITS;
-- MySQL 8.0 equivalents:
-- SELECT * FROM performance_schema.data_locks;
-- SELECT * FROM performance_schema.data_lock_waits;

Recommended monitoring tools

  • Built into MySQL: Performance Schema, the slow query log
  • Open source: Percona Toolkit, Prometheus + Grafana
  • Commercial: Datadog, New Relic

2. Connection-Layer Optimization

2.1 Connection Pool Configuration

The connection pool is the buffer between the application and the database; configuring it well is critical:

Java example (HikariCP connection pool)

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class DatabaseConfig {
    public static HikariDataSource createDataSource() {
        HikariConfig config = new HikariConfig();
        
        // Basic settings
        config.setJdbcUrl("jdbc:mysql://localhost:3306/mydb?useSSL=false&serverTimezone=UTC");
        config.setUsername("username");
        config.setPassword("password");
        
        // Pool sizing (the key parameters)
        config.setMaximumPoolSize(50);      // maximum pool size
        config.setMinimumIdle(10);          // minimum idle connections
        config.setConnectionTimeout(30000); // connection timeout (ms)
        config.setIdleTimeout(600000);      // idle connection timeout (ms)
        config.setMaxLifetime(1800000);     // maximum connection lifetime (ms)
        
        // Performance tuning
        config.setConnectionTestQuery("SELECT 1"); // liveness test query (HikariCP advises omitting this for JDBC4-compliant drivers)
        config.setLeakDetectionThreshold(60000);   // connection-leak detection threshold (ms)
        config.setInitializationFailTimeout(1);    // fail fast if the pool cannot initialize
        
        // High-concurrency tuning
        config.setPoolName("HighConcurrencyPool");
        config.addDataSourceProperty("cachePrepStmts", "true");
        config.addDataSourceProperty("prepStmtCacheSize", "250");
        config.addDataSourceProperty("prepStmtCacheSqlLimit", "2048");
        config.addDataSourceProperty("useServerPrepStmts", "true");
        
        return new HikariDataSource(config);
    }
}

Pool sizing formula

pool size = (CPU cores × 2) + effective spindle count

For example, an 8-core host with 2 effective spindles suggests 8 × 2 + 2 = 18 connections as a starting point. For high-concurrency scenarios, common guidance is:

  • Read/write splitting: the read-replica pool can be somewhat larger (say 50-100)
  • Write (primary) pool: usually kept between 20 and 50, since too many connections amplify lock contention

2.2 Connection Limits and Management

MySQL server configuration

[mysqld]
# Maximum connections (size to the server's memory)
max_connections = 500

# Connection timeouts
wait_timeout = 600
interactive_timeout = 600

# Connection error handling
max_connect_errors = 100000

Connection monitoring script

#!/bin/bash
# monitor_connections.sh

while true; do
    # Current connection count
    connections=$(mysql -u root -p'password' -e "SHOW STATUS LIKE 'Threads_connected'" | grep Threads_connected | awk '{print $2}')
    
    # Configured maximum
    max_connections=$(mysql -u root -p'password' -e "SHOW VARIABLES LIKE 'max_connections'" | grep max_connections | awk '{print $2}')
    
    # Utilization
    usage=$((connections * 100 / max_connections))
    
    echo "$(date): connections: $connections, max_connections: $max_connections, utilization: $usage%"
    
    # Warn above 80% utilization
    if [ $usage -gt 80 ]; then
        echo "WARNING: connection utilization above 80%!"
    fi
    
    sleep 60
done

3. Query Optimization

3.1 Index Optimization

Index design principles

  1. Index high-selectivity columns first
  2. Follow the leftmost-prefix rule for composite indexes
  3. Avoid excessive indexes (each one carries maintenance cost)
  4. Use covering indexes to avoid lookups back to the clustered index

Index optimization examples

-- Original query (no index)
SELECT * FROM orders WHERE user_id = 123 AND status = 'paid' AND created_at > '2023-01-01';

-- Option 1: composite index
CREATE INDEX idx_user_status_created ON orders(user_id, status, created_at);

-- Option 2: covering index (avoids the lookup back to the row,
-- but only if the query selects just these columns; SELECT * is never covered)
CREATE INDEX idx_user_status_created_cover ON orders(user_id, status, created_at, amount, product_id);

-- Check index usage
EXPLAIN SELECT * FROM orders WHERE user_id = 123 AND status = 'paid' AND created_at > '2023-01-01';

Index maintenance queries

-- Survey table and index sizes (a large data size with a small index size
-- can point at missing indexes)
SELECT 
    table_schema,
    table_name,
    table_rows,
    data_length,
    index_length,
    ROUND(data_length / 1024 / 1024, 2) AS data_mb,
    ROUND(index_length / 1024 / 1024, 2) AS index_mb
FROM information_schema.tables
WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
ORDER BY data_length DESC;

-- Find indexes that share the same leading column (candidate duplicates)
SELECT 
    a.table_schema,
    a.table_name,
    a.index_name AS first_index,
    b.index_name AS second_index,
    a.column_name,
    a.seq_in_index AS first_seq,
    b.seq_in_index AS second_seq
FROM information_schema.statistics a
JOIN information_schema.statistics b ON a.table_schema = b.table_schema 
    AND a.table_name = b.table_name 
    AND a.column_name = b.column_name
    AND a.index_name != b.index_name
    AND a.seq_in_index = 1
    AND b.seq_in_index = 1
WHERE a.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
ORDER BY a.table_schema, a.table_name;

3.2 SQL Statement Optimization

Avoid full table scans

-- Bad: wrapping the column in a function defeats the index
SELECT * FROM users WHERE DATE(created_at) = '2023-01-01';

-- Good: express it as a range predicate
SELECT * FROM users WHERE created_at >= '2023-01-01 00:00:00' 
    AND created_at < '2023-01-02 00:00:00';

-- Bad: OR across different columns often prevents index use
SELECT * FROM products WHERE category_id = 1 OR price > 100;

-- Good: rewrite as UNION ALL (the second branch excludes the first to avoid duplicates)
SELECT * FROM products WHERE category_id = 1
UNION ALL
SELECT * FROM products WHERE price > 100 AND category_id != 1;

Batch operation optimization

-- Bad: row-by-row inserts
INSERT INTO orders (user_id, amount, status) VALUES (1, 100, 'paid');
INSERT INTO orders (user_id, amount, status) VALUES (2, 200, 'paid');
INSERT INTO orders (user_id, amount, status) VALUES (3, 300, 'paid');

-- Good: multi-row insert
INSERT INTO orders (user_id, amount, status) VALUES 
(1, 100, 'paid'),
(2, 200, 'paid'),
(3, 300, 'paid');

-- Batch update
UPDATE orders 
SET status = 'shipped', 
    updated_at = NOW() 
WHERE order_id IN (1001, 1002, 1003, 1004, 1005);

3.3 Pagination Optimization

The problem with offset pagination

-- Offset pagination (deep pages get very slow)
SELECT * FROM orders ORDER BY created_at DESC LIMIT 100000, 20;

Option 1: deferred join

-- Fetch just the primary keys in a subquery, then join back for the full rows
SELECT o.* 
FROM orders o
JOIN (
    SELECT order_id 
    FROM orders 
    ORDER BY created_at DESC 
    LIMIT 100000, 20
) AS tmp ON o.order_id = tmp.order_id;

Option 2: keyset (cursor) pagination

-- First page
SELECT * FROM orders 
WHERE created_at <= '2023-12-31 23:59:59'
ORDER BY created_at DESC 
LIMIT 20;

-- Next page (use the created_at of the previous page's last row as the cursor)
SELECT * FROM orders 
WHERE created_at < '2023-12-31 23:59:59'  -- last row's created_at from the previous page
ORDER BY created_at DESC 
LIMIT 20;
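
One caveat: created_at is not necessarily unique, so a pure timestamp cursor can skip or repeat rows that share a timestamp. A common refinement is a composite cursor on (created_at, order_id). A minimal JDBC sketch, assuming the orders schema used in the examples above:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;

public class OrderPager {
    // Fetches the page after the given cursor (the last row of the previous page).
    // Ties on created_at are broken by order_id, so rows are neither skipped nor repeated.
    public void printNextPage(Connection conn, Timestamp lastCreatedAt, long lastOrderId)
            throws SQLException {
        String sql =
            "SELECT order_id, user_id, amount, status, created_at " +
            "FROM orders " +
            "WHERE created_at < ? OR (created_at = ? AND order_id < ?) " +
            "ORDER BY created_at DESC, order_id DESC " +
            "LIMIT 20";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setTimestamp(1, lastCreatedAt);
            ps.setTimestamp(2, lastCreatedAt);
            ps.setLong(3, lastOrderId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%d %d%n", rs.getLong("order_id"), rs.getLong("user_id"));
                }
            }
        }
    }
}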

4. InnoDB Storage Engine Optimization

4.1 Buffer Pool Configuration

Buffer pool sizing

[mysqld]
# Buffer pool size (typically 50%-70% of physical RAM on a dedicated host)
innodb_buffer_pool_size = 8G

# Number of buffer pool instances (scale with CPU cores)
innodb_buffer_pool_instances = 8

# Buffer pool warm-up across restarts
innodb_buffer_pool_load_at_startup = ON
innodb_buffer_pool_dump_at_shutdown = ON

# Page size (16K default; can only be set when the instance is first initialized)
innodb_page_size = 16384

Buffer pool monitoring

-- Buffer pool hit rate: 1 - (disk reads / read requests)
SELECT (1 - r.val / rq.val) * 100 AS buffer_pool_hit_rate
FROM (SELECT VARIABLE_VALUE AS val FROM performance_schema.global_status
      WHERE VARIABLE_NAME = 'Innodb_buffer_pool_reads') AS r,
     (SELECT VARIABLE_VALUE AS val FROM performance_schema.global_status
      WHERE VARIABLE_NAME = 'Innodb_buffer_pool_read_requests') AS rq;

-- Buffer pool detail
SHOW ENGINE INNODB STATUS\G
-- look for the "BUFFER POOL AND MEMORY" section in the output

4.2 Log File Optimization

Redo log configuration

[mysqld]
# Redo log file size (1-2G is a common range; MySQL 8.0.30+ replaces this with innodb_redo_log_capacity)
innodb_log_file_size = 2G

# Number of redo log files (2-3 is typical)
innodb_log_files_in_group = 2

# Redo log buffer size
innodb_log_buffer_size = 64M

# Flush policy (trade performance against durability)
innodb_flush_log_at_trx_commit = 1  # 1: flush to disk on every commit (safest)
# innodb_flush_log_at_trx_commit = 2  # 2: flush roughly once per second (faster, may lose up to 1s of commits)

4.3 Transaction and Lock Optimization

Transaction design principles (see the sketch after this list)

  1. Keep transactions short
  2. Avoid long-running transactions
  3. Choose the isolation level deliberately
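
A common violation of these principles is doing slow non-database work (RPC calls, HTTP requests, file I/O) inside the transaction, which holds row locks for the whole duration. A minimal JDBC sketch of the intended shape; the prepareShipment call and the tracking_no column are hypothetical stand-ins:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

public class OrderShippingService {

    public void shipOrder(DataSource ds, long orderId) throws SQLException {
        // Do the slow external work BEFORE opening the transaction,
        // so no locks are held while waiting on the network.
        String trackingNo = prepareShipment(orderId);

        try (Connection conn = ds.getConnection()) {
            conn.setAutoCommit(false);  // the transaction starts here
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE orders SET status = 'shipped', tracking_no = ? WHERE order_id = ?")) {
                ps.setString(1, trackingNo);
                ps.setLong(2, orderId);
                ps.executeUpdate();
                conn.commit();          // commit immediately; locks are held only briefly
            } catch (SQLException e) {
                conn.rollback();
                throw e;
            }
        }
    }

    private String prepareShipment(long orderId) {
        // Hypothetical stand-in for a call to an external shipping service
        return "TRACK-" + orderId;
    }
}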

Monitoring and handling long transactions

-- Find transactions that have been running for a long time
SELECT 
    trx_id,
    trx_started,
    TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS duration_seconds,
    trx_mysql_thread_id,
    trx_query
FROM information_schema.INNODB_TRX
WHERE TIMESTAMPDIFF(SECOND, trx_started, NOW()) > 60
ORDER BY duration_seconds DESC;

-- Find lock waits (MySQL 5.7 views; in 8.0 use sys.innodb_lock_waits
-- or performance_schema.data_lock_waits instead)
SELECT 
    r.trx_id AS waiting_trx_id,
    r.trx_mysql_thread_id AS waiting_thread,
    TIMESTAMPDIFF(SECOND, r.trx_wait_started, NOW()) AS wait_seconds,
    b.trx_id AS blocking_trx_id,
    b.trx_mysql_thread_id AS blocking_thread,
    b.trx_query AS blocking_query
FROM information_schema.INNODB_LOCK_WAITS w
JOIN information_schema.INNODB_TRX r ON w.requesting_trx_id = r.trx_id
JOIN information_schema.INNODB_TRX b ON w.blocking_trx_id = b.trx_id;

Deadlock handling

-- Inspect the most recent deadlock
SHOW ENGINE INNODB STATUS\G
-- look for the "LATEST DETECTED DEADLOCK" section in the output

-- Deadlock detection itself costs CPU under extreme hotspot contention;
-- some deployments disable it and rely on the lock wait timeout instead
SET GLOBAL innodb_deadlock_detect = ON;
SET GLOBAL innodb_lock_wait_timeout = 50;  -- lock wait timeout (seconds)
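
When InnoDB picks a transaction as the deadlock victim, the client receives error 1213 (SQLSTATE 40001), and the standard remedy is to retry the whole transaction. A minimal JDBC retry loop; the accounts table is a hypothetical example:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

public class DeadlockRetry {
    private static final int MAX_RETRIES = 3;

    // Runs a two-row transfer, retrying when InnoDB picks it as the deadlock victim.
    public void transferWithRetry(DataSource ds, long fromId, long toId, long amount)
            throws SQLException {
        for (int attempt = 1; ; attempt++) {
            try (Connection conn = ds.getConnection()) {
                conn.setAutoCommit(false);
                try (PreparedStatement debit = conn.prepareStatement(
                         "UPDATE accounts SET balance = balance - ? WHERE id = ?");
                     PreparedStatement credit = conn.prepareStatement(
                         "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
                    debit.setLong(1, amount); debit.setLong(2, fromId); debit.executeUpdate();
                    credit.setLong(1, amount); credit.setLong(2, toId); credit.executeUpdate();
                    conn.commit();
                    return;
                } catch (SQLException e) {
                    conn.rollback();
                    // SQLSTATE 40001 = rollback due to deadlock/serialization failure
                    if (!"40001".equals(e.getSQLState()) || attempt >= MAX_RETRIES) {
                        throw e;
                    }
                }
            }
        }
    }
}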

5. Architecture-Level Optimization

5.1 Read/Write Splitting

Replication configuration

# Primary configuration (my.cnf)
[mysqld]
server-id = 1
log_bin = mysql-bin
binlog_format = ROW
binlog_row_image = FULL
expire_logs_days = 7  # deprecated in 8.0; use binlog_expire_logs_seconds

# Replica configuration
[mysqld]
server-id = 2
relay_log = mysql-relay-bin
read_only = ON

Application-level read/write splitting (Java example)

import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;
import org.springframework.stereotype.Component;

@Component
public class DynamicDataSource extends AbstractRoutingDataSource {
    
    private static final ThreadLocal<String> dataSourceKey = new ThreadLocal<>();
    
    public static void setDataSourceKey(String key) {
        dataSourceKey.set(key);
    }
    
    public static void clearDataSourceKey() {
        dataSourceKey.remove();
    }
    
    @Override
    protected Object determineCurrentLookupKey() {
        return dataSourceKey.get();
    }
}

// Route reads and writes via AOP (@ReadOnly / @WriteOnly are custom annotations; see the definitions after this block)
@Aspect
@Component
public class DataSourceAspect {
    
    @Around("@annotation(readOnly)")
    public Object switchDataSource(ProceedingJoinPoint joinPoint, ReadOnly readOnly) throws Throwable {
        try {
// route to the read replica
            DynamicDataSource.setDataSourceKey("read");
            return joinPoint.proceed();
        } finally {
            DynamicDataSource.clearDataSourceKey();
        }
    }
    
    @Around("@annotation(writeOnly)")
    public Object switchToWriteDataSource(ProceedingJoinPoint joinPoint, WriteOnly writeOnly) throws Throwable {
        try {
// route to the primary (write) database
            DynamicDataSource.setDataSourceKey("write");
            return joinPoint.proceed();
        } finally {
            DynamicDataSource.clearDataSourceKey();
        }
    }
}
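
The @ReadOnly and @WriteOnly annotations referenced by the aspect are custom markers, not Spring built-ins; a minimal definition (each annotation in its own file):

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marks a service method that should run against the read replica
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface ReadOnly {}

// Marks a service method that must run against the primary
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface WriteOnly {}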

5.2 Sharding (Splitting Databases and Tables)

Vertical split example

-- Original wide table
CREATE TABLE user_info (
    user_id BIGINT PRIMARY KEY,
    username VARCHAR(50),
    email VARCHAR(100),
    phone VARCHAR(20),
    address TEXT,
    avatar TEXT,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);

-- Vertical split: base profile table
CREATE TABLE user_base (
    user_id BIGINT PRIMARY KEY,
    username VARCHAR(50),
    email VARCHAR(100),
    phone VARCHAR(20),
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);

-- Vertical split: extended profile table
CREATE TABLE user_profile (
    user_id BIGINT PRIMARY KEY,
    address TEXT,
    avatar TEXT,
    FOREIGN KEY (user_id) REFERENCES user_base(user_id)
);

Horizontal split example (hash on user ID)

-- Shard tables: user_0, user_1, ..., user_9
-- Helper function mapping a user_id to its shard table
-- (DELIMITER is needed in the mysql client for the multi-statement body)
DELIMITER $$
CREATE FUNCTION get_user_table_name(user_id BIGINT) RETURNS VARCHAR(20)
DETERMINISTIC
BEGIN
    DECLARE table_suffix INT;
    SET table_suffix = user_id % 10;
    RETURN CONCAT('user_', table_suffix);
END$$
DELIMITER ;

-- Querying the computed shard
SET @table_name = get_user_table_name(123456);
SET @sql = CONCAT('SELECT * FROM ', @table_name, ' WHERE user_id = 123456');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Sharding middleware (ShardingSphere example)

# sharding.yaml
dataSources:
  ds_0: !!com.zaxxer.hikari.HikariDataSource
    driverClassName: com.mysql.cj.jdbc.Driver
    jdbcUrl: jdbc:mysql://localhost:3306/db_0
    username: root
    password: password
  ds_1: !!com.zaxxer.hikari.HikariDataSource
    driverClassName: com.mysql.cj.jdbc.Driver
    jdbcUrl: jdbc:mysql://localhost:3306/db_1
    username: root
    password: password

shardingRule:
  tables:
    user:
      actualDataNodes: ds_${0..1}.user_${0..9}
      tableStrategy:
        standard:
          shardingColumn: user_id
          preciseAlgorithmClassName: com.example.UserShardingAlgorithm
      databaseStrategy:
        standard:
          shardingColumn: user_id
          preciseAlgorithmClassName: com.example.DatabaseShardingAlgorithm
  bindingTables:
    - user
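
The precise sharding algorithms named in the YAML are classes you supply. A sketch of com.example.UserShardingAlgorithm, assuming ShardingSphere 4.x (the release these configuration keys come from):

import java.util.Collection;

import org.apache.shardingsphere.api.sharding.standard.PreciseShardingAlgorithm;
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingValue;

// Routes a row to user_0..user_9 by user_id % 10, mirroring the SQL helper above
public class UserShardingAlgorithm implements PreciseShardingAlgorithm<Long> {
    @Override
    public String doSharding(Collection<String> availableTargetNames,
                             PreciseShardingValue<Long> shardingValue) {
        String suffix = String.valueOf(shardingValue.getValue() % 10);
        for (String target : availableTargetNames) {
            if (target.endsWith("_" + suffix)) {
                return target;
            }
        }
        throw new IllegalStateException("no shard table for suffix " + suffix);
    }
}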

5.3 Caching Strategy

Redis cache example (Jedis)

import java.util.function.Supplier;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

public class RedisCache {
    private JedisPool jedisPool;
    
    public RedisCache() {
        JedisPoolConfig config = new JedisPoolConfig();
        config.setMaxTotal(200); // maximum connections
        config.setMaxIdle(50);   // maximum idle connections
        config.setMinIdle(10);   // minimum idle connections
        config.setTestOnBorrow(true);
        config.setTestOnReturn(true);
        
        this.jedisPool = new JedisPool(config, "localhost", 6379, 2000, "password");
    }
    
    // Cache a query result, loading it on a miss
    public String getWithCache(String key, Supplier<String> dataLoader, int expireSeconds) {
        try (Jedis jedis = jedisPool.getResource()) {
            String cached = jedis.get(key);
            if (cached != null) {
                return cached;
            }
            
            // Cache miss: load the data and populate the cache
            String data = dataLoader.get();
            if (data != null) {
                jedis.setex(key, expireSeconds, data);
            }
            return data;
        }
    }
    
    // Cache-penetration protection
    public String getWithBloomFilter(String key, Supplier<String> dataLoader, int expireSeconds) {
        // Guard with a Bloom filter so keys that cannot exist never reach Redis or MySQL;
        // RedisBloom or Guava's BloomFilter are common choices (see the sketch below)
        return getWithCache(key, dataLoader, expireSeconds);
    }
}
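
A concrete version of the Bloom-filter guard, assuming Guava is on the classpath and that every valid key can be registered up front (e.g. all existing user IDs loaded at startup):

import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class BloomGuardedCache {
    // Sized for ~1M keys at a 1% false-positive rate; must be pre-populated
    // with every key that legitimately exists.
    private final BloomFilter<String> knownKeys =
        BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

    private final RedisCache redisCache = new RedisCache();

    public void register(String key) {
        knownKeys.put(key);
    }

    public String get(String key, Supplier<String> dataLoader, int expireSeconds) {
        // A negative answer is definitive: the key was never registered,
        // so skip Redis and MySQL entirely and stop the penetration here.
        if (!knownKeys.mightContain(key)) {
            return null;
        }
        return redisCache.getWithCache(key, dataLoader, expireSeconds);
    }
}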

Cache update strategies

  1. Cache Aside: update the database first, then delete the cache entry (sketched below)
  2. Read Through: the application only reads the cache, which loads from the database on a miss
  3. Write Through: the application writes to the cache, which writes through to the database
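
A minimal Cache Aside write path, reusing Jedis as above; the User type and UserRepository DAO are hypothetical stand-ins for the project's own persistence layer:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

public class UserWriteService {
    private final JedisPool jedisPool;
    private final UserRepository userRepository; // hypothetical DAO

    public UserWriteService(JedisPool jedisPool, UserRepository userRepository) {
        this.jedisPool = jedisPool;
        this.userRepository = userRepository;
    }

    // Cache Aside: update the database first, then DELETE (not rewrite) the cache
    // entry, so the next read repopulates it with fresh data.
    public void updateUser(User user) {
        userRepository.save(user);
        try (Jedis jedis = jedisPool.getResource()) {
            jedis.del("user:" + user.getId());
        }
    }
}

Deleting rather than rewriting the entry sidesteps the race where two concurrent writers leave stale data in the cache.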

6. Server and Operating System Optimization

6.1 OS Parameter Tuning

Linux kernel parameters

# Edit /etc/sysctl.conf
# Raise the file-handle limit
fs.file-max = 65535

# TCP tuning
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 0  # breaks NAT clients; removed entirely in kernel 4.12+
net.ipv4.tcp_fin_timeout = 30

# Network buffers
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

# Apply the settings (run from the shell)
sysctl -p

File descriptor limits

# Check the current limit
ulimit -n

# Temporary change (current session only)
ulimit -n 65535

# Permanent change (/etc/security/limits.conf)
* soft nofile 65535
* hard nofile 65535

6.2 Disk I/O Optimization

I/O scheduler

# Check the current scheduler
cat /sys/block/sda/queue/scheduler

# Switch to deadline (a good fit for databases on spinning disks)
echo deadline > /sys/block/sda/queue/scheduler

# Or noop for SSDs (on blk-mq kernels the equivalents are mq-deadline and none)
echo noop > /sys/block/sda/queue/scheduler

RAID recommendations

  • RAID 10: the best balance of performance and reliability
  • RAID 5/6: suited to read-heavy, write-light workloads
  • RAID 0: temporary data only; not recommended in production

6.3 Memory Optimization

Memory allocation

[mysqld]
# Overall allocation (size to the server's total memory)
innodb_buffer_pool_size = 12G  # typically 50-70% of total RAM
key_buffer_size = 256M         # MyISAM only; keep small if you don't use MyISAM
query_cache_size = 0           # disable in 5.7 (the query cache was removed entirely in MySQL 8.0)
tmp_table_size = 256M
max_heap_table_size = 256M
sort_buffer_size = 4M
join_buffer_size = 4M
read_buffer_size = 4M
read_rnd_buffer_size = 4M

Memory monitoring script

#!/bin/bash
# monitor_memory.sh

while true; do
    # mysqld resident set size (RSS, in KB)
    mysql_memory=$(ps aux | grep mysqld | grep -v grep | awk '{print $6}')
    
    # Total system memory
    total_memory=$(free -m | awk 'NR==2{print $2}')
    
    # Available system memory
    available_memory=$(free -m | awk 'NR==2{print $7}')
    
    # Utilization
    usage=$((100 - (available_memory * 100 / total_memory)))
    
    echo "$(date): mysqld RSS: ${mysql_memory}KB, system memory utilization: ${usage}%"
    
    # Warn above 85% utilization
    if [ $usage -gt 85 ]; then
        echo "WARNING: system memory utilization above 85%!"
    fi
    
    sleep 60
done

7. Special Optimizations for High-Concurrency Scenarios

7.1 Hot-Row Update Contention

Problem: multiple transactions update the same row simultaneously, causing heavy lock contention.

Solution 1: optimistic locking

-- Add a version column
ALTER TABLE products ADD COLUMN version INT DEFAULT 0;

-- Check the version number on update
UPDATE products 
SET stock = stock - 1, 
    version = version + 1 
WHERE product_id = 123 
    AND version = 5;  -- assuming the version read earlier was 5

-- Check the affected-row count: 0 means the version changed underneath you and the operation must be retried (see the sketch below)
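
A retry loop around that statement might look like this (JDBC, products table as above; each attempt re-reads the current version before retrying the conditional update):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class OptimisticStockService {
    private static final int MAX_RETRIES = 3;

    // Decrements stock with an optimistic version check, retrying on conflict
    public boolean deductOne(DataSource ds, long productId) throws SQLException {
        try (Connection conn = ds.getConnection()) {
            for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
                int version;
                try (PreparedStatement read = conn.prepareStatement(
                        "SELECT version FROM products WHERE product_id = ?")) {
                    read.setLong(1, productId);
                    try (ResultSet rs = read.executeQuery()) {
                        if (!rs.next()) return false;  // no such product
                        version = rs.getInt(1);
                    }
                }
                try (PreparedStatement update = conn.prepareStatement(
                        "UPDATE products SET stock = stock - 1, version = version + 1 " +
                        "WHERE product_id = ? AND version = ? AND stock > 0")) {
                    update.setLong(1, productId);
                    update.setInt(2, version);
                    if (update.executeUpdate() == 1) return true;  // version matched
                }
                // 0 rows affected: another transaction won the race; re-read and retry
            }
            return false;  // retry budget exhausted
        }
    }
}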

Solution 2: serializing updates through a queue

// Serialize hot-row updates through a message queue
public class StockUpdateService {
    private final KafkaTemplate<String, String> kafkaTemplate;
    
    public StockUpdateService(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }
    
    public void updateStock(Long productId, Integer quantity) {
        StockUpdateEvent event = new StockUpdateEvent(productId, quantity);
        kafkaTemplate.send("stock-update-topic", event.toJson());
    }
    
    @KafkaListener(topics = "stock-update-topic")
    public void processStockUpdate(String message) {
        StockUpdateEvent event = StockUpdateEvent.fromJson(message);
        // Apply updates one at a time, eliminating row-lock contention on hot rows
        updateStockInDatabase(event.getProductId(), event.getQuantity());
    }
}

7.2 High-Concurrency Insert Optimization

Bulk insert optimization

-- LOAD DATA INFILE (fastest; requires local_infile to be enabled on both client and server)
LOAD DATA LOCAL INFILE '/path/to/data.csv'
INTO TABLE orders
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(user_id, amount, status, created_at);

-- INSERT ... ON DUPLICATE KEY UPDATE for idempotent bulk upserts
INSERT INTO orders (order_id, user_id, amount, status) 
VALUES 
(1001, 1, 100, 'paid'),
(1002, 2, 200, 'paid'),
(1003, 3, 300, 'paid')
ON DUPLICATE KEY UPDATE 
    amount = VALUES(amount),
    status = VALUES(status),
    updated_at = NOW();
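
From the application side, JDBC batching achieves a similar effect. With MySQL Connector/J, setting rewriteBatchedStatements=true makes the driver rewrite the batch into a single multi-row INSERT like the one above; a sketch using the same orders columns:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class OrderBatchInsert {
    // rewriteBatchedStatements=true lets Connector/J collapse the batch
    // into one multi-row INSERT statement
    private static final String URL =
        "jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true";

    // Each row is {user_id, amount}; status is fixed to 'paid' for the example
    public void insertOrders(List<long[]> rows) throws SQLException {
        try (Connection conn = DriverManager.getConnection(URL, "username", "password");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO orders (user_id, amount, status) VALUES (?, ?, 'paid')")) {
            conn.setAutoCommit(false);  // one commit for the whole batch
            for (long[] row : rows) {
                ps.setLong(1, row[0]);
                ps.setLong(2, row[1]);
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        }
    }
}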

Auto-increment ID tuning

[mysqld]
# Auto-increment step (for multi-primary/distributed setups)
auto_increment_increment = 2
auto_increment_offset = 1  # server 1 starts at 1, server 2 starts at 2

7.3 High-Concurrency Query Optimization

Query caching

// Caffeine: a high-performance in-process cache
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class QueryCache {
    private final Cache<String, String> cache;
    
    public QueryCache() {
        this.cache = Caffeine.newBuilder()
            .maximumSize(10_000)  // maximum number of entries
            .expireAfterWrite(10, TimeUnit.MINUTES)  // expire 10 minutes after write
            .recordStats()  // collect hit/miss statistics
            .build();
    }
    
    public String get(String key, Supplier<String> dataLoader) {
        return cache.get(key, k -> dataLoader.get());
    }
}

Combining read/write splitting with caching

// Reads check the cache first; on a miss, fall through to the read replica
public User getUserById(Long userId) {
    // 1. Check the cache
    String cacheKey = "user:" + userId;
    String cached = redisCache.get(cacheKey);
    if (cached != null) {
        return JSON.parseObject(cached, User.class);
    }
    
    // 2. Cache miss: query the read replica
    User user = userReadRepository.findById(userId);
    if (user != null) {
        // 3. Populate the cache
        redisCache.set(cacheKey, JSON.toJSONString(user), 300);
    }
    
    return user;
}

8. Monitoring and Alerting

8.1 Key Metrics

MySQL performance metrics

-- Performance overview
SHOW GLOBAL STATUS LIKE 'Threads_%';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_%';
SHOW GLOBAL STATUS LIKE 'Slow_queries';

-- Enable the slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- log queries slower than 1 second
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';

-- Analyze the slow log (run from the shell, not the mysql client):
-- mysqldumpslow /var/log/mysql/slow.log

Prometheus scrape configuration

# prometheus.yml
scrape_configs:
  - job_name: 'mysql'
    static_configs:
      - targets: ['localhost:9104']  # mysqld_exporter port
    metrics_path: '/metrics'
    scrape_interval: 15s

Grafana dashboard panels

  • Connections: Threads_connected / max_connections
  • Query performance: queries per second, Slow_queries
  • InnoDB health: buffer pool hit rate, row lock time
  • Replication lag: Seconds_Behind_Master

8.2 Alert Rules

Prometheus alert rules

groups:
  - name: mysql_alerts
    rules:
      - alert: MySQLHighConnections
        expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "MySQL connection count too high"
          description: "MySQL connection utilization above 80%, current value: {{ $value }}"
      
      - alert: MySQLSlowQueries
        expr: rate(mysql_global_status_slow_queries[5m]) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "MySQL slow queries increasing"
          description: "Slow query rate above 10/s over the past 5 minutes"
      
      - alert: MySQLReplicationLag
        expr: mysql_slave_lag_seconds > 30
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "MySQL replication lag too high"
          description: "Replica lag above 30 seconds"

9. Case Study: High-Concurrency Optimization of an E-commerce System

9.1 Scenario

During a major sales event, an e-commerce platform faced the following challenges:

  • Peak QPS: 10,000
  • Concurrent order placement: 5,000 TPS
  • Inventory updates: hot products updated at high frequency
  • Query load: product detail pages under heavy concurrent reads

9.2 Optimization Plan

1. Database architecture

-- Shard the orders table (hash on user ID)
-- Keep inventory in its own table, dual-written to Redis and MySQL
-- Serve product detail data from cache

2. Application layer

// Inventory deduction service
@Service
public class StockService {
    
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    @Autowired
    private JdbcTemplate jdbcTemplate;
    
// Pre-deduct stock in Redis
    public boolean deductStock(Long productId, Integer quantity) {
        String key = "stock:" + productId;
        
// A Lua script keeps the check-and-decrement atomic
        String luaScript = 
            "if redis.call('exists', KEYS[1]) == 1 then " +
            "   local stock = tonumber(redis.call('get', KEYS[1])); " +
            "   if stock >= tonumber(ARGV[1]) then " +
            "       redis.call('decrby', KEYS[1], ARGV[1]); " +
            "       return 1; " +
            "   end; " +
            "end; " +
            "return 0;";
        
        Long result = redisTemplate.execute(
            new DefaultRedisScript<>(luaScript, Long.class),
            Collections.singletonList(key),
            quantity.toString()
        );
        
        if (result != null && result == 1L) {
            // asynchronously flush the change to MySQL
            asyncUpdateDatabase(productId, quantity);
            return true;
        }
        return false;
    }
    
    @Async
    public void asyncUpdateDatabase(Long productId, Integer quantity) {
        // Update the database off the hot path. Note: @Async only applies through the
        // Spring proxy, so the self-invocation above bypasses it; in practice this
        // method belongs on a separate bean.
        jdbcTemplate.update(
            "UPDATE products SET stock = stock - ? WHERE product_id = ? AND stock >= ?",
            quantity, productId, quantity
        );
    }
}

3. Caching

// Multi-level cache architecture
public class ProductCacheService {
    
    // Local cache (Caffeine)
    private final Cache<Long, Product> localCache = Caffeine.newBuilder()
        .maximumSize(10000)
        .expireAfterWrite(5, TimeUnit.SECONDS)
        .build();
    
    // Redis cache
    @Autowired
    private RedisTemplate<String, String> redisTemplate;
    
    // Database
    @Autowired
    private ProductRepository productRepository;
    
    public Product getProduct(Long productId) {
        // 1. Local cache
        Product product = localCache.getIfPresent(productId);
        if (product != null) {
            return product;
        }
        
        // 2. Redis cache
        String redisKey = "product:" + productId;
        String cached = redisTemplate.opsForValue().get(redisKey);
        if (cached != null) {
            product = JSON.parseObject(cached, Product.class);
            localCache.put(productId, product);
            return product;
        }
        
        // 3. Database
        product = productRepository.findById(productId);
        if (product != null) {
            // Populate Redis with a short TTL
            redisTemplate.opsForValue().set(redisKey, JSON.toJSONString(product), 30, TimeUnit.SECONDS);
            // Populate the local cache
            localCache.put(productId, product);
        }
        
        return product;
    }
}

9.3 Load Testing and Tuning

JMeter test plan

<!-- test-plan.jmx -->
<TestPlan>
  <ThreadGroup guiclass="ThreadGroupGui" testclass="ThreadGroup" testname="Order placement concurrency test" enabled="true">
    <stringProp name="ThreadGroup.num_threads">1000</stringProp>
    <stringProp name="ThreadGroup.ramp_time">60</stringProp>
    <stringProp name="ThreadGroup.duration">300</stringProp>
  </ThreadGroup>
  
  <HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy" testname="Create order request" enabled="true">
    <stringProp name="HTTPSampler.domain">api.example.com</stringProp>
    <stringProp name="HTTPSampler.port">8080</stringProp>
    <stringProp name="HTTPSampler.path">/api/order/create</stringProp>
    <stringProp name="HTTPSampler.method">POST</stringProp>
    <stringProp name="HTTPSampler.postBody">{"productId":123,"quantity":1,"userId":456}</stringProp>
  </HTTPSamplerProxy>
</TestPlan>

Load test results

  • QPS: up from 2,000 before optimization to 8,000
  • Average response time: down from 500ms to 100ms
  • Error rate: down from 5% to 0.1%
  • Database CPU: down from 95% to 60%

10. Summary and Best Practices

10.1 Optimization Priorities

  1. Index optimization: low cost, big payoff
  2. Query optimization: avoid full table scans, move less data
  3. Connection pool tuning: size the pool sensibly
  4. Caching: take load off the database
  5. Architecture: read/write splitting, sharding
  6. Server tuning: OS parameter adjustments

10.2 High-Concurrency Optimization Checklist

  • [ ] Are the indexes sensible? Do they cover the hot queries?
  • [ ] Is the connection pool sized appropriately?
  • [ ] Is the cache hit rate above 95%?
  • [ ] Have the slow queries been optimized?
  • [ ] Are there long-running transactions?
  • [ ] Are monitoring and alerting in place?
  • [ ] Has the system passed load testing?
  • [ ] Is a rollback plan ready?

10.3 Ongoing Optimization

  1. Regular reviews: audit the slow query log monthly
  2. Capacity planning: scale ahead of business growth
  3. Technology watch: track new MySQL releases and features
  4. Team training: grow the team's database tuning skills

Applied together, these strategies can substantially improve MySQL's performance and stability under high concurrency. Remember that optimization is a continuous process: strategies need to be revisited as the business and the technology evolve.