一、slave延迟的分析方法:
1、查看slave 库日志的应用位置Relay_Master_Log_File
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: xxxxxxxxxxx
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000271
Read_Master_Log_Pos: 256299454
Relay_Log_File: cnsz20pl0110-relay-bin.000361
Relay_Log_Pos: 387814618
Relay_Master_Log_File: mysql-bin.000267
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 387814455
Relay_Log_Space: 3602530484
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 11006
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 4b343f53-1a79-11e5-ba27-a0369f5d645a
Master_Info_File: /mysql_data/dvp/data/dvp/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Reading event from the relay log
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)
2、查看slave 库是否有大事务(如下输出说明有一个事物锁定了520W 行的记录)
mysql> select trx_mysql_thread_id,trx_weight from information_schema.innodb_trx;
+---------------------+------------+
| trx_mysql_thread_id | trx_weight |
+---------------------+------------+
| 2142 | 5215374 |
+---------------------+------------+
3、到出主库上去找到对应的binlog 使用 mysqlbinlog 分析日志
mysqlbinlog -vvv mysql-bin.000267 >> /mysql_data/finstr/log/dvp0267.log
4、分析上一步解析出来日志的DML 操作次数,找出那个表的DML操作最多
cat /mysql_data/finstr/log/dvp0267.log |grep DELETE |sort |uniq –c
190 ### DELETE FROM `dvp`.`rpt_cc_dm_kpi_for_month`
6723464 ### DELETE FROM `dvp`.`rpt_rz_shipment_detail_d`
cat /mysql_data/finstr/log/dvp0267.log |grep UPDATE |sort |uniq –c
cnsz17pl0198:dvp > cat /mysql_data/finstr/log/dvp0267.log |grep INSERT |sort |uniq -c
587275 ### INSERT INTO `dvp`.`rpt_bil_cbil_cust_cp`
251 ### INSERT INTO `dvp`.`rpt_cc_dm_kpi_for_month`
400 ### INSERT INTO `dvp`.`rpt_cc_dm_storage_dispatch_count`
172 ### INSERT INTO `dvp`.`rpt_cc_dw_inventory_not_intime_detail`
1118 ### INSERT INTO `dvp`.`rpt_cc_dw_receipt_not_intime_detail`
377228 ### INSERT INTO `dvp`.`rpt_cc_dw_shipment_not_intime_detail`
30804 ### INSERT INTO `dvp`.`rpt_cw_collect_record_analysis`
189058 ### INSERT INTO `dvp`.`rpt_cw_oms_exception_monitoring`
918 ### INSERT INTO `dvp`.`rpt_yy_tb_city_monitor`
11359 ### INSERT INTO `dvp`.`rpt_yy_tb_fly_monitor`
345257 ### INSERT INTO `dvp`.`rpt_yy_tb_life_cycle_monitor`
5、可以看到`dvp`.`rpt_rz_shipment_detail_d` 的DELETE 操作有670W 行,和步骤2所锁定的行数相进,所以怀疑是这个表的操作导致的延迟
6、分析该表上是否有主键或者合适的索引,(如果没有主键或者合适的索引,这个表的删除操作就是进行670W 次的全表扫描,所以才会导致slave延长)后边的思路就和oracle是一样的了。
mysql> show create table rpt_rz_shipment_detail_d \G
*************************** 1. row ***************************
Table: rpt_rz_shipment_detail_d
Create Table: CREATE TABLE `rpt_rz_shipment_detail_d` (
`DATA_DATE` varchar(50) DEFAULT NULL,.....
1 row in set (0.00 sec)
7、可以从slow 中去尝试能否找到对应的SQL
二、mysql log 突然暴涨的分析思路(和分析延迟的方法差不多)
1、到出主库上去找到对应的binlog 使用 mysqlbinlog 分析日志
mysqlbinlog -vvv mysql-bin.000267 >> /mysql_data/finstr/log/dvp0267.log
2、分析上一步解析出来日志的DML 操作次数,找出那个表的DML操作最多
cat /mysql_data/finstr/log/dvp0267.log |grep DELETE |sort |uniq –c
190 ### DELETE FROM `dvp`.`rpt_cc_dm_kpi_for_month`
6723464 ### DELETE FROM
`dvp`.`rpt_rz_shipment_detail_d`
cat /mysql_data/finstr/log/dvp0267.log |grep UPDATE |sort |uniq –c
cnsz17pl0198:dvp > cat /mysql_data/finstr/log/dvp0267.log |grep INSERT |sort |uniq -c
587275 ### INSERT INTO `dvp`.`rpt_bil_cbil_cust_cp`
251 ### INSERT INTO `dvp`.`rpt_cc_dm_kpi_for_month`
400 ### INSERT INTO `dvp`.`rpt_cc_dm_storage_dispatch_count`
172 ### INSERT INTO `dvp`.`rpt_cc_dw_inventory_not_intime_detail`
1118 ### INSERT INTO `dvp`.`rpt_cc_dw_receipt_not_intime_detail`
377228 ### INSERT INTO `dvp`.`rpt_cc_dw_shipment_not_intime_detail`
30804 ### INSERT INTO `dvp`.`rpt_cw_collect_record_analysis`
189058 ### INSERT INTO `dvp`.`rpt_cw_oms_exception_monitoring`
918 ### INSERT INTO `dvp`.`rpt_yy_tb_city_monitor`
11359 ### INSERT INTO `dvp`.`rpt_yy_tb_fly_monitor`
345257 ### INSERT INTO `dvp`.`rpt_yy_tb_life_cycle_monitor`
3、尝试在slow log 中能否找到对应的SQL、查看SQL 的操作是否合理(如存在全表的delete)
4、找服务中心及研发查明原因,看是否可以修改。