MSYQL 版本:
mysql> select version()\G
*************************** 1. row ***************************
version(): 5.6.27-log
1 row in set (0.00 sec)
1. 测试数据如下
mysql> select * from t1;
+------+------+
| id | name |
+------+------+
| 15 | a |
| 10 | b |
| 6 | c |
| 10 | d |
| 11 | f |
| 2 | zz |
+------+------+
6 rows in set (0.00 sec)
表中只有6条数据都存储在同一个数据块中。
mysql> show index from t1\G
*************************** 1. row ***************************
Table: t1
Non_unique: 0
Key_name: PRIMARY
Seq_in_index: 1
Column_name: name
Collation: A
Cardinality: 5
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
*************************** 2. row ***************************
Table: t1
Non_unique: 1
Key_name: idx_t1_id # 该索引是非唯一索引
Seq_in_index: 1
Column_name: id
Collation: A
Cardinality: 5
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
Index_comment:
2 rows in set (0.00 sec)
2. 创建用来测试的存储过程
DELIMITER $$
USE `test`$$
DROP PROCEDURE IF EXISTS `p7`$$
CREATE DEFINER=`root`@`localhost` PROCEDURE `p7`()
BEGIN
DECLARE i INT;
SET i = 1;
WHILE i < 10000000000 DO
UPDATE t1 SET id=15 WHERE id=15;
commit;
SET i = i +1;
END WHILE;
END$$
DELIMITER ;
DELIMITER $$
USE `test`$$
DROP PROCEDURE IF EXISTS `p8`$$
CREATE DEFINER=`root`@`localhost` PROCEDURE `p8`()
BEGIN
DECLARE i INT;
SET i = 1;
WHILE i < 10000000000 DO
UPDATE t1 SET id=15 WHERE id=15;
commit;
SET i = i +1;
END WHILE;
END$$
DELIMITER ;
DELIMITER $$
USE `test`$$
DROP PROCEDURE IF EXISTS `p9`$$
CREATE DEFINER=`root`@`localhost` PROCEDURE `p9`()
BEGIN
DECLARE i INT;
SET i = 1;
WHILE i < 10000000000 DO
UPDATE t1 SET id=15 WHERE id=15;
commit;
SET i = i +1;
END WHILE;
END$$
DELIMITER ;
3. 分别在三个会话中调用存储过程
会话一调用 存储过程 p4
mysql> call p7();
会话二调用存储过程 p5
mysql > call p8();
会话三调用存储过程 p6
mysql > call p9();
4. 运行 pt-stalk 采集性能数据
bin/pt-stalk --collect-oprofile TRUE --dest /root/olm/pt-stalk-data-12-07 --no-stalk --run-time 300 --iterations 1 --sleep=3
5. 分析 pt-stalk 采集到的性能数据
从收集到的 TOP 输出的数据我们看到用户空间使用的CPU是22.1%,内核空间使用的CPU 有55.2%(非常高);然后我们再看下具体进程使用的CPU,看到 MYSQL 使用了149%,测试的服务器只有2个逻辑CPU,MYSQLD 进程使用了149% 的CPU,也就是说 CPU都是MSYQLD消耗的,结合CPU使用率汇总信息中用户空间22.1%、内核空间CPU 使用率 55.2%,可以判断内核这么高的CPU使用率也是 MYSQLD进行系统调用消耗的。
上图中是 oprofile 采集的MYSQLD 线程堆栈的报告,按 CPU 时间排序。
mutex_spin_wait函数是 MSYQL 线程等待 mutex 在CPU上自旋。
ut_delay 函数是在CPU上自旋时进行空转。该函数消耗MSYQLD总执行时间的18.4%排在首位,不过oprofile报告无法看出是什么函数(比如锁操作函数或缓冲池页面操作函数)调用了ut_delay。
pt-stalk 启动时采集的mutext 状态信息,内容如下:
Type Name Status
InnoDB trx0trx.cc:108 os_waits=10745
InnoDB trx0trx.cc:108 os_waits=8446
InnoDB trx0trx.cc:108 os_waits=7268
InnoDB trx0sys.cc:597 os_waits=353
InnoDB lock0lock.cc:614 os_waits=2699035
InnoDB lock0lock.cc:611 os_waits=42172
InnoDB log0log.cc:844 os_waits=1
InnoDB buf0buf.cc:1334 os_waits=5
InnoDB fil0fil.cc:1690 os_waits=1
InnoDB log0log.cc:907 os_waits=904
InnoDB trx0i_s.cc:1439 os_waits=3
InnoDB combined buf0buf.cc:1069 os_waits=134
pt-stalk 结束时采集的mutext 状态信息,内容如下:
Type Name Status
InnoDB trx0trx.cc:108 os_waits=13256
InnoDB trx0trx.cc:108 os_waits=10133
InnoDB trx0trx.cc:108 os_waits=9260
InnoDB trx0sys.cc:597 os_waits=439
InnoDB lock0lock.cc:614 os_waits=3580728 # 开始采集的值是 2699035,现在是3580728,5分钟增加了 88.196 万。
InnoDB lock0lock.cc:611 os_waits=53356
InnoDB log0log.cc:844 os_waits=1
InnoDB buf0buf.cc:1334 os_waits=5
InnoDB fil0fil.cc:1690 os_waits=1
InnoDB log0log.cc:907 os_waits=1206
InnoDB trx0i_s.cc:1439 os_waits=3
InnoDB combined buf0buf.cc:1069 os_waits=163
mutext status 数据os_waits最多的是 lock0lock.cc(保护锁对象的MUTXT)争用,os_waits 是 MYSQL线程通过自旋没有获得 mutext 而进入了睡眠状态。
TS 1481942502.009370813 2016-12-17 10:41:42
*************************** 1. row ***************************
Id: 1
User: root
Host: localhost
db: test
Command: Query
Time: 0
State: Searching rows for update
Info: UPDATE t1 SET id=15 WHERE id=15
*************************** 2. row ***************************
Id: 2
User: root
Host: localhost
db: test
Command: Query
Time: 0
State: Searching rows for update
Info: UPDATE t1 SET id=15 WHERE id=15
*************************** 3. row ***************************
Id: 3
User: root
Host: localhost
db: test
Command: Query
Time: 0
State: Searching rows for update
Info: UPDATE t1 SET id=15 WHERE id=15
*************************** 4. row ***************************
Id: 3643
User: root
Host: localhost
db: NULL
Command: Sleep
Time: 1
State:
Info: NULL
*************************** 5. row ***************************
Id: 3659
User: root
Host: localhost
db: NULL
Command: Query
Time: 0
State: init
Info: SHOW FULL PROCESSLIST
上面是 pt-stalk 采集的MYSQL线程状态,出现 mutext争用MYSQL CPU使用率很高时,MYSQL 在循环执行UPDATE 操作。
pt-stalk 采集的锁等待信息,内容如下:
...
*************************** 1. row ***************************
who_blocks: thread 1 from localhost
idle_in_trx: 0
max_wait_time: 0
num_waiters: 2
*************************** 2. row ***************************
who_blocks: thread 2 from localhost
idle_in_trx: 0
max_wait_time: 0
num_waiters: 1
*************************** 1. row ***************************
waiting_trx_id: 211608264
waiting_thread: 3
wait_time: 0
waiting_query: UPDATE t1 SET id=15 WHERE id=15
waiting_table_lock: `test`.`t1`
blocking_trx_id: 211608262
blocking_thread: 1
blocking_host:
blocking_port: localhost
idle_in_trx: 0
blocking_query: UPDATE t1 SET id=15 WHERE id=15
*************************** 2. row ***************************
waiting_trx_id: 211608263
waiting_thread: 2
wait_time: 0
waiting_query: UPDATE t1 SET id=15 WHERE id=15
waiting_table_lock: `test`.`t1`
blocking_trx_id: 211608262
blocking_thread: 1
blocking_host:
blocking_port: localhost
idle_in_trx: 0
blocking_query: UPDATE t1 SET id=15 WHERE id=15
*************************** 3. row ***************************
waiting_trx_id: 211608264
waiting_thread: 3
wait_time: 0
waiting_query: UPDATE t1 SET id=15 WHERE id=15
waiting_table_lock: `test`.`t1`
blocking_trx_id: 211608263
blocking_thread: 2
blocking_host:
blocking_port: localhost
idle_in_trx: 0
blocking_query: UPDATE t1 SET id=15 WHERE id=15
...
从上面的信息我们看到出现 CPU使用很高的情况时(内核空间CPU使用率也很高),有频繁的锁等待。
6. 使用 perf 采集MYSQLD 线程堆栈信息分析
perf record -e cpu-clock -g -p 2578
export LANG=zh_CN.GBK
perf report
_spin_unlock_irqrestore 占了采样期间内核CPU使用率的47.30%;retint_careful占了采样期间内核CPU使用率的20.85%。[kernel.kallsyms] 表示MYSQLD 调用了 LINUX 内核函数或宏。
_spin_unlock_irqrestore 宏的工作是释放LINUX 中的 MUTEX和启用CPU的本地中断并恢复local_irq_save() 所保存的中断状态。
local_irq_save() 禁止中断并中断状态保存到变量中。
retint_careful 是处理完中断准备离开中断处理程序时检查是否还有其他工作需要处理。如果有其他工作处理的情况下跳转到retint_careful执行。没有其他工作处理则首先调用swapgs命令在gs寄存器的内核与用户态值之间切换,为离开内核做准备。然后调用RESTORE_ARGS宏恢复之前SAVE_ARGS宏保存现场所压栈的那些寄存器。最后就是调用iretq离开中断;如果还有其他工作要处理首先检查是否需要重新调度,不需要重新调度就跳转到retint_signal执行。需要重新调度的话先开中断,然后调用schedule函数执行任务的调度。当调度的任务执行完之后,重新获得cpu时,跳转到retint_check处再次检查是否有工作需要处理。
接下来要理清是MYSQL的什么操作调用了 _spin_unlock_irqrestore 和 retint_careful。
perf report > perf_report_20161217.txt
该命令把进程调用堆栈生成树形结构的报告,报告如下:
# Overhead Command Shared Object
Symbol
# ........ ....... ................... .............................................................................................................................
..........................................................................................
#
47.30% mysqld [kernel.kallsyms] [k] _spin_unlock_irqrestore
|
--- _spin_unlock_irqrestore
|
|--97.97%-- pthread_cond_broadcast@@GLIBC_2.3.2
| |
| |--94.19%-- lock_wait_suspend_thread(que_thr_t*)
| | row_mysql_handle_errors(dberr_t*, trx_t*, que_thr_t*, trx_savept_t*)
| | row_search_for_mysql(unsigned char*, unsigned long, row_prebuilt_t*, unsigned long, unsigned long)
| | ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function)
| | handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)
| | handler::read_range_first(st_key_range const*, st_key_range const*, bool, bool)
| | handler::multi_range_read_next(char**)
| | DsMrr_impl::dsmrr_next(char**)
| | QUICK_RANGE_SELECT::get_next()
| | rr_quick(READ_RECORD*)
| | mysql_update(THD*, TABLE_LIST*, List&, List&, Item*, unsigned int, st_order*, unsigned long long, enum_duplicates, bo
ol, unsigned long long*, unsigned long long*)
| | mysql_execute_command(THD*)
| | sp_instr_stmt::exec_core(THD*, unsigned int*)
| | sp_lex_instr::reset_lex_and_exec_core(THD*, unsigned int*, bool)
| | sp_lex_instr::validate_lex_and_execute_core(THD*, unsigned int*, bool)
| | sp_instr_stmt::execute(THD*, unsigned int*)
| | sp_head::execute(THD*, bool)
| | sp_head::execute_procedure(THD*, List*)
| | mysql_execute_command(THD*)
| | mysql_parse(THD*, char*, unsigned int, Parser_state*)
| | dispatch_command(enum_server_command, THD*, char*, unsigned int)
| | do_command(THD*)
| | do_handle_one_connection(THD*)
| | handle_one_connection
| | pfs_spawn_thread
| | start_thread
| |
| |--4.42%-- lock_grant(ib_lock_t*)
| | lock_rec_dequeue_from_page(ib_lock_t*)
| | lock_trx_release_locks(trx_t*)
| | trx_commit_low(trx_t*, mtr_t*)
| | trx_commit_for_mysql(trx_t*)
| | innobase_commit(handlerton*, THD*, bool)
| | ha_commit_low(THD*, bool, bool)
| | MYSQL_BIN_LOG::commit(THD*, bool)
| | ha_commit_trans(THD*, bool, bool)
| | trans_commit_stmt(THD*)
| | mysql_execute_command(THD*)
| | sp_instr_stmt::exec_core(THD*, unsigned int*)
| | sp_lex_instr::reset_lex_and_exec_core(THD*, unsigned int*, bool)
| | sp_lex_instr::validate_lex_and_execute_core(THD*, unsigned int*, bool)
| | sp_instr_stmt::execute(THD*, unsigned int*)
| | sp_head::execute(THD*, bool)
| | sp_head::execute_procedure(THD*, List*)
| | mysql_execute_command(THD*)
| | mysql_parse(THD*, char*, unsigned int, Parser_state*)
| | dispatch_command(enum_server_command, THD*, char*, unsigned int)
| | do_command(THD*)
| | do_handle_one_connection(THD*)
| | handle_one_connection
| | pfs_spawn_thread
| | start_thread
| |--4.42%-- lock_grant(ib_lock_t*)
| | lock_rec_dequeue_from_page(ib_lock_t*)
| | lock_trx_release_locks(trx_t*)
| | trx_commit_low(trx_t*, mtr_t*)
| | trx_commit_for_mysql(trx_t*)
| | innobase_commit(handlerton*, THD*, bool)
| | ha_commit_low(THD*, bool, bool)
| | MYSQL_BIN_LOG::commit(THD*, bool)
| | ha_commit_trans(THD*, bool, bool)
| | trans_commit_stmt(THD*)
| | mysql_execute_command(THD*)
| | sp_instr_stmt::exec_core(THD*, unsigned int*)
| | sp_lex_instr::reset_lex_and_exec_core(THD*, unsigned int*, bool)
| | sp_lex_instr::validate_lex_and_execute_core(THD*, unsigned int*, bool)
| | sp_instr_stmt::execute(THD*, unsigned int*)
| | sp_head::execute(THD*, bool)
| | sp_head::execute_procedure(THD*, List*)
| | mysql_execute_command(THD*)
| | mysql_parse(THD*, char*, unsigned int, Parser_state*)
| | dispatch_command(enum_server_command, THD*, char*, unsigned int)
| | do_command(THD*)
| | do_handle_one_connection(THD*)
| | handle_one_connection
| | pfs_spawn_thread
| | start_thread
_spin_unlock_irqrestore 的 CPU 时间的 94.19% 是 MYSQL 的 lock_wait_suspend_thread 函数消耗了,还有4.42%是MSYQL 的 lock_grant 函数消耗的
, lock_wait_suspend_thread 实现的是MYSQL 锁等待的逻辑,lock_grant的作用是获得锁。调用关系是lock_wait_suspend_thread -> pthread_cond_broadcast -> _spin_unlock_irqrestore ;lock_grant -> pthread_cond_broadcast -> _spin_unlock_irqrestore。
一个线程在使用完 MUTEX 后释放 MUTEX 时 会调用pthread_cond_broadcast 函数唤醒在条件变量上等待 MUTEXT 的所有线程。
20.95% mysqld [kernel.kallsyms] [k] retint_careful
|
--- retint_careful
|
|--9.09%-- ut_delay(unsigned long)
|
|--3.01%-- buf_page_get_gen(unsigned long, unsigned long, unsigned long, unsigned long, buf_block_t*, unsigned long, char const*, unsigned long, mtr_t*
)
|
|--2.51%-- _int_malloc
|
|--1.91%-- cmp_dtuple_rec_with_match_low(dtuple_t const*, unsigned char const*, unsigned long const*, unsigned long, unsigned long*, unsigned long*)
|
|--1.77%-- btr_cur_search_to_nth_level(dict_index_t*, unsigned long, dtuple_t const*, unsigned long, unsigned long, btr_cur_t*, unsigned long, char con
st*, unsigned long, mtr_t*)
|
|--1.64%-- malloc
|
|--1.37%-- get_mm_parts(RANGE_OPT_PARAM*, Item_func*, Field*, Item_func::Functype, Item*, Item_result)
|
|--1.28%-- rec_get_offsets_func(unsigned char const*, dict_index_t const*, unsigned long*, unsigned long, mem_block_info_t**)
|
|--1.20%-- SQL_SELECT::test_quick_select(THD*, Bitmap<64u>, unsigned long long, unsigned long long, bool, st_order::enum_order)
|
|--1.20%-- mysql_update(THD*, TABLE_LIST*, List&, List&, Item*, unsigned int, st_order*, unsigned long long, enum_duplicates, bool, unsigne
d long long*, unsigned long long*)
|
|--1.15%-- __memset_sse2
|
|--1.11%-- page_cur_search_with_match(buf_block_t const*, dict_index_t const*, dtuple_t const*, unsigned long, unsigned long*, unsigned long*, unsigned
long*, unsigned long*, page_cur_t*)
|
|--0.98%-- sel_arg_range_seq_next(void*, st_key_multi_range*)
|
|--0.93%-- pthread_getspecific
|
|--0.93%-- ha_innobase::records_in_range(unsigned int, st_key_range*, st_key_range*)
|
|--0.92%-- memcpy
|
|--0.92%-- get_key_scans_params(PARAM*, SEL_TREE*, bool, bool, double)
|
|--0.82%-- THD::decide_logging_format(TABLE_LIST*)
|
|--0.75%-- alloc_root
|
|--0.72%-- check_quick_select(PARAM*, unsigned int, bool, SEL_ARG*, bool, unsigned int*, unsigned int*, Cost_estimate*)
|
|--0.59%-- optimize_cond(THD*, Item*, COND_EQUAL**, List*, bool, Item::cond_result*)
|
|--0.59%-- start_stage_v1
|
|--0.58%-- ha_innobase::extra(ha_extra_function)
|
|--0.55%-- build_equal_items_for_cond(THD*, Item*, COND_EQUAL*, bool)
|
|--0.55%-- row_search_for_mysql(unsigned char*, unsigned long, row_prebuilt_t*, unsigned long, unsigned long)
|
|--0.53%-- get_quick_keys(PARAM*, QUICK_RANGE_SELECT*, st_key_part*, SEL_ARG*, unsigned char*, unsigned int, unsigned char*, unsigned int)
|
|--0.53%-- start_table_lock_wait_v1
|
|--0.51%-- ha_innobase::innobase_get_index(unsigned int)
|
|--0.51%-- _int_free
|
|--0.51%-- handler::multi_range_read_info_const(unsigned int, st_range_seq_if*, void*, unsigned int, unsigned int*, unsigned int*, Cost_estimate*)
|
|--0.50%-- open_table(THD*, TABLE_LIST*, Open_table_context*)
--59.82%-- [...]
虽然 perf 报告中没有显示出MYSQL 锁等待调用 retint_careful ,但 lock_wait_suspend_thread 有做激活中断的操作,在激活中断的过程中就会调用
retint_careful 所以由此推断 lock_wait_suspend_thread 调用了 retint_careful,估计 --59.82%-- [...] 的意思是把这59.82%调用 retint_careful 的CPU时间
计算在 MYQLD 的进程中,而MYSQLD 进程主要的 CPU都是消耗在锁等待处理上,由此判断这59.82%的 retint_careful 调用时间,应该是
lock_wait_suspend_thread消耗的。
5.66% mysqld mysqld [.] ut_delay(unsigned long)
|
--- ut_delay(unsigned long)
在 perf 的调用关系图报告中没有显示其他函数调用 ut_delay ,MSYQL 的 ut_delay 函数是在 MSYQL等待 MUTEXT 时进行自旋操作的函数,在MYSQL
中所有等待 MUTEXT 的情况都会先调用 ut_delay 在用户态进行自旋,在通过用户态自旋无法获得 MUTEXT 后MYSQL才会调用 LINUX 内核函数在操作系统层进行自旋获取 MUTEXT,在前面我们看到 lock_wait_suspend_thread 消耗了 _spin_unlock_irqrestore 97%的CPU 时间做释放操作系统层 mutext的操作,由此判断 ut_delay 的CPU 时间主要是被 lock_wait_suspend_thread 消耗了。
结论:
通过分析确定MYSQLD 内核态 CPU 使用率很高的原因就是更新数据的SQL语句锁争用导致的。
mutext 获取的大致过程:
线程检查某 mutext 是否能够获取 --> 如果不能获取线自旋 --> 自旋MSYQL 系统设定次数后还无法获得 mutext MYSQL 线程 SLEEP --> MSYQL 线程 SLEEP 时有 LINUX 负责在LINUX 内核中进行自旋操作等待 mutext
LINUX 自旋时在 LINUX 内核空间消耗 CPU ,如果在LINUX 层面自旋等待 MUTEXT 的MYSQL 线程很多,或导致LINUX 内核空间CPU 使用率很高。当MYSQL 线程 SLEEP 由 LINUX 内核自旋等待 MUTEXT 在 MYSQL 对应的等待是 OS_WAITS 。
扩展阅读: