你的位置:首页 > 数据库

[数据库]index merge的一次优化


手机微博4040端口SQL优化

现象

某端口常态化延迟,通过使用pt-query-digest发现主要由于一条count(*)语句引发,具体如下:

# 13.5s user time, 40ms system time, 21.58M rss, 156.84M vsz# Current date: Fri Apr 1 17:43:05 2016# Hostname: naga64# Files: /data1/mysql4040/slow.log# Overall: 45.87k total, 53 unique, 1.01 QPS, 9.05x concurrency __________# Time range: 2016-04-01 05:05:02 to 17:43:05# Attribute     total   min   max   avg   95% stddev median# ============   ======= ======= ======= ======= ======= ======= =======# Exec time    411622s   1s  238s   9s   29s   13s   6s# Lock time      70s    0   4s   2ms  138us  57ms  76us# Rows sent     12.66M    0  1.31M 289.43  19.46 13.90k  0.99# Rows examine   310.43M    0  5.40M  6.93k 31.59k 65.56k  0.99# Query size     5.89M   17  4.14k 134.67 563.87 150.53  76.28 # Profile# Rank Query ID      Response time   Calls R/Call Apdx V/M  Item# ==== ================== ================= ===== ======= ==== ===== =====#  1 0xE74340EE1DEFEC99 317229.0380 77.1% 34627 9.1613 0.11 12.60 SELECT user_rec_?#  2 0xB9959C570826EFA4 72164.9508 17.5% 3746 19.2645 0.15 36.13 SELECT app#  3 0xECEF2B7CA2BE445C  7136.5824 1.7% 3581 1.9929 0.53 2.75 SELECT user_rec_?#  4 0x7B9529D6435F23B3  3465.0381 0.8%  137 25.2922 0.16 33.53 SELECT app#  5 0x270C8D7D3EC37561  2209.2050 0.5% 1087 2.0324 0.51 2.34 SELECT apk#  6 0x6AF45A776EDFF7A9  1921.4956 0.5%  905 2.1232 0.50 2.63 SELECT apk#  7 0x67DC38C9C5F7EEBB  1816.0314 0.4%  108 16.8151 0.08 7.32 SELECT ios_apk#  8 0x5F7E7D2BFA8FB79B  1388.2303 0.3%  518 2.6800 0.49 10.45 SELECT apk cooper#  9 0x79F2C2072394C9BB  1005.4780 0.2%  656 1.5327 0.59 1.64 SELECT user_rec_?b#  10 0x3229403E99601A69  632.3939 0.2%  81 7.8073 0.07 1.07 SELECT ios_app#  11 0x83D4C6B0BB535E12  506.5923 0.1%  15 33.7728 0.10 11.12 SELECT apk#  13 0x2F002402DBB98EE9  226.3586 0.1%  73 3.1008 0.42 4.04 SELECT app#  14 0x992F97D6C4D52DF6  219.2329 0.1%  44 4.9826 0.19 2.00 SHOW STATUS#  16 0x791C5370A1021F19  140.2855 0.0%  30 4.6762 0.25 1.87 SHOW SLAVE STATUS#  18 0x2F27EBCFABB23992  110.6802 0.0%  36 3.0744 0.40 2.47 SELECT app_recommend app#  19 0x980736573219087A  108.8593 0.0%  15 7.2573 0.00 0.45 SELECT ios_app_free ios_app#  20 0x58492BB2C89253D8   71.5322 0.0%  10 7.1532 0.05 0.57 SELECT ios_app_free ios_app#  21 0x0EB86D9E4630253A   61.5251 0.0%  27 2.2787 0.52 0.33 SELECT ios_app_recommend ios_app#  22 0x398799E91C3C2AAD   59.5222 0.0%  12 4.9602 0.33 3.46 SELECT apk cooper#  24 0x53148D850C2E022E   45.0953 0.0%  11 4.0996 0.23 1.04 SELECT ios_app#  25 0x07387FA6467B3DB9   34.6657 0.0%  17 2.0392 0.50 0.39 SELECT app_recommend app#  26 0xBD799CC975081065   31.1719 0.0%  16 1.9482 0.47 0.51 SELECT app#  27 0xB7F06103A7ADA5C0   30.4686 0.0%  13 2.3437 0.42 0.52 SELECT user_rec_?d#  30 0x188747BC3CB9728B   19.8929 0.0%  12 1.6577 0.58 0.22 SELECT app_recommend app# MISC 0xMISC        987.4775 0.2%  92 10.7335  NS  0.0 <29 ITEMS> # Query 1: 0.76 QPS, 6.97x concurrency, ID 0xE74340EE1DEFEC99 at byte 2753434# This item is included in the report because it matches --limit.# Scores: Apdex = 0.11 [1.0], V/M = 12.60# Query_time sparkline: |   ^_|# Time range: 2016-04-01 05:05:02 to 17:43:04# Attribute  pct  total   min   max   avg   95% stddev median# ============ === ======= ======= ======= ======= ======= ======= =======# Count     75  34627# Exec time   77 317229s   1s  174s   9s   23s   11s   7s# Lock time   55   39s  46us   3s   1ms  119us  46ms  73us# Rows sent   0 31.80k    0    1  0.94  0.99  0.23  0.99# Rows examine  0 22.97k    0    5  0.68  0.99  0.55  0.99# Query size  44  2.61M   76   79  79.00  76.28  0.02  76.28# String:# Databases  apps# Hosts# Users    apps_r# Query_time distribution#  1us# 10us# 100us#  1ms# 10ms# 100ms#  1s ################################################################# 10s+ ######################## Tables#  SHOW TABLE STATUS FROM `apps` LIKE 'user_rec_07'\G#  SHOW CREATE TABLE `apps`.`user_rec_07`\G# EXPLAIN /*!50100 PARTITIONS*/select count(*) as total from user_rec_07 where type=5 and weiboId='1934676487'\G

 

我们来查看一下这个表的表结构和这条语句的explain结果,看是否可以优化,具体如下:

 

localhost.apps>show create table user_rec_45;+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+| Table    | Create Table                                                                                                                                                                                                                                                                                                 |+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+| user_rec_45 | CREATE TABLE `user_rec_45` ( `id` int(11) NOT NULL AUTO_INCREMENT, `softId` int(11) NOT NULL DEFAULT '0', `weiboId` bigint(20) NOT NULL DEFAULT '0', `type` tinyint(4) NOT NULL DEFAULT '0' COMMENT '0???', `content` varchar(512) NOT NULL DEFAULT '' COMMENT '???????url??????????????', `ctime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (`id`), KEY `idx_softId_weiboId` (`softId`,`weiboId`), KEY `idx_weiboId` (`weiboId`), KEY `idx_type` (`type`)) ENGINE=TokuDB AUTO_INCREMENT=3252283 DEFAULT CHARSET=utf8 ROW_FORMAT=TOKUDB_LZMA |+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+1 row in set (0.00 sec)localhost.apps>explain select count(*) as total from user_rec_07 where type=5 and weiboId=1934676487\G;*************************** 1. row ***************************      id: 1 select_type: SIMPLE    table: user_rec_07     type: index_mergepossible_keys: idx_weiboId,idx_type     key: idx_weiboId,idx_type   key_len: 8,1     ref: NULL     rows: 1    Extra: Using intersect(idx_weiboId,idx_type); Using where; Using index1 row in set (0.01 sec)

 

可以看到通过type和extra都可以发现其实是用到了index的,但是为这么还会这么慢呢?

ps:一开始看到是tokuDB的引擎,下意识的以为是tk对count()支持不好,后来实践证明,还是index的问题。

推理

这条sql的查询条件还是相当简单的,仅为2个等式,根据个人的习惯,我会先看下这2个等值条件的结果集分别是多大?

首先是weiboID的explain:

localhost.apps>explain select count(*) as total from user_rec_07 where weiboId=1934676487\G;*************************** 1. row ***************************      id: 1 select_type: SIMPLE    table: user_rec_07     type: refpossible_keys: idx_weiboId     key: idx_weiboId   key_len: 8     ref: const     rows: 18    Extra: Using index1 row in set (0.00 sec)接下来是type的explain:localhost.apps>explain select count(*) as total from user_rec_07 where type=5\G;*************************** 1. row ***************************      id: 1 select_type: SIMPLE    table: user_rec_07     type: refpossible_keys: idx_type     key: idx_type   key_len: 1     ref: const     rows: 114834    Extra: Using index1 row in set (0.00 sec)可以很明显的看到weiboID的区分度还是很好的,而type的就差很多了(需要扫描将近12w rows),但是理论上使用weiboID作为index只需要扫描18 rows左右,按说查询时间应该在5ms之内才对。

 

我们分别看下3条sql的查询时间:

2个条件:

 

localhost.apps>select count(*) as total from user_rec_45 where type=5 and weiboId='2717608261';+-------+| total |+-------+|   1 |+-------+1 row in set (0.57 sec)

 

 

weiboID作为条件:

localhost.apps>select count(*) as total from user_rec_45 where weiboId='2717608261'\G;*************************** 1. row ***************************total: 91 row in set (0.00 sec)

 

 

type作为条件:

localhost.apps>select count(*) as total from user_rec_45 where type=5\G;*************************** 1. row ***************************total: 1038381 row in set (0.19 sec)

 

可以从上面明显的看出来双条件耗时最多570ms,weiboID作为条件0ms,type作为条件190ms

根据以上的结果,我们就可以进行index的优化了。

优化

添加index的思路非常的简单,直接加一个两条件的index即可,具体SQL如下:

localhost.apps>alter table user_rec_45 drop index idx_weiboID,add index idx_weiboID_type(weiboID,type);

我们看下添加前和添加之后的区别:

添加前:localhost.apps>select count(*) as total from user_rec_45 where type=5 and weiboId='2717608261';+-------+| total |+-------+|   1 |+-------+1 row in set (0.57 sec) 添加后:localhost.apps>select count(*) as total from user_rec_45 where type=5 and weiboId='2717608261';+-------+| total |+-------+|   1 |+-------+1 row in set (0.00 sec)

 

可以看到效果非常的明显。

从服务器的负载看下:

修改之前:07:42:42 PM CPU  %usr  %nice  %sys %iowait  %irq  %soft %steal %guest  %idle07:42:43 PM all  96.00  0.00  3.38  0.00  0.00  0.62  0.00  0.00  0.0007:42:43 PM  0  91.00  0.00  5.00  0.00  0.00  4.00  0.00  0.00  0.0007:42:43 PM  1  97.98  0.00  2.02  0.00  0.00  0.00  0.00  0.00  0.0007:42:43 PM  2  98.00  0.00  2.00  0.00  0.00  0.00  0.00  0.00  0.0007:42:43 PM  3  96.00  0.00  4.00  0.00  0.00  0.00  0.00  0.00  0.0007:42:43 PM  4  95.96  0.00  3.03  0.00  0.00  1.01  0.00  0.00  0.0007:42:43 PM  5  96.00  0.00  4.00  0.00  0.00  0.00  0.00  0.00  0.0007:42:43 PM  6  97.00  0.00  3.00  0.00  0.00  0.00  0.00  0.00  0.0007:42:43 PM  7  97.00  0.00  3.00  0.00  0.00  0.00  0.00  0.00  0.00 修改之后:07:42:23 PM CPU  %usr  %nice  %sys %iowait  %irq  %soft %steal %guest  %idle07:42:24 PM all  24.25  0.00  1.12  3.50  0.00  0.12  0.00  0.00  71.0007:42:24 PM  0  16.16  0.00  2.02  18.18  0.00  1.01  0.00  0.00  62.6307:42:24 PM  1  3.03  0.00  0.00  6.06  0.00  0.00  0.00  0.00  90.9107:42:24 PM  2  90.00  0.00  0.00  1.00  0.00  0.00  0.00  0.00  9.0007:42:24 PM  3  84.00  0.00  6.00  2.00  0.00  0.00  0.00  0.00  8.0007:42:24 PM  4  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 100.0007:42:24 PM  5  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 100.0007:42:24 PM  6  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 100.0007:42:24 PM  7  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 100.00

但是为什么会这样呢? 细心的同学应该发现了,之前其实MySQL也使用了2个索引,只不过是使用的index merge,将两个单独的index合并在一起使用了,为什么差距会这么大呢?

分析

我们首先来看下index merge也就是 index intersect(indx1,index2)的定义

index_merge: This join type indicates that the Index Merge optimization is used. In this case, the key column in the output row contains a list of indexes used, and key_len contains a list of the longest key parts for the indexes used.The Index Merge method is used to retrieve rows with several range scans and to merge their results into one. The merge can produce unions, intersections, or unions-of-intersections of its underlying scans. This access method merges index scans from a single table; it does not merge scans across multiple tables.

从上面的解释我们可以看出来,index merge其实就是分别通过对两个独立的index进行过滤之后,将过滤之后的结果聚合在一起,然后在返回结果集。

在我们的这个例子中,由于type字段的过滤性不好,故返回的rows依然很多,所以造成的很多的磁盘read,导致了cpu的负载非常的高,直接就出现了延迟。

ps:其实在这个case中,并不需要加2个条件的index,只需要将type这个index干掉,直接使用weiboID这个index即可,毕竟这个index的过滤的结果集已经很小了。

或者通过关闭index intersect功能也可以。

SET [GLOBAL|SESSION] optimizer_switch="index_merge_intersection=off";

展示一下优化前后的io吞吐:

优化前----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--usr sys idl wai hiq siq| read writ| recv send| in  out | int  csw 10  1 85  4  0  0|3842k 3440k|  0   0 |  0   0 | 629 3275 71  4 14 11  0  0| 26M 2593k| 69k  47k|  0   0 | 31k 6920 72  4 11 13  0  0| 26M 3258k| 79k  47k|  0   0 | 27k 6776 69  3 14 13  0  0| 24M  12M| 56k  37k|  0   0 | 21k 7136 76  4  7 13  0  0| 27M 2523k| 56k  20k|  0   0 | 16k 7191 73  4 14 10  0  0| 25M 2199k| 102k  43k|  0   0 优化后 ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--usr sys idl wai hiq siq| read writ| recv send| in  out | int  csw 10  1 85  3  0  0|2935k 3362k|  0   0 |  0   0 | 646 2939 30  3 61  6  0  0|4313k 4330k| 129k 353k|  0   0 |8639 3288 32  3 59  5  0  0|4242k 3424k| 138k 392k|  0   0 | 11k 5410 31  2 62  5  0  0|4441k 3840k| 169k 397k|  0   0 |7913 3670 31  1 62  6  0  0|3720k 9161k| 135k 398k|  0   0 |7265 4236 32  1 61  5  0  0|4567k 3569k| 139k 368k|  0   0 |6846 4633 31  2 61  6  0  0|3972k 4199k| 135k 341k|  0   0 |9840 4845