<![CDATA[向东博客专注WEB应用构架之美 --- 构架之美，在于尽态极妍

<![CDATA[向东博客专注WEB应用构架之美 --- 构架之美，在于尽态极妍 | 应用之美，在于药到病除]]> http://www.jackxiang.com/index.php zh-cn http://www.jackxiang.com/post// <![CDATA[多列索引与多个索引的对比]]> jack <xdy108@126.com> Wed, 19 Nov 2008 10:53:32 +0000 http://www.jackxiang.com/post//
对于要经常查询的含量大量数据的数据库，建立索引是非常重要的，建立索引一般都是在where语句用得较多的列上。现在有个问题，如果一个表有多个列需要建立索引，是把所有列建成一个索引，还是对每一个列建一个索引，上篇文章做了一个介绍，这是作者得出的结论，Conclusion: For benchmarked queries we can see Multiple Column index beats Index Merge in all cases when such index can be used. It is also worth to watchout a MySQL may decide not to do Index merge (either intersection or union) but instead do full table scan or access table picking only one index on the pair.意思应该是说对多个列建索引比对每个列分别建索引更有优势，而且要知道索引建立得越多就越占磁盘空间，在更新数据的时候速度会更慢。

这是一个多列索引的问题，这个问题是如何安排列的顺序是至关重要的，比如需要对一个表里面的两个字段uid, rstatus建一个索引，那么索引的顺序是（uid, rstatus)还是(rstatus, uid)呢。在搞清楚如何安排顺序之前先了解一个概念，cardinality:金山的翻译是"集的势"，比如，Mytest表有1700条记录，rstatus字段有750个不同的记录，那么就可以说We have a cardinality of 750 for rstatus。总的规则可以说是cardinality越大的字段应该排在索引的第一位就是说索引的位置是(rstatus, uid)，因为cardinality越大那么第一次取出来的记录集就越小，再进行第二次查询的次数就越少了。不过这只是对于建两个索引的规则，如果是三个以上就没有那么简单了，具体地看原文，有比较详细的例子。还需要提出的是即使我们建了一个很有效的索引，但是查询优化器也许会选择不用它，如果它会考虑更多因素以决定这个索引是否有足够的效率。

It was also pointed out to me, that even if an efficient multi-column index is created, the query optimizer may choose to never use it. This is
because the optimizer looks at further statistics to determine if the index would be efficient enough or not.

官方文档：http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html

老外的博客：

http://blog.decaresystems.ie/index.php/2007/05/21/how-to-create-a-successful-multi-column-index-from-first-principals/ 也有比较详细的介绍mysql如何使用联合索引的！

我试了一下将建立联合索引的顺序变化为KEY `u_r` (rstatus,`uid`) 出现：ref  key_len 都有变化，如下：

mysql> explain select * from mytest where uid in (1,2) and rstatus = 1;
+----+-------------+--------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table  | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | mytest | range | u_r           | u_r  | 8       | NULL |    2 | Using where |
+----+-------------+--------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.03 sec)

mysql> INSERT INTO `mytest` (`id`, `uid`, `rstatus`) VALUES(null, 3, 1);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `mytest` (`id`, `uid`, `rstatus`) VALUES(null, 4, 1);
Query OK, 1 row affected (0.00 sec)

mysql> explain select * from mytest where uid in (1,2) and rstatus = 1;
+----+-------------+--------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table  | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | mytest | range | u_r           | u_r  | 8       | NULL |    2 | Using where |
+----+-------------+--------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select * from mytest where uid in (3,4) and rstatus = 1;
+----+-------------+--------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table  | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+--------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | mytest | ref  | u_r           | u_r  | 4       | const |    1 | Using where |
+----+-------------+--------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.00 sec)

老外原文：

Conclusion: For benchmarked queries we can see Multiple Column index beats Index Merge in all cases when such index can be used. It is also worth to watchout a MySQL may decide not to do Index merge (either intersection or union) but instead do full table scan or access table picking only one index on the pair.

   我估计是对多个列建索引比对每个列分别建索引更有优势，mysql会根据最好的平均效率选取需不需要索引，如Mysql分析发现不用索引更快那就不必用到索引了！

in适用于外表大内表小的情况！与数据的多少也很有关系的！

]]> http://www.jackxiang.com/post//#blogcomment <![CDATA[[评论] 多列索引与多个索引的对比]]> <user@domain.com> Thu, 01 Jan 1970 00:00:00 +0000 http://www.jackxiang.com/post//#blogcomment