You are here
To UNION or not to UNION...
Recently a forum question [ 1 ] got my attention:
Is there any performance issue with Union?
I used union all sometime back and it was performance issue just to make an opinion that we should used union in query.
The question itself was not too interesting because the answer is easy: It depends. But I wanted to see if there was an improvement in this common problem over time in MySQL.
Test set-up
So I prepared a little test to simulate some of the possible scenarios:
CREATE TABLE `u` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `a` int(10) unsigned DEFAULT NULL, `b` int(10) unsigned DEFAULT NULL, `c` int(10) unsigned DEFAULT NULL, `d` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id`), KEY `a` (`a`), KEY `b` (`b`), KEY `c` (`c`), KEY `d` (`d`) ) ENGINE=InnoDB ; INSERT INTO u SELECT NULL, ROUND(RAND()*10, 0), ROUND(RAND()*10, 0), ROUND(RAND()*1000000, 0), ROUND(RAND()*1000000, 0); INSERT INTO u SELECT NULL, ROUND(RAND()*10, 0), ROUND(RAND()*10, 0), ROUND(RAND()*1000000, 0), ROUND(RAND()*1000000, 0) FROM u; ... 1 mio rows ANALYZE TABLE u;
With this table we can simulate the OR
problem with low and high selectivity.
Running the tests
We did the tests with MySQL (5.0 - 5.7), Percona Server (5.6) and MariaDB (5.5, 10.0) for the following queries:
EXPLAIN SELECT * FROM u WHERE a = 5 OR b = 5; EXPLAIN SELECT * FROM u WHERE a = 5 OR c = 500001; EXPLAIN SELECT * FROM u WHERE c = 500001 OR d = 500001;
We are interested in what the optimizer is doing and what the performance of the queries is. The following results came out:
Query 1 | Query 2 | Query 3 | |||||||
---|---|---|---|---|---|---|---|---|---|
Database version | rows | avg. time | QEP | rows | avg. time | QEP | rows | avg. time | QEP |
MySQL 5.0.92 | 194402 | 390 ms | 1 | 104876 | 230 ms | 2 | 6 | < 10 ms | 3 |
MySQL 5.1.66 | 194402 | 410 ms | 1 | 104876 | 240 ms | 2 | 6 | < 10 ms | 3 |
MySQL 5.5.24 | 194402 | 420 ms | 1 | 104876 | 370 ms | 1 | 6 | < 10 ms | 3 |
MariaDB 5.5.32 | 194402 | 460 ms | 1 | 104876 | 420 ms | 1 | 6 | < 10 ms | 3 |
MySQL 5.6.12 | 194402 | 440 ms | 2 | 104876 | 240 ms | 2 | 6 | < 10 ms | 3 |
Percona 5.6.12-60.40 | 194402 | 450 ms | 2 | 104876 | 240 ms | 2 | 6 | < 10 ms | 3 |
MySQL 5.7.1 | 194402 | 420 ms | 2 | 104876 | 220 ms | 2 | 6 | < 10 ms | 3 |
MariaDB 10.0.3 | 194402 | 450 ms | 1 | 104876 | 400 ms | 1 | 6 | < 10 ms | 3 |
Different Query Execution Plans (QEP)
- QEP 1:
+----+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+ | 1 | SIMPLE | u | ALL | a,b | NULL | NULL | NULL | 1049134 | Using where | +----+-------------+-------+------+---------------+------+---------+------+---------+-------------+
- QEP 2:
+----+-------------+-------+-------------+---------------+------+---------+------+--------+-------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------------+---------------+------+---------+------+--------+-------------------------------+ | 1 | SIMPLE | u | index_merge | a,c | a,c | 5,5 | NULL | nnnnnn | Using union(a,c); Using where | +----+-------------+-------+-------------+---------------+------+---------+------+--------+-------------------------------+
- QEP 3:
+----+-------------+-------+-------------+---------------+------+---------+------+------+-------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------------+---------------+------+---------+------+------+-------------------------------+ | 1 | SIMPLE | u | index_merge | c,d | c,d | 5,5 | NULL | n | Using union(c,d); Using where | +----+-------------+-------+-------------+---------------+------+---------+------+------+-------------------------------+
- Single query performance went down from 5 - 50% (in one case increased by 5%) over time (MySQL releases). But we can see some impacts on optimizer improvements.
- Newer MySQL releases are not necessarily faster for single-query performance than older ones. Most of the MySQL users are not running more than 1 or 2 concurrent queries. For them scalability improvements are not really an issue.
- There seems to be some changes in the Optimizer some for good, some for bad, depending on the release or branch/fork you are using. So test carefully when you change the release or branch/fork.
- And: Do not believe the whole marketing yelling but do your own testing...
Conclusion
- Shinguz's blog
- Log in or register to post comments