Thursday, September 1, 2011

Explain....


Explain.... It is a very simple command that I feel is one of the most overlooked commands by new MySQL users. It is also a very valuable command available for MySQL. I realize I am preaching to the choir for a lot of MySQL users. However, for everyone who uses explain, we are bound to have many who do not. 
The MySQL documentation on this is great and available here and Optimizing Queries with EXPLAIN
Developer and a dba issues will continue for years,  but we can at least start on a level playing field. When writing a query, regardless of what it is, it is a good practice is to start it with explain first. This can achieve a couple things for you.
  • It checks your syntax to help you avoid mistakes.
  • Allows you to display information from the optimizer
Using the world example data for avoiding mistakes via explain. (innodb version)

mysql [localhost] {root} (world) > explain extended SELECT City.CountryCode , City.Name , Country.Name , Country.Capital , Country.Population FROM City , Country WHERE Country.Continent = 'North America' AND City.CountryCode = Country.Code;
+----+-------------+---------+------+---------------+-------------+---------+--------------------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------+---------------+-------------+---------+--------------------+------+----------+-------------+
| 1 | SIMPLE | Country | ALL | PRIMARY | NULL | NULL | NULL | 200 | 100.00 | Using where |
| 1 | SIMPLE | City | ref | CountryCode | CountryCode | 3 | world.Country.Code | 18 | 100.00 | |
+----+-------------+---------+------+---------------+-------------+---------+--------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

This query does work but I would never write it this way. It works for such a small data set but a join would work better.

mysql [localhost] {root} (world) > explain extended SELECT C.CountryCode , C.Name , Y.Name , Y.Capital , Y.Population
-> FROM City C
-> INNER JOIN Country Y ON Y.Code = C.CountryCode ; WHERE Y.Continent = 'North America' ;
+----+-------------+-------+------+---------------+-------------+---------+--------------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+-------------+---------+--------------+------+----------+-------+
| 1 | SIMPLE | Y | ALL | PRIMARY | NULL | NULL | NULL | 200 | 100.00 | |
| 1 | SIMPLE | C | ref | CountryCode | CountryCode | 3 | world.Y.Code | 18 | 100.00 | |
+----+-------------+-------+------+---------------+-------------+---------+--------------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE Y.Continent = 'North America'' at line 1
mysql [localhost] {root} (world) >

An error! Explain found this simple typo. While updating query to use a join, I adjusted my table references to aliases and a semicolon was placed before the where. Simple typo, but in the real world it could have been worse. What if your typo, enabled a full table scan across a table with billions of rows of data? Real world issues are usually related to “such an easy query” issues, so it is rushed. Then code gets pushed and nothing worked as planned. We can all write, clean, effective, and optimized queries, if we pay attention and understand what we are executing.

So error fixed .
mysql [localhost] {root} (world) > explain extended SELECT C.CountryCode , C.Name , Y.Name , Y.Capital , Y.Population
-> FROM City C
-> INNER JOIN Country Y ON Y.Code = C.CountryCode WHERE Y.Continent = 'North America' ;
+----+-------------+-------+------+---------------+-------------+---------+--------------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+-------------+---------+--------------+------+----------+-------------+
| 1 | SIMPLE | Y | ALL | PRIMARY | NULL | NULL | NULL | 200 | 100.00 | Using where |
| 1 | SIMPLE | C | ref | CountryCode | CountryCode | 3 | world.Y.Code | 18 | 100.00 | |
+----+-------------+-------+------+---------------+-------------+---------+--------------+------+----------+-------------+
2 rows in set, 1 warning (0.01 sec)

Using the employee example data to display information from the optimizer via explain. (innodb version)

The real use and best use of explain is to “display information from the optimizer about the query execution plan.” (refman) When executing joins across tables you need to understand what your pulling and make sure the best indexes are being used.

mysql [localhost] {root} (employees) > explain SELECT e.first_name , e.last_name , e.gender , e.hire_date , t.title , s.salary
FROM employees e
INNER JOIN titles t ON t.emp_no = e.emp_no AND e.hire_date BETWEEN t.from_date and t.to_date
INNER JOIN salaries s ON s.emp_no = e.emp_no AND e.hire_date BETWEEN s.from_date and s.to_date
WHERE e.gender='M'
ORDER BY e.hire_date ASC;
+----+-------------+-------+------+----------------+---------+---------+--------------------+--------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------+---------+---------+--------------------+--------+----------------+
| 1 | SIMPLE | e | ALL | PRIMARY | NULL | NULL | NULL | 300030 | Using filesort |
| 1 | SIMPLE | t | ref | PRIMARY,emp_no | PRIMARY | 4 | employees.e.emp_no | 1 | Using where |
| 1 | SIMPLE | s | ref | PRIMARY,emp_no | PRIMARY | 4 | employees.t.emp_no | 4 | Using where |
+----+-------------+-------+------+----------------+---------+---------+--------------------+--------+----------------+
150291 rows in set (3.88 sec)

300030 rows ? NULL KEY ?  ick.

mysql [localhost] {root} (employees) > ALTER TABLE employees ADD KEY gender (gender);
mysql [localhost] {root} (employees) > explain SELECT e.first_name , e.last_name , e.gender , e.hire_date , t.title , s.salary
FROM employees e
INNER JOIN titles t ON t.emp_no = e.emp_no AND e.hire_date BETWEEN t.from_date and t.to_date
INNER JOIN salaries s ON s.emp_no = e.emp_no AND e.hire_date BETWEEN s.from_date and s.to_date
WHERE e.gender='M'
ORDER BY e.hire_date ASC;
+----+-------------+-------+------+----------------+---------+---------+--------------------+--------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------+---------+---------+--------------------+--------+-----------------------------+
| 1 | SIMPLE | e | ref | PRIMARY,gender | gender | 1 | const | 150015 | Using where; Using filesort |
| 1 | SIMPLE | t | ref | PRIMARY,emp_no | PRIMARY | 4 | employees.e.emp_no | 1 | Using where |
| 1 | SIMPLE | s | ref | PRIMARY,emp_no | PRIMARY | 4 | employees.t.emp_no | 4 | Using where |
+----+-------------+-------+------+----------------+---------+---------+--------------------+--------+-----------------------------+
3 rows in set (0.00 sec)

Now I am not doing a full table scan but using the index on gender and only across 15k rows.
Explain shows this information so we can adjust when required.

These are simple examples I realize. Adding an index also has to be reviewed and just not put on everything.

Take the time to use explain. Make it a habit.

Other blog posts about explain over the years: