laika222
4/24/2017 - 3:53 PM

EXPLAIN will show you how a SELECT query will perform, including how many rows it will reference. Use this to preview query efficiency, espe

EXPLAIN will show you how a SELECT query will perform, including how many rows it will reference. Use this to preview query efficiency, especially with JOINs. If multiple tables are queried, you multiply the number in rows for table 1 times the number of rows for table 2. Example: if table 1 has 10 rows listed, and table 2 has 2,000 rows listed, you multiply 10 * 2,000, which results in 20,000 total rows being referenced.

-- Example of EXPLAIN placed before a select statement.

EXPLAIN SELECT Events.BAT_ID, Events.H_CD, Rosters.LAST_NAME_TX FROM Events 
JOIN Rosters 
GROUP BY Events.BAT_ID HAVING COUNT(H_CD) >=1000;


-- Results from EXPLAIN query listed above. The ROWS column is important. In the example below, it shows the Events table will have 740,201 rows queried (a full table reference), and then the Rosters table will reference 6,143 rows (another full table reference). This will result in 4,547,054,743 rows being queried (740,201 x 6,143), which is a giant query! This query will take forever to complete and will hog resources. Therefore, you need to find another way to increase the efficiency of the query, such as creating an index, or in this case, creating a stored procedure with two select statements to whittle down the results before JOINing the second table (see starting line 18 for how such a procedure can drastically reduce the number of rows queried). 

/* 
+----+-------------+---------+------+---------------+------+---------+------+--------+---------------------------------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | rows   | Extra                           |
+----+-------------+---------+------+---------------+------+---------+------+--------+---------------------------------+
|  1 | SIMPLE      | Rosters | ALL  | NULL          | NULL | NULL    | NULL |   6143 | Using temporary; Using filesort |
|  1 | SIMPLE      | Events  | ALL  | NULL          | NULL | NULL    | NULL | 740201 | Using join buffer               |
+----+-------------+---------+------+---------------+------+---------+------+--------+---------------------------------+

*/ 

-- There is a gist called 'EXAMPLE: Efficiency (Optimization) In JOIN Using TEMPORARY TABLE to Store SELECT Results *BEFORE* JOINing EXAMPLE 1' that shows how to break the query above into a stored procedure that 1) whittles down the Events table using a SELECT statement, and then 2) JOINs the second table based on that much smaller result set. If you use EXPLAIN to test the second SELECT statment in the procedure, you'll see that it queries 688,016 rows (6,143 x 112), which is MUCH smaller than the original query above that referenced 4,547,054,743 rows. Therefore, this shows how EXPLAIN can help you determine the efficiency of a query before you run it.

/*

+----+-------------+----------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table    | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+----------+------+---------------+------+---------+------+------+---------------------------------+
|  1 | PRIMARY     | Events   | ALL  | NULL          | NULL | NULL    | NULL |  112 | Using temporary; Using filesort |
|  1 | PRIMARY     | Rosters  | ALL  | NULL          | NULL | NULL    | NULL | 6143 | Using where; Using join buffer  |
+----+-------------+----------+------+---------------+------+---------+------+------+---------------------------------+

*/