You can change the time limit in seconds by typing the following command:
mysql> SET GLOBAL long_query_time = 60;
Query OK, 0 rows affected (0.08 sec)
You can also change the default location of the slow query log file, usually found at /var/lib/mysql/hostname-slow.log, to any destination of your choice:
mysql> SET GLOBAL slow_query_log_file = '/somepath/filename.log';
Query OK, 0 rows affected (0.08 sec)
Once you’ve run any query that exceeds the time limit you configured, it will be logged in the slow query log by the MySQL server. You can always inspect the file to see those slow queries.
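You can confirm the current slow query log settings at any time by inspecting the related system variables (the exact values shown will depend on your server configuration):

mysql> SHOW VARIABLES LIKE 'slow_query_log%';
mysql> SHOW VARIABLES LIKE 'long_query_time';

The first statement shows whether the log is enabled and where the log file lives; the second shows the current time threshold in seconds.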
The INFORMATION_SCHEMA.PROFILING table
The INFORMATION_SCHEMA.PROFILING table stores profiling information about queries you run in a current interactive session. It’s disabled by default, but you can enable query profiling for your current session by setting the profiling session variable as shown below. Note that the profiling information is lost when a session ends:
mysql> SET SESSION profiling = 1;
Query OK, 0 rows affected, 1 warning (0.00 sec)
After that, you can select the database you want to use:
mysql> USE sakila;
Database changed
Next, run your query (make sure you do not prepend the EXPLAIN statement to it):
mysql> SELECT * FROM customer;
Then run the SHOW PROFILES statement:
mysql> SHOW PROFILES;
+----------+------------+------------------------+
| Query_ID | Duration | Query |
+----------+------------+------------------------+
| 1 | 0.02364600 | SELECT DATABASE() |
| 2 | 0.04395425 | show databases |
| 3 | 0.00854575 | show tables |
| 4 | 0.00213000 | SELECT * FROM customer |
+----------+------------+------------------------+
4 rows in set, 1 warning (0.00 sec)
The SHOW PROFILES statement lists the most recent queries in the current session. Its result set has three columns:
Query_ID – a unique numerical identifier for the query
Duration – the time taken to execute the query
Query – the statement the MySQL server executed
To see more information about a particular query, run SHOW PROFILE FOR QUERY followed by its Query_ID.
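For instance, to inspect the stage-by-stage timings of the SELECT * FROM customer query above (Query_ID 4 in our session), you would run:

mysql> SHOW PROFILE FOR QUERY 4;

The result breaks the total duration down into execution stages such as starting, Opening tables, executing, and Sending data, which helps you see where the time actually went.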
The EXPLAIN statement
Another tool MySQL provides for query analysis is the EXPLAIN statement. This statement shows how the MySQL server plans to execute a query. It’s quite easy to use: anytime you want to see how MySQL will execute your query before running it, just prepend EXPLAIN to it:
mysql> EXPLAIN SELECT * FROM customer\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: customer
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 599
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
Ending the SQL statement with “\G” instead of “;” causes MySQL to return the query result in vertical format, which makes it easier to read tables with many long columns.
The above example tells us that the server estimates it must scan all 599 rows of the customer table to produce the result; type: ALL indicates a full table scan.
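By contrast, when a query filters on an indexed column, EXPLAIN shows a much cheaper access path. Since customer_id is the customer table’s primary key in sakila, a lookup on it is resolved directly:

mysql> EXPLAIN SELECT * FROM customer WHERE customer_id = 1\G

In the output, type changes from ALL to const and rows drops to 1, because MySQL can find the row with a single primary key lookup instead of scanning the whole table.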
The EXPLAIN ANALYZE statement
The profiling techniques we have discussed wouldn’t be complete without EXPLAIN ANALYZE.
It works similarly to the EXPLAIN statement. The key difference is that EXPLAIN ANALYZE is more verbose: you get more statistics about the query execution.
Another difference is that EXPLAIN doesn’t execute queries, but EXPLAIN ANALYZE does. That’s because, to gather the missing statistics, it needs to communicate with the storage engine. We’ll explain how this communication works in a later section. Meanwhile, let’s demonstrate this profiling technique. To profile a query with the EXPLAIN ANALYZE statement, you just prepend EXPLAIN ANALYZE to the query, like this:
mysql> EXPLAIN ANALYZE SELECT first_name, last_name, city, country
FROM customer
INNER JOIN address USING(address_id)
INNER JOIN city USING(city_id) INNER JOIN country USING(country_id);
+---------------------------------------------------------------------------------------------------------------------->
| EXPLAIN >
+---------------------------------------------------------------------------------------------------------------------->
| -> Nested loop inner join (cost=690.10 rows=599) (actual time=0.073..8.259 rows=599 loops=1)
-> Nested loop inner join (cost=480.45 rows=599) (actual time=0.064..5.635 rows=599 loops=1)
-> Nested loop inner join (cost=270.80 rows=599) (actual time=0.057..3.083 rows=599 loops=1)
-> Table scan on customer (cost=61.15 rows=599) (actual time=0.043..0.478 rows=599 loops=1)
-> Single-row index lookup on address using PRIMARY (address_id=customer.address_id) (cost=0.25 rows=1) (actual time=0.002..0.002 rows=1 loops=599)
-> Single-row index lookup on city using PRIMARY (city_id=address.city_id) (cost=0.25 rows=1) (actual time=0.002..0.002 rows=1 loops=599)
-> Single-row index lookup on country using PRIMARY (country_id=city.country_id) (cost=0.25 rows=1) (actual time=0.002..0.002 rows=1 loops=599)
|
+---------------------------------------------------------------------------------------------------------------------->
(END)
As you can see, the results show how the MySQL server executed the plan. It introduces a few more statistics:
The actual time to fetch the first row in a table (in milliseconds)
The actual time to fetch all rows in a table (in milliseconds)
The estimated costs of the query
The actual number of rows read
The actual number of loops made
The estimated cost of a query’s execution plan is the cumulative sum of the costs of all operations within the plan, such as I/O or CPU operations.
Inspecting the Last_query_cost status variable
Another quick way to find out the estimated cost of the most recently executed query is to inspect the Last_query_cost session status variable:
mysql> SHOW STATUS LIKE 'last_query_cost';
+-----------------+------------+
| Variable_name | Value |
+-----------------+------------+
| Last_query_cost | 520.262389 |
+-----------------+------------+
1 row in set (0.00 sec)
Optimize slow queries
Once we identify and analyze slow queries, the next step is to fix them. Instinctively, the first thing to do is check the query itself. The following techniques demonstrate how we can optimize slow queries.
Only retrieve the data you need
While the server scans the rows your query asks for, it ties up resources in that session, and interrupting the work generally means killing the query or the session. The client also cannot selectively consume the data: it must wait for the server to finish producing the result set.
The query you’re sending may be asking for too much data, which causes the MySQL server to waste time scanning many rows of data. But we can fix it by doing the following:
Fetch only the number of rows you need (pagination)
We can fix our query using a common database technique known as pagination. We can use the LIMIT and OFFSET clauses to specify the number of rows we need and from which row it should start fetching data.
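As a sketch, to fetch the second page of ten customers (rows 11–20), you could write:

SELECT customer_id, first_name, last_name
FROM customer
ORDER BY customer_id
LIMIT 10 OFFSET 10;

The ORDER BY clause matters here: without a deterministic order, the same page can return different rows across requests. Note also that large OFFSET values still force the server to read and discard all the skipped rows, so keyset pagination (e.g., WHERE customer_id > last_seen_id LIMIT 10, where last_seen_id is a value your application tracks) scales better for deep pages.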
Fetch only the columns you need
Our queries in the last example are still not fully optimized. If you look at them, you will notice we fetched every column of the table using SELECT * (select all). A better way to access the data is to name only the columns you need in your query.
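For instance, if the application only needs customers’ names and email addresses, a narrower version of the query would be (the email column is assumed here; the sakila customer table does have one):

SELECT first_name, last_name, email
FROM customer;

This reduces the amount of data the server reads and sends over the network for every matching row.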
Fetch the columns you need when joining tables
Just like in the previous example, you can make your joins faster by asking only for the columns you need in the result set. Let’s say you want to fetch only the first and last names of all the customers in our earlier join example. We would write the query in this manner:
SELECT first_name, last_name
FROM customer
INNER JOIN address USING(address_id)
INNER JOIN city USING(city_id)
INNER JOIN country USING(country_id);
Doing so prevents the MySQL server from reading and returning all the columns of the joined tables.
Use caching strategies to prevent fetching the same data repeatedly
If your application uses the same piece of data frequently, instead of running the query every time the app needs the data, you can cache the data the first time the query runs and simply reuse the data subsequently.
The only exception is if the data changes frequently or you need your queried information to be up to date every time. MySQL used to ship with a built-in cache layer called “The Query Cache,” but it was deprecated (and removed in MySQL 8.0) for scalability reasons. Luckily, external tools such as Memcached and Redis can help cache data.
Avoid LIKE expressions with leading wildcards
Wildcards are used in conjunction with the LIKE operator. They are characters that help match data against complex criteria.
The position of wildcard characters in a LIKE pattern can cause unexpected performance behavior. Leading wildcards are the usual culprit, because MySQL cannot use an index on the column when the pattern starts with a wildcard:
SELECT * FROM customer WHERE first_name LIKE '%AB%'; -- leading wildcard causes poor performance
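If your search only needs to match a prefix, keep the wildcard at the end. Assuming an index exists on first_name, MySQL can then satisfy the condition with an index range scan:

SELECT * FROM customer WHERE first_name LIKE 'AB%'; -- trailing wildcard can use an index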
UNION vs. UNION ALL
The MySQL query optimizer doesn’t apply as many optimizations to UNION queries as it does to other queries, though you can specify clauses such as WHERE, LIMIT, and ORDER BY to help it. An even better option is to use UNION ALL whenever you do not need to eliminate duplicate rows from the result.
That’s because omitting ALL adds the DISTINCT option to the temporary table, which results in the MySQL server making expensive scans on all the rows to determine uniqueness. The key takeaway is to avoid using DISTINCT and UNION unless necessary.
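For example, combining the active and inactive customer lists is a case where the two result sets cannot overlap, so UNION ALL skips the needless de-duplication pass (active is a flag column in the sakila customer table):

SELECT first_name, last_name FROM customer WHERE active = 1
UNION ALL
SELECT first_name, last_name FROM customer WHERE active = 0;

Because a row can’t be both active and inactive, using plain UNION here would only add a wasted uniqueness check over a temporary table.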
Optimize JOIN queries
You should use INNER JOIN instead of OUTER JOIN where possible. You can also optimize your JOIN queries by ensuring that indexes are present on the columns specified in the ON and USING clauses.
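As a sketch, if you were joining on a column that isn’t already indexed (sakila’s join columns are, so the table and index names below are hypothetical), you would add the index like this:

CREATE INDEX idx_orders_customer_id ON orders (customer_id);

With the index in place, the server can perform a fast index lookup for each joined row instead of repeatedly scanning the whole table.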
GROUP BY and ORDER BY
GROUP BY and ORDER BY expressions should always refer to columns from a single table so that the server can use indexes for such operations.
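For example, the following aggregate keeps both the grouping and the ordering expression on a single table’s column, so an index on store_id could satisfy both operations:

SELECT store_id, COUNT(*) AS customers
FROM customer
GROUP BY store_id
ORDER BY store_id;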
Optimize COUNT() queries
The COUNT() function is a special aggregate function used in two ways. First, when you specify a column name or expression in the COUNT() function, it counts the number of times that expression has a value (i.e., the number of non-NULL values).
The second way to use COUNT() is for counting the number of rows in a query result using the COUNT(*) expression. It ignores the columns in the table and counts rows instead; therefore, it’s best to use this version when you only need to count the number of rows in the query result.
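The two forms can return different numbers on the same table. A sketch (the email column is assumed nullable here):

SELECT COUNT(*) AS total_rows,
       COUNT(email) AS rows_with_email
FROM customer;

COUNT(*) counts every row, while COUNT(email) skips the rows where email is NULL.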
Help the MySQL optimizer
There are a few cases where the optimizer may not choose the best execution path. The reasons could be that MySQL does not have sufficient information about the current data and must make “educated” guesses about the data.
In such cases, there are some methods available to help MySQL:
Using the EXPLAIN statement to view information about how MySQL processed your query. We showed how to do that earlier.
Using FORCE INDEX, USE INDEX, and IGNORE INDEX (Index hints) for the scanned table to tell MySQL that the table scans are expensive compared to using the given index. Index hints give you total control over how the optimizer chooses indexes for a query execution plan. It’s best to use this as a last resort after trying other optimization techniques discussed. The syntax for using index hints looks like this:
SELECT * FROM table1, table2 FORCE INDEX (index_of_column)
WHERE table1.col_name=table2.col_name;
Applying Optimizer Hints at the different scope levels: global, query block, table-level, or index-level to control the optimizer’s strategies based on certain criteria. Optimizer hints are usually applied per statement and can be used together with Index hints. They look like comments but are recognized by the query parser. The syntax for adding optimizer hints looks like this:
SELECT /*+ NO_RANGE_OPTIMIZATION(t3 PRIMARY, f2_idx) */ f1
FROM t3 WHERE f1 > 30 AND f1 < 33;
SELECT /*+ BKA(t1) NO_BKA(t2) */ * FROM t1 INNER JOIN t2 WHERE ...;
SELECT /*+ NO_ICP(t1, t2) */ * FROM t1 INNER JOIN t2 WHERE ...;
SELECT /*+ SEMIJOIN(FIRSTMATCH, LOOSESCAN) */ * FROM t1 ...;
EXPLAIN SELECT /*+ NO_ICP(t1) */ * FROM t1 WHERE ...;
Using ANALYZE TABLE table_name to update the key distribution statistics of the scanned table.
Tuning global or thread-specific system variables. For example, you can start the mysqld with the --max-seeks-for-key=1000 option or use SET max_seeks_for_key=1000 to tell the optimizer to assume that no key scan causes more than 1,000 key seeks.
Re-writing your query: Sometimes, your query might need refactoring to improve its response time. We discussed some scenarios we could do in the previous sections.
Adding Indexes to your tables: For tables that do not need frequent batch inserts or updates, you might want to use an index. Indexes also speed up JOIN operations and help MySQL retrieve data faster, thereby improving your query response time. For smaller tables, adding indexes may not be optimal. Also, you should not index everything, as it increases disk space usage and makes updating or inserting records slow.
Re-designing your schema: To access tables at the best possible speed, you must consider schema optimization. A poorly designed database schema forces MySQL to take longer and utilize more system resources when processing your queries.
Index and indexing
Improving the Index Use
Add a Covering Index
Rewriting Complex Index Conditions (use CTE & Window Functions)
Splitting a Query Into Parts
SKIP LOCKED
Avoid Many OR or IN Conditions