:: DEVELOPER ZONE
In some cases, MySQL can use an index to satisfy an ORDER BY
clause without doing any extra sorting.
The index can also be used even if the ORDER BY doesn't match the
index exactly, as long as all the unused index parts and all the extra
are ORDER BY columns are constants in the WHERE
clause. The following queries use the index to resolve the
ORDER BY part:
SELECT * FROM t1 ORDER BYkey_part1,key_part2,... ; SELECT * FROM t1 WHEREkey_part1=constantORDER BYkey_part2; SELECT * FROM t1 ORDER BYkey_part1DESC,key_part2DESC; SELECT * FROM t1 WHEREkey_part1=1 ORDER BYkey_part1DESC,key_part2DESC;
In some cases, MySQL cannot use indexes to resolve the ORDER BY, although it still uses indexes to find the rows that
match the WHERE clause. These cases include the following:
You use ORDER BY on different keys:
SELECT * FROM t1 ORDER BYkey1,key2;
You use ORDER BY on non-consecutive key parts:
SELECT * FROM t1 WHEREkey2=constantORDER BYkey_part2;
You mix ASC and DESC:
SELECT * FROM t1 ORDER BYkey_part1DESC,key_part2ASC;
The key used to fetch the rows is not the same as the one used in
the ORDER BY:
SELECT * FROM t1 WHEREkey2=constantORDER BYkey1;
You are joining many tables, and the columns in the ORDER BY are not all from the first non-constant table that is used to
retrieve rows. (This is the first table in the EXPLAIN output that
doesn't have a const join type.)
You have different ORDER BY and GROUP BY expressions.
The type of table index used doesn't store rows in order. For example, this
is true for a HASH index in a HEAP table.
With EXPLAIN SELECT ... ORDER BY, you can check whether MySQL can use
indexes to resolve the query. It cannot if you see Using filesort in
the Extra column.
See Section 7.2.1, “EXPLAIN Syntax (Get Information About a SELECT)”.
In those cases where MySQL must sort the result, it uses the following
filesort algorithm before MySQL 4.1:
Read all rows according to key or by table scanning.
Rows that don't match the WHERE clause are skipped.
For each row, store a pair of values in a buffer (the sort key and the row
pointer). The size of the buffer is the value of the sort_buffer_size
system variable.
When the buffer gets full, run a qsort (quicksort) on it and store the result in a temporary file. Save a pointer to the sorted block. (If all pairs fit into the sort buffer, no temporary file is created.)
Repeat the preceding steps until all rows have been read.
Do a multi-merge of up to MERGEBUFF (7) regions to one block in
another temporary file. Repeat until all blocks from the first file
are in the second file.
Repeat the following until there are fewer than MERGEBUFF2 (15)
blocks left.
On the last multi-merge, only the pointer to the row (the last part of the sort key) is written to a result file.
Read the rows in sorted order by using the row pointers in the result file.
To optimize this, we read in a big block of row pointers, sort them, and use
them to read the rows in sorted order into a row buffer. The size of the
buffer is the value of the read_rnd_buffer_size system variable.
The code for this step is in the sql/records.cc source file.
One problem with this approach is that it reads rows twice: One time when
evaluating the WHERE clause, and again after sorting the pair values.
And even if the rows were accessed successively the first time (for example,
if a table scan is done), the second time they are accessed randomly. (The
sort keys are ordered, but the row positions are not.)
In MySQL 4.1 and up, a filesort optimization is used that records not
only the sort key value and row position, but also the columns required for
the query. This avoids reading the rows twice. The modified filesort
algorithm works like this:
Read the rows that match the WHERE clause, as before.
For each row, record a tuple of values consisting of the sort key value and row position, and also the columns required for the query.
Sort the tuples by sort key value
Retrieve the rows in sorted order, but read the required columns directly from the sorted tuples rather than by accessing the table a second time.
Using the modified filesort algorithm, the tuples are longer than the
pairs used in the original method, and fewer of them fit in the sort buffer
(the size of which is given by sort_buffer_size). As a result, it is
possible for the extra I/O to make the modified approach slower, not faster.
To avoid a slowdown, the optimization is used only if the total size of the
extra columns in the sort tuple does not exceed the value of the
max_length_for_sort_data system variable. (A symptom of setting the
value of this variable too high is that you should see high disk activity and
low CPU activity.)
If you want to increase ORDER BY speed, first see whether you can get
MySQL to use indexes rather than an extra sorting phase. If this is not
possible, you can try the following strategies:
Increase the size of the sort_buffer_size variable.
Increase the size of the read_rnd_buffer_size variable.
Change tmpdir to point to a dedicated filesystem with lots of empty
space. If you use MySQL 4.1 or later, this option accepts several paths
that are used in round-robin fashion. Paths should be separated by colon
characters (':') on Unix and semicolon characters (';') on
Windows, NetWare, and OS/2. You can use this feature to spread the load
across several directories. Note: The paths should be for
directories in filesystems that are located on different physical
disks, not different partitions of the same disk.
By default, MySQL sorts all GROUP BY queries as if
you specified col1, col2, ...ORDER BY in the query as well. If you
include an col1, col2, ...ORDER BY clause explicitly that contains the same column
list, MySQL optimizes it away without any speed penalty, although the sorting
still occurs. If a query includes GROUP BY but you want to avoid the
overhead of sorting the result, you can suppress sorting by specifying
ORDER BY NULL. For example:
INSERT INTO foo SELECT a, COUNT(*) FROM bar GROUP BY a ORDER BY NULL;
© 1995-2005 MySQL AB. All rights reserved.

User Comments
Add your own comment.