Rechercher dans le manuel MySQL
8.2.1.15 ORDER BY Optimization
This section describes when MySQL can use an index to satisfy
an ORDER BY
clause, the
filesort
operation used when an index
cannot be used, and execution plan information available from
the optimizer about ORDER BY
.
An ORDER BY
with and without
LIMIT
may return rows in different orders,
as discussed in Section 8.2.1.18, “LIMIT Query Optimization”.
Use of Indexes to Satisfy ORDER BY
In some cases, MySQL may use an index to satisfy an
ORDER BY
clause and avoid the extra
sorting involved in performing a filesort
operation.
The index may also be used even if the ORDER
BY
does not match the index exactly, as long as
all unused portions of the index and all extra
ORDER BY
columns are constants in the
WHERE
clause. If the index does not
contain all columns accessed by the query, the index is used
only if index access is cheaper than other access methods.
Assuming that there is an index on
(
, the
following queries may use the index to resolve the
key_part1
,
key_part2
)ORDER BY
part. Whether the optimizer
actually does so depends on whether reading the index is
more efficient than a table scan if columns not in the index
must also be read.
In this query, the index on
(
enables the optimizer to avoid sorting:key_part1
,key_part2
)However, the query uses
SELECT *
, which may select more columns thankey_part1
andkey_part2
. In that case, scanning an entire index and looking up table rows to find columns not in the index may be more expensive than scanning the table and sorting the results. If so, the optimizer probably will not use the index. IfSELECT *
selects only the index columns, the index will be used and sorting avoided.If
t1
is anInnoDB
table, the table primary key is implicitly part of the index, and the index can be used to resolve theORDER BY
for this query:In this query,
key_part1
is constant, so all rows accessed through the index are inkey_part2
order, and an index on(
avoids sorting if thekey_part1
,key_part2
)WHERE
clause is selective enough to make an index range scan cheaper than a table scan:In the next two queries, whether the index is used is similar to the same queries without
DESC
shown previously:Two columns in an
ORDER BY
can sort in the same direction (bothASC
, or bothDESC
) or in opposite directions (oneASC
, oneDESC
). A condition for index use is that the index must have the same homogeneity, but need not have the same actual direction.If a query mixes
ASC
andDESC
, the optimizer can use an index on the columns if the index also uses corresponding mixed ascending and descending columns:The optimizer can use an index on (
key_part1
,key_part2
) ifkey_part1
is descending andkey_part2
is ascending. It can also use an index on those columns (with a backward scan) ifkey_part1
is ascending andkey_part2
is descending. See Section 8.3.13, “Descending Indexes”In the next two queries,
key_part1
is compared to a constant. The index will be used if theWHERE
clause is selective enough to make an index range scan cheaper than a table scan:In the next query, the
ORDER BY
does not namekey_part1
, but all rows selected have a constantkey_part1
value, so the index can still be used:
In some cases, MySQL cannot use indexes
to resolve the ORDER BY
, although it may
still use indexes to find the rows that match the
WHERE
clause. Examples:
The query uses
ORDER BY
on different indexes:The query uses
ORDER BY
on nonconsecutive parts of an index:The index used to fetch the rows differs from the one used in the
ORDER BY
:The query uses
ORDER BY
with an expression that includes terms other than the index column name:The query joins many tables, and the columns in the
ORDER BY
are not all from the first nonconstant table that is used to retrieve rows. (This is the first table in theEXPLAIN
output that does not have aconst
join type.)The query has different
ORDER BY
andGROUP BY
expressions.There is an index on only a prefix of a column named in the
ORDER BY
clause. In this case, the index cannot be used to fully resolve the sort order. For example, if only the first 10 bytes of aCHAR(20)
column are indexed, the index cannot distinguish values past the 10th byte and afilesort
is needed.The index does not store rows in order. For example, this is true for a
HASH
index in aMEMORY
table.
Availability of an index for sorting may be affected by the
use of column aliases. Suppose that the column
t1.a
is indexed. In this statement, the
name of the column in the select list is
a
. It refers to t1.a
,
as does the reference to a
in the
ORDER BY
, so the index on
t1.a
can be used:
In this statement, the name of the column in the select list
is also a
, but it is the alias name. It
refers to ABS(a)
, as does the reference
to a
in the ORDER BY
,
so the index on t1.a
cannot be used:
In the following statement, the ORDER BY
refers to a name that is not the name of a column in the
select list. But there is a column in t1
named a
, so the ORDER
BY
refers to t1.a
and the index
on t1.a
can be used. (The resulting sort
order may be completely different from the order for
ABS(a)
, of course.)
Previously (MySQL 5.7 and lower),
GROUP BY
sorted implicitly under certain
conditions. In MySQL 8.0, that no longer
occurs, so specifying ORDER BY NULL
at
the end to suppress implicit sorting (as was done
previously) is no longer necessary. However, query results
may differ from previous MySQL versions. To produce a given
sort order, provide an ORDER BY
clause.
If an index cannot be used to satisfy an ORDER
BY
clause, MySQL performs a
filesort
operation that reads table rows
and sorts them. A filesort
constitutes an
extra sorting phase in query execution.
To obtain memory for filesort
operations,
as of MySQL 8.0.12, the optimizer allocates memory buffers
incrementally as needed, up to the size indicated by the
sort_buffer_size
system
variable, rather than allocating a fixed amount of
sort_buffer_size
bytes up
front, as was done prior to MySQL 8.0.12. This enables users
to set sort_buffer_size
to
larger values to speed up larger sorts, without concern for
excessive memory use for small sorts. (This benefit may not
occur for multiple concurrent sorts on Windows, which has a
weak multithreaded malloc
.)
A filesort
operation uses temporary disk
files as necessary if the result set is too large to fit in
memory. Some types of queries are particularly suited to
completely in-memory filesort
operations.
For example, the optimizer can use
filesort
to efficiently handle in memory,
without temporary files, the ORDER BY
operation for queries (and subqueries) of the following
form:
Such queries are common in web applications that display only a few rows from a larger result set. Examples:
For slow ORDER BY
queries for which
filesort
is not used, try lowering the
max_length_for_sort_data
system variable to a value that is appropriate to trigger a
filesort
. (A symptom of setting the value
of this variable too high is a combination of high disk
activity and low CPU activity.)
To increase ORDER BY
speed, check whether
you can get MySQL to use indexes rather than an extra
sorting phase. If this is not possible, try the following
strategies:
Increase the
sort_buffer_size
variable value. Ideally, the value should be large enough for the entire result set to fit in the sort buffer (to avoid writes to disk and merge passes).Take into account that the size of column values stored in the sort buffer is affected by the
max_sort_length
system variable value. For example, if tuples store values of long string columns and you increase the value ofmax_sort_length
, the size of sort buffer tuples increases as well and may require you to increasesort_buffer_size
.To monitor the number of merge passes (to merge temporary files), check the
Sort_merge_passes
status variable.Increase the
read_rnd_buffer_size
variable value so that more rows are read at a time.Change the
tmpdir
system variable to point to a dedicated file system with large amounts of free space. The variable value can list several paths that are used in round-robin fashion; you can use this feature to spread the load across several directories. Separate the paths by colon characters (:
) on Unix and semicolon characters (;
) on Windows. The paths should name directories in file systems located on different physical disks, not different partitions on the same disk.
With
EXPLAIN
(see Section 8.8.1, “Optimizing Queries with EXPLAIN”), you can check whether
MySQL can use indexes to resolve an ORDER
BY
clause:
In addition, if a filesort
is performed,
optimizer trace output includes a
filesort_summary
block. For example:
"filesort_summary": {
"rows": 100,
"examined_rows": 100,
"number_of_tmp_files": 0,
"peak_memory_used": 25192,
"sort_mode": "<sort_key, packed_additional_fields>"
}
peak_memory_used
indicates the maximum
memory used at any one time during the sort. This is a value
up to but not necessarily as large as the value of the
sort_buffer_size
system
variable. Prior to MySQL 8.0.12, the output shows
sort_buffer_size
instead, indicating the
value of sort_buffer_size
.
(Prior to MySQL 8.0.12, the optimizer always allocates
sort_buffer_size
bytes for
the sort buffer. As of 8.0.12, the optimizer allocates
sort-buffer memory incrementally, beginning with a small
amount and adding more as necessary, up to
sort_buffer_size
bytes.)
The sort_mode
value provides information
about the contents of tuples in the sort buffer:
<sort_key, rowid>
: This indicates that sort buffer tuples are pairs that contain the sort key value and row ID of the original table row. Tuples are sorted by sort key value and the row ID is used to read the row from the table.<sort_key, additional_fields>
: This indicates that sort buffer tuples contain the sort key value and columns referenced by the query. Tuples are sorted by sort key value and column values are read directly from the tuple.<sort_key, packed_additional_fields>
: Like the previous variant, but the additional columns are packed tightly together instead of using a fixed-length encoding.
EXPLAIN
does not distinguish
whether the optimizer does or does not perform a
filesort
in memory. Use of an in-memory
filesort
can be seen in optimizer trace
output. Look for
filesort_priority_queue_optimization
. For
information about the optimizer trace, see
MySQL
Internals: Tracing the Optimizer.
Document created the 26/06/2006, last modified the 26/10/2018
Source of the printed document:https://www.gaudry.be/en/mysql-rf-order-by-optimization.html
The infobrol is a personal site whose content is my sole responsibility. The text is available under CreativeCommons license (BY-NC-SA). More info on the terms of use and the author.
References
These references and links indicate documents consulted during the writing of this page, or which may provide additional information, but the authors of these sources can not be held responsible for the content of this page.
The author This site is solely responsible for the way in which the various concepts, and the freedoms that are taken with the reference works, are presented here. Remember that you must cross multiple source information to reduce the risk of errors.