Rechercher dans le manuel MySQL
8.2.1.12 Condition Filtering
In join processing, prefix rows are those rows passed from one table in a join to the next. In general, the optimizer attempts to put tables with low prefix counts early in the join order to keep the number of row combinations from increasing rapidly. To the extent that the optimizer can use information about conditions on rows selected from one table and passed to the next, the more accurately it can compute row estimates and choose the best execution plan.
Without condition filtering, the prefix row count for a table
is based on the estimated number of rows selected by the
WHERE
clause according to whichever access
method the optimizer chooses. Condition filtering enables the
optimizer to use other relevant conditions in the
WHERE
clause not taken into account by the
access method, and thus improve its prefix row count
estimates. For example, even though there might be an
index-based access method that can be used to select rows from
the current table in a join, there might also be additional
conditions for the table in the WHERE
clause that can filter (further restrict) the estimate for
qualifying rows passed to the next table.
A condition contributes to the filtering estimate only if:
It refers to the current table.
It depends on a constant value or values from earlier tables in the join sequence.
It was not already taken into account by the access method.
In EXPLAIN
output, the
rows
column indicates the row estimate for
the chosen access method, and the filtered
column reflects the effect of condition filtering.
filtered
values are expressed as
percentages. The maximum value is 100, which means no
filtering of rows occurred. Values decreasing from 100
indicate increasing amounts of filtering.
The prefix row count (the number of rows estimated to be
passed from the current table in a join to the next) is the
product of the rows
and
filtered
values. That is, the prefix row
count is the estimated row count, reduced by the estimated
filtering effect. For example, if rows
is
1000 and filtered
is 20%, condition
filtering reduces the estimated row count of 1000 to a prefix
row count of 1000 × 20% = 1000 × .2 = 200.
Consider the following query:
- SELECT *
Suppose that the data set has these characteristics:
The
employee
table has 1024 rows.The
department
table has 12 rows.Both tables have an index on
dept_no
.The
employee
table has an index onfirst_name
.8 rows satisfy this condition on
employee.first_name
:- employee.first_name = 'John'
150 rows satisfy this condition on
employee.hire_date
:1 row satisfies both conditions:
Without condition filtering,
EXPLAIN
produces output like
this:
+----+------------+--------+------------------+---------+---------+------+----------+
| id | table | type | possible_keys | key | ref | rows | filtered |
+----+------------+--------+------------------+---------+---------+------+----------+
| 1 | employee | ref | name,h_date,dept | name | const | 8 | 100.00 |
| 1 | department | eq_ref | PRIMARY | PRIMARY | dept_no | 1 | 100.00 |
+----+------------+--------+------------------+---------+---------+------+----------+
For employee
, the access method on the
name
index picks up the 8 rows that match a
name of 'John'
. No filtering is done
(filtered
is 100%), so all rows are prefix
rows for the next table: The prefix row count is
rows
× filtered
=
8 × 100% = 8.
With condition filtering, the optimizer additionally takes
into account conditions from the WHERE
clause not taken into account by the access method. In this
case, the optimizer uses heuristics to estimate a filtering
effect of 16.31% for the BETWEEN
condition on employee.hire_date
. As a
result, EXPLAIN
produces output
like this:
+----+------------+--------+------------------+---------+---------+------+----------+
| id | table | type | possible_keys | key | ref | rows | filtered |
+----+------------+--------+------------------+---------+---------+------+----------+
| 1 | employee | ref | name,h_date,dept | name | const | 8 | 16.31 |
| 1 | department | eq_ref | PRIMARY | PRIMARY | dept_no | 1 | 100.00 |
+----+------------+--------+------------------+---------+---------+------+----------+
Now the prefix row count is rows
×
filtered
= 8 × 16.31% = 1.3, which
more closely reflects actual data set.
Normally, the optimizer does not calculate the condition
filtering effect (prefix row count reduction) for the last
joined table because there is no next table to pass rows to.
An exception occurs for
EXPLAIN
: To provide more
information, the filtering effect is calculated for all joined
tables, including the last one.
To control whether the optimizer considers additional
filtering conditions, use the
condition_fanout_filter
flag of the
optimizer_switch
system
variable (see Section 8.9.2, “Switchable Optimizations”).
This flag is enabled by default but can be disabled to
suppress condition filtering (for example, if a particular
query is found to yield better performance without it).
If the optimizer overestimates the effect of condition filtering, performance may be worse than if condition filtering is not used. In such cases, these techniques may help:
If a column is not indexed, index it so that the optimizer has some information about the distribution of column values and can improve its row estimates.
Similarly, if no column histogram information is available, generate a histogram (see Section 8.9.6, “Optimizer Statistics”).
Change the join order. Ways to accomplish this include join-order optimizer hints (see Section 8.9.3, “Optimizer Hints”),
STRAIGHT_JOIN
immediately following theSELECT
, and theSTRAIGHT_JOIN
join operator.Disable condition filtering for the session:
Or, for a given query, using an optimizer hint:
- SELECT /*+ SET_VAR(optimizer_switch = 'condition_fanout_filter=off') */ ...
Traduction non disponible
Le manuel MySQL n'est pas encore traduit en français sur l'infobrol. Seule la version anglaise est disponible pour l'instant.
Document créé le 26/06/2006, dernière modification le 26/10/2018
Source du document imprimé : https://www.gaudry.be/mysql-rf-condition-filtering.html
L'infobrol est un site personnel dont le contenu n'engage que moi. Le texte est mis à disposition sous licence CreativeCommons(BY-NC-SA). Plus d'info sur les conditions d'utilisation et sur l'auteur.
Références
Ces références et liens indiquent des documents consultés lors de la rédaction de cette page, ou qui peuvent apporter un complément d'information, mais les auteurs de ces sources ne peuvent être tenus responsables du contenu de cette page.
L'auteur de ce site est seul responsable de la manière dont sont présentés ici les différents concepts, et des libertés qui sont prises avec les ouvrages de référence. N'oubliez pas que vous devez croiser les informations de sources multiples afin de diminuer les risques d'erreurs.