Rechercher dans le manuel MySQL
13.2.13 WITH Syntax (Common Table Expressions)
A common table expression (CTE) is a named temporary result set that exists within the scope of a single statement and that can be referred to later within that statement, possibly multiple times. The following discussion describes how to write statements that use CTEs.
For information about CTE optimization, see Section 8.2.2.4, “Optimizing Derived Tables, View References, and Common Table Expressions with Merging or Materialization”.
Additional Resources
These articles contain additional information about using CTEs in MySQL, including many examples:
To specify common table expressions, use a
WITH
clause that has one or more
comma-separated subclauses. Each subclause provides a subquery
that produces a result set, and associates a name with the
subquery. The following example defines CTEs named
cte1
and cte2
in the
WITH
clause, and refers to them
in the top-level SELECT
that
follows the WITH
clause:
In the statement containing the
WITH
clause, each CTE name can be
referenced to access the corresponding CTE result set.
A CTE name can be referenced in other CTEs, enabling CTEs to be defined based on other CTEs.
A CTE can refer to itself to define a recursive CTE. Common applications of recursive CTEs include series generation and traversal of hierarchical or tree-structured data.
Common table expressions are an optional part of the syntax for
DML statements. They are defined using a
WITH
clause:
- with_clause:
cte_name
names a single common table
expression and can be used as a table reference in the statement
containing the WITH
clause.
The subquery
part of AS
(
is called the
“subquery of the CTE” and is what produces the CTE
result set. The parentheses following subquery
)AS
are
required.
A common table expression is recursive if its subquery refers to
its own name. The RECURSIVE
keyword must be
included if any CTE in the WITH
clause is recursive. For more information, see
Recursive Common Table Expressions.
Determination of column names for a given CTE occurs as follows:
If a parenthesized list of names follows the CTE name, those names are the column names:
The number of names in the list must be the same as the number of columns in the result set.
Otherwise, the column names come from the select list of the first
SELECT
within theAS (
part:subquery
)
A WITH
clause is permitted in
these contexts:
Only one WITH
clause is permitted
at the same level. WITH
followed
by WITH
at the same level is not
permitted, so this is illegal:
To make the statement legal, use a single
WITH
clause that separates the
subclauses by a comma:
However, a statement can contain multiple
WITH
clauses if they occur at
different levels:
A WITH
clause can define one or
more common table expressions, but each CTE name must be unique
to the clause. This is illegal:
To make the statement legal, define the CTEs with unique names:
A CTE can refer to itself or to other CTEs:
A self-referencing CTE is recursive.
A CTE can refer to CTEs defined earlier in the same
WITH
clause, but not those defined later.This constraint rules out mutually-recursive CTEs, where
cte1
referencescte2
andcte2
referencescte1
. One of those references must be to a CTE defined later, which is not permitted.A CTE in a given query block can refer to CTEs defined in query blocks at a more outer level, but not CTEs defined in query blocks at a more inner level.
For resolving references to objects with the same names, derived
tables hide CTEs; and CTEs hide base tables,
TEMPORARY
tables, and views. Name resolution
occurs by searching for objects in the same query block, then
proceeding to outer blocks in turn while no object with the name
is found.
Like derived tables, a CTE cannot contain outer references prior to MySQL 8.0.14. This is a MySQL restriction that is lifted in MySQL 8.0.14, not a restriction of the SQL standard. For additional syntax considerations specific to recursive CTEs, see Recursive Common Table Expressions.
A recursive common table expression is one having a subquery that refers to its own name. For example:
- (
- SELECT 1
- )
When executed, the statement produces this result, a single column containing a simple linear sequence:
+------+
| n |
+------+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+------+
A recursive CTE has this structure:
The
WITH
clause must begin withWITH RECURSIVE
if any CTE in theWITH
clause refers to itself. (If no CTE refers to itself,RECURSIVE
is permitted but not required.)If you forget
RECURSIVE
for a recursive CTE, this error is a likely result:ERROR 1146 (42S02): Table 'cte_name' doesn't exist
The recursive CTE subquery has two parts, separated by
UNION [ALL]
orUNION DISTINCT
:The first
SELECT
produces the initial row or rows for the CTE and does not refer to the CTE name. The secondSELECT
produces additional rows and recurses by referring to the CTE name in itsFROM
clause. Recursion ends when this part produces no new rows. Thus, a recursive CTE consists of a nonrecursiveSELECT
part followed by a recursiveSELECT
part.Each
SELECT
part can itself be a union of multipleSELECT
statements.The types of the CTE result columns are inferred from the column types of the nonrecursive
SELECT
part only, and the columns are all nullable. For type determination, the recursiveSELECT
part is ignored.If the nonrecursive and recursive parts are separated by
UNION DISTINCT
, duplicate rows are eliminated. This is useful for queries that perform transitive closures, to avoid infinite loops.Each iteration of the recursive part operates only on the rows produced by the previous iteration. If the recursive part has multiple query blocks, iterations of each query block are scheduled in unspecified order, and each query block operates on rows that have been produced either by its previous iteration or by other query blocks since that previous iteration's end.
The recursive CTE subquery shown earlier has this nonrecursive part that retrieves a single row to produce the initial row set:
- SELECT 1
The CTE subquery also has this recursive part:
At each iteration, that SELECT
produces a row with a new value one greater than the value of
n
from the previous row set. The first
iteration operates on the initial row set (1
)
and produces 1+1=2
; the second iteration
operates on the first iteration's row set (2
)
and produces 2+1=3
; and so forth. This
continues until recursion ends, which occurs when
n
is no longer less than 5.
If the recursive part of a CTE produces wider values for a column than the nonrecursive part, it may be necessary to widen the column in the nonrecursive part to avoid data truncation. Consider this statement:
- (
- )
In nonstrict SQL mode, the statement produces this output:
+------+------+
| n | str |
+------+------+
| 1 | abc |
| 2 | abc |
| 3 | abc |
+------+------+
The str
column values are all
'abc'
because the nonrecursive
SELECT
determines the column
widths. Consequently, the wider str
values
produced by the recursive SELECT
are truncated.
In strict SQL mode, the statement produces an error:
ERROR 1406 (22001): Data too long for column 'str' at row 1
To address this issue, so that the statement does not produce
truncation or errors, use CAST()
in the nonrecursive SELECT
to
make the str
column wider:
- (
- )
Now the statement produces this result, without truncation:
+------+--------------+
| n | str |
+------+--------------+
| 1 | abc |
| 2 | abcabc |
| 3 | abcabcabcabc |
+------+--------------+
Columns are accessed by name, not position, which means that columns in the recursive part can access columns in the nonrecursive part that have a different position, as this CTE illustrates:
- (
- )
Because p
in one row is derived from
q
in the previous row, and vice versa, the
positive and negative values swap positions in each successive
row of the output:
+------+------+------+
| n | p | q |
+------+------+------+
| 1 | 1 | -1 |
| 2 | -2 | 2 |
| 3 | 4 | -4 |
| 4 | -8 | 8 |
| 5 | 16 | -16 |
+------+------+------+
Some syntax constraints apply within recursive CTE subqueries:
The recursive
SELECT
part must not contain these constructs:Aggregate functions such as
SUM()
Window functions
GROUP BY
ORDER BY
LIMIT
DISTINCT
This constraint does not apply to the nonrecursive
SELECT
part of a recursive CTE. The prohibition onDISTINCT
applies only toUNION
members;UNION DISTINCT
is permitted.The recursive
SELECT
part must reference the CTE only once and only in itsFROM
clause, not in any subquery. It can reference tables other than the CTE and join them with the CTE. If used in a join like this, the CTE must not be on the right side of aLEFT JOIN
.
These constraints come from the SQL standard, other than the
MySQL-specific exclusions of ORDER BY
,
LIMIT
, and DISTINCT
.
For recursive CTEs, EXPLAIN
output rows for recursive SELECT
parts display Recursive
in the
Extra
column.
Cost estimates displayed by
EXPLAIN
represent cost per
iteration, which might differ considerably from total cost. The
optimizer cannot predict the number of iterations because it
cannot predict when the WHERE
clause will
become false.
CTE actual cost may also be affected by result set size. A CTE that produces many rows may require an internal temporary table large enough to be converted from in-memory to on-disk format and may suffer a performance penalty. If so, increasing the permitted in-memory temporary table size may improve performance; see Section 8.4.4, “Internal Temporary Table Use in MySQL”.
It is important for recursive CTEs that the recursive
SELECT
part include a condition
to terminate recursion. As a development technique to guard
against a runaway recursive CTE, you can force termination by
placing a limit on execution time:
The
cte_max_recursion_depth
system variable enforces a limit on the number of recursion levels for CTEs. The server terminates execution of any CTE that recurses more levels than the value of this variable.The
max_execution_time
system variable enforces an execution timeout forSELECT
statements executed within the current session.The
MAX_EXECUTION_TIME
optimizer hint enforces a per-query execution timeout for theSELECT
statement in which it appears.
Suppose that a recursive CTE is mistakenly written with no recursion execution termination condition:
By default,
cte_max_recursion_depth
has a
value of 1000, causing the CTE to terminate when it recurses
past 1000 levels. Applications can change the session value to
adjust for their requirements:
You can also set the global
cte_max_recursion_depth
value
to affect all sessions that begin subsequently.
For queries that execute and thus recurse slowly or in contexts
for which there is reason to set the
cte_max_recursion_depth
value
very high, another way to guard against deep recursion is to set
a per-session timeout. To do so, execute a statement like this
prior to executing the CTE statement:
Alternatively, include an optimizer hint within the CTE statement itself:
- (
- SELECT 1
- )
If a recursive query without an execution time limit enters an
infinite loop, you can terminate it from another session using
KILL QUERY
.
Within the session itself, the client program used to run the
query might provide a way to kill the query. For example, in
mysql, typing Control+C
interrupts the current statement.
As mentioned previously, recursive common table expressions (CTEs) are frequently used for series generation and traversing hierarchical or tree-structured data. This section shows some simple examples of these techniques.
Fibonacci Series Generation
A Fibonacci series begins with the two numbers 0 and 1 (or 1 and
1) and each number after that is the sum of the previous two
numbers. A recursive common table expression can generate a
Fibonacci series if each row produced by the recursive
SELECT
has access to the two
previous numbers from the series. The following CTE generates a
10-number series using 0 and 1 as the first two numbers:
- (
- )
The CTE produces this result:
+------+-------+------------+
| n | fib_n | next_fib_n |
+------+-------+------------+
| 1 | 0 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 2 |
| 4 | 2 | 3 |
| 5 | 3 | 5 |
| 6 | 5 | 8 |
| 7 | 8 | 13 |
| 8 | 13 | 21 |
| 9 | 21 | 34 |
| 10 | 34 | 55 |
+------+-------+------------+
How the CTE works:
n
is a display column to indicate that the row contains then
-th Fibonacci number. For example, the 8th Fibonacci number is 13.The
fib_n
column displays Fibonacci numbern
.The
next_fib_n
column displays the next Fibonacci number after numbern
. This column provides the next series value to the next row, so that row can produce the sum of the two previous series values in itsfib_n
column.Recursion ends when
n
reaches 10. This is an arbitrary choice, to limit the output to a small set of rows.
The preceding output shows the entire CTE result. To select just
part of it, add an appropriate WHERE
clause
to the top-level SELECT
. For
example, to select the 8th Fibonacci number, do this:
- mysql> WITH RECURSIVE fibonacci ...
- ...
- +-------+
- | fib_n |
- +-------+
- | 13 |
- +-------+
Date Series Generation
A common table expression can generate a series of successive dates, which is useful for generating summaries that include a row for all dates in the series, including dates not represented in the summarized data.
Suppose that a table of sales numbers contains these rows:
- +------------+--------+
- +------------+--------+
- | 2017-01-03 | 100.00 |
- | 2017-01-03 | 200.00 |
- | 2017-01-06 | 50.00 |
- | 2017-01-08 | 10.00 |
- | 2017-01-08 | 20.00 |
- | 2017-01-08 | 150.00 |
- | 2017-01-10 | 5.00 |
- +------------+--------+
This query summarizes the sales per day:
- FROM sales
- +------------+-----------+
- +------------+-----------+
- | 2017-01-03 | 300.00 |
- | 2017-01-06 | 50.00 |
- | 2017-01-08 | 180.00 |
- | 2017-01-10 | 5.00 |
- +------------+-----------+
However, that result contains “holes” for dates not
represented in the range of dates spanned by the table. A result
that represents all dates in the range can be produced using a
recursive CTE to generate that set of dates, joined with a
LEFT JOIN
to the sales data.
Here is the CTE to generate the date range series:
- (
- )
The CTE produces this result:
+------------+
| date |
+------------+
| 2017-01-03 |
| 2017-01-04 |
| 2017-01-05 |
| 2017-01-06 |
| 2017-01-07 |
| 2017-01-08 |
| 2017-01-09 |
| 2017-01-10 |
+------------+
How the CTE works:
Joining the CTE with a LEFT JOIN
against the
sales
table produces the sales summary with a
row for each date in the range:
- (
- )
The output looks like this:
+------------+-----------+
| date | sum_price |
+------------+-----------+
| 2017-01-03 | 300.00 |
| 2017-01-04 | 0.00 |
| 2017-01-05 | 0.00 |
| 2017-01-06 | 50.00 |
| 2017-01-07 | 0.00 |
| 2017-01-08 | 180.00 |
| 2017-01-09 | 0.00 |
| 2017-01-10 | 5.00 |
+------------+-----------+
Some points to note:
Are the queries inefficient, particularly the one with the
MAX()
subquery executed for each row in the recursiveSELECT
? Checking withEXPLAIN
shows that the subqueries are optimized away for efficiency.The use of
COALESCE()
avoids displayingNULL
in thesum_price
column on days for which no sales data occur in thesales
table.
Hierarchical Data Traversal
Recursive common table expressions are useful for traversing
data that forms a hierarchy. Consider these statements that
create a small data set that shows, for each employee in a
company, the employee name and ID number, and the ID of the
employee's manager. The top-level employee (the CEO), has a
manager ID of NULL
(no manager).
- );
- (198, "John", 333), # John has ID 198 and reports to 333 (Yasmina)
- (692, "Tarek", 333),
- (29, "Pedro", 198),
- (4610, "Sarah", 29),
- (72, "Pierre", 29),
- (123, "Adil", 692);
The resulting data set looks like this:
- +------+---------+------------+
- | id | name | manager_id |
- +------+---------+------------+
- | 29 | Pedro | 198 |
- | 72 | Pierre | 29 |
- | 123 | Adil | 692 |
- | 198 | John | 333 |
- | 692 | Tarek | 333 |
- | 4610 | Sarah | 29 |
- +------+---------+------------+
To produce the organizational chart with the management chain for each employee (that is, the path from CEO to employee), use a recursive CTE:
The CTE produces this output:
+------+---------+-----------------+
| id | name | path |
+------+---------+-----------------+
| 333 | Yasmina | 333 |
| 198 | John | 333,198 |
| 29 | Pedro | 333,198,29 |
| 4610 | Sarah | 333,198,29,4610 |
| 72 | Pierre | 333,198,29,72 |
| 692 | Tarek | 333,692 |
| 123 | Adil | 333,692,123 |
+------+---------+-----------------+
How the CTE works:
The nonrecursive
SELECT
produces the row for the CEO (the row with aNULL
manager ID).The
path
column is widened toCHAR(200)
to ensure that there is room for the longerpath
values produced by the recursiveSELECT
.Each row produced by the recursive
SELECT
finds all employees who report directly to an employee produced by a previous row. For each such employee, the row includes the employee ID and name, and the employee management chain. The chain is the manager's chain, with the employee ID added to the end.Recursion ends when employees have no others who report to them.
To find the path for a specific employee or employees, add a
WHERE
clause to the top-level
SELECT
. For example, to display
the results for Tarek and Sarah, modify that
SELECT
like this:
Common table expressions (CTEs) are similar to derived tables in some ways:
Both constructs are named.
Both constructs exist for the scope of a single statement.
Because of these similarities, CTEs and derived tables often can be used interchangeably. As a trivial example, these statements are equivalent:
However, CTEs have some advantages over derived tables:
A derived table can be referenced only a single time within a query. A CTE can be referenced multiple times. To use multiple instances of a derived table result, you must derive the result multiple times.
A CTE can be self-referencing (recursive).
One CTE can refer to another.
A CTE may be easier to read when its definition appears at the beginning of the statement rather than embedded within it.
CTEs are similar to tables created with
CREATE [TEMPORARY]
TABLE
but need not be defined or dropped explicitly.
For a CTE, you need no privileges to create tables.
Document created the 26/06/2006, last modified the 26/10/2018
Source of the printed document:https://www.gaudry.be/en/mysql-rf-with.html
The infobrol is a personal site whose content is my sole responsibility. The text is available under CreativeCommons license (BY-NC-SA). More info on the terms of use and the author.
References
These references and links indicate documents consulted during the writing of this page, or which may provide additional information, but the authors of these sources can not be held responsible for the content of this page.
The author This site is solely responsible for the way in which the various concepts, and the freedoms that are taken with the reference works, are presented here. Remember that you must cross multiple source information to reduce the risk of errors.