Rechercher dans le manuel MySQL

8.2.1.19 Function Call Optimization

MySQL functions are tagged internally as deterministic or nondeterministic. A function is nondeterministic if, given fixed values for its arguments, it can return different results for different invocations. Examples of nondeterministic functions: RAND(), UUID().

If a function is tagged nondeterministic, a reference to it in a WHERE clause is evaluated for every row (when selecting from one table) or combination of rows (when selecting from a multiple-table join).

MySQL also determines when to evaluate functions based on types of arguments, whether the arguments are table columns or constant values. A deterministic function that takes a table column as argument must be evaluated whenever that column changes value.

Nondeterministic functions may affect query performance. For example, some optimizations may not be available, or more locking might be required. The following discussion uses RAND() but applies to other nondeterministic functions as well.

Suppose that a table t has this definition:

  1. CREATE TABLE t (id INT NOT NULL PRIMARY KEY, col_a VARCHAR(100));

Consider these two queries:

  1. SELECT * FROM t WHERE id = POW(1,2);
  2. SELECT * FROM t WHERE id = FLOOR(1 + RAND() * 49);

Both queries appear to use a primary key lookup because of the equality comparison against the primary key, but that is true only for the first of them:

  • The first query always produces a maximum of one row because POW() with constant arguments is a constant value and is used for index lookup.

  • The second query contains an expression that uses the nondeterministic function RAND(), which is not constant in the query but in fact has a new value for every row of table t. Consequently, the query reads every row of the table, evaluates the predicate for each row, and outputs all rows for which the primary key matches the random value. This might be zero, one, or multiple rows, depending on the id column values and the values in the RAND() sequence.

The effects of nondeterminism are not limited to SELECT statements. This UPDATE statement uses a nondeterministic function to select rows to be modified:

  1. UPDATE t SET col_a = some_expr WHERE id = FLOOR(1 + RAND() * 49);

Presumably the intent is to update at most a single row for which the primary key matches the expression. However, it might update zero, one, or multiple rows, depending on the id column values and the values in the RAND() sequence.

The behavior just described has implications for performance and replication:

  • Because a nondeterministic function does not produce a constant value, the optimizer cannot use strategies that might otherwise be applicable, such as index lookups. The result may be a table scan.

  • InnoDB might escalate to a range-key lock rather than taking a single row lock for one matching row.

  • Updates that do not execute deterministically are unsafe for replication.

The difficulties stem from the fact that the RAND() function is evaluated once for every row of the table. To avoid multiple function evaluations, use one of these techniques:

  • Move the expression containing the nondeterministic function to a separate statement, saving the value in a variable. In the original statement, replace the expression with a reference to the variable, which the optimizer can treat as a constant value:

    1. SET @keyval = FLOOR(1 + RAND() * 49);
    2. UPDATE t SET col_a = some_expr WHERE id = @keyval;
  • Assign the random value to a variable in a derived table. This technique causes the variable to be assigned a value, once, prior to its use in the comparison in the WHERE clause:

    1. UPDATE /*+ NO_MERGE(dt) */ t, (SELECT FLOOR(1 + RAND() * 49) AS r) AS dt
    2. SET col_a = some_expr WHERE id = dt.r;

As mentioned previously, a nondeterministic expression in the WHERE clause might prevent optimizations and result in a table scan. However, it may be possible to partially optimize the WHERE clause if other expressions are deterministic. For example:

  1. SELECT * FROM t WHERE partial_key=5 AND some_column=RAND();

If the optimizer can use partial_key to reduce the set of rows selected, RAND() is executed fewer times, which diminishes the effect of nondeterminism on optimization.


Rechercher dans le manuel MySQL

Traduction non disponible

Le manuel MySQL n'est pas encore traduit en français sur l'infobrol. Seule la version anglaise est disponible pour l'instant.

Document créé le 26/06/2006, dernière modification le 26/10/2018
Source du document imprimé : https://www.gaudry.be/mysql-rf-function-optimization.html

L'infobrol est un site personnel dont le contenu n'engage que moi. Le texte est mis à disposition sous licence CreativeCommons(BY-NC-SA). Plus d'info sur les conditions d'utilisation et sur l'auteur.

Références

  1. Consulter le document html Langue du document :en Manuel MySQL : https://dev.mysql.com/

Ces références et liens indiquent des documents consultés lors de la rédaction de cette page, ou qui peuvent apporter un complément d'information, mais les auteurs de ces sources ne peuvent être tenus responsables du contenu de cette page.
L'auteur de ce site est seul responsable de la manière dont sont présentés ici les différents concepts, et des libertés qui sont prises avec les ouvrages de référence. N'oubliez pas que vous devez croiser les informations de sources multiples afin de diminuer les risques d'erreurs.

Table des matières Haut