Rechercher dans le manuel MySQL
23.2.4 HASH Partitioning
Partitioning by HASH
is used primarily to
ensure an even distribution of data among a predetermined number
of partitions. With range or list partitioning, you must specify
explicitly which partition a given column value or set of column
values should be stored in; with hash partitioning, this
decision is taken care of for you, and you need only specify a
column value or expression based on a column value to be hashed
and the number of partitions into which the partitioned table is
to be divided.
To partition a table using HASH
partitioning,
it is necessary to append to the CREATE
TABLE
statement a PARTITION BY HASH
(
clause, where
expr
)expr
is an expression that returns an
integer. This can simply be the name of a column whose type is
one of MySQL's integer types. In addition, you most likely
want to follow this with PARTITIONS
, where
num
num
is a positive integer
representing the number of partitions into which the table is to
be divided.
For simplicity, the tables in the examples that follow do not use any keys. You should be aware that, if a table has any unique keys, every column used in the partitioning expression for this table must be part of every unique key, including the primary key. See Section 23.6.1, “Partitioning Keys, Primary Keys, and Unique Keys”, for more information.
The following statement creates a table that uses hashing on the
store_id
column and is divided into 4
partitions:
If you do not include a PARTITIONS
clause,
the number of partitions defaults to 1
; using
the PARTITIONS
keyword without a number
following it results in a syntax error.
You can also use an SQL expression that returns an integer for
expr
. For instance, you might want to
partition based on the year in which an employee was hired. This
can be done as shown here:
expr
must return a nonconstant,
nonrandom integer value (in other words, it should be varying
but deterministic), and must not contain any prohibited
constructs as described in
Section 23.6, “Restrictions and Limitations on Partitioning”. You should also keep
in mind that this expression is evaluated each time a row is
inserted or updated (or possibly deleted); this means that very
complex expressions may give rise to performance issues,
particularly when performing operations (such as batch inserts)
that affect a great many rows at one time.
The most efficient hashing function is one which operates upon a single table column and whose value increases or decreases consistently with the column value, as this allows for “pruning” on ranges of partitions. That is, the more closely that the expression varies with the value of the column on which it is based, the more efficiently MySQL can use the expression for hash partitioning.
For example, where date_col
is a column of
type DATE
, then the expression
TO_DAYS(date_col)
is said to vary
directly with the value of date_col
, because
for every change in the value of date_col
,
the value of the expression changes in a consistent manner. The
variance of the expression
YEAR(date_col)
with respect to
date_col
is not quite as direct as that of
TO_DAYS(date_col)
, because not
every possible change in date_col
produces an
equivalent change in
YEAR(date_col)
. Even so,
YEAR(date_col)
is a good
candidate for a hashing function, because it varies directly
with a portion of date_col
and there is no
possible change in date_col
that produces a
disproportionate change in
YEAR(date_col)
.
By way of contrast, suppose that you have a column named
int_col
whose type is
INT
. Now consider the expression
POW(5-int_col,3) + 6
. This would
be a poor choice for a hashing function because a change in the
value of int_col
is not guaranteed to produce
a proportional change in the value of the expression. Changing
the value of int_col
by a given amount can
produce widely differing changes in the value of the expression.
For example, changing int_col
from
5
to 6
produces a change
of -1
in the value of the expression, but
changing the value of int_col
from
6
to 7
produces a change
of -7
in the expression value.
In other words, the more closely the graph of the column value
versus the value of the expression follows a straight line as
traced by the equation
y=
where
c
xc
is some nonzero constant, the
better the expression is suited to hashing. This has to do with
the fact that the more nonlinear an expression is, the more
uneven the distribution of data among the partitions it tends to
produce.
In theory, pruning is also possible for expressions involving more than one column value, but determining which of such expressions are suitable can be quite difficult and time-consuming. For this reason, the use of hashing expressions involving multiple columns is not particularly recommended.
When PARTITION BY HASH
is used, the storage
engine determines which partition of
num
partitions to use based on the
modulus of the result of the expression. In other words, for a
given expression expr
, the partition
in which the record is stored is partition number
N
, where
. Suppose that table
N
=
MOD(expr
,
num
)t1
is defined as follows, so that it has 4
partitions:
If you insert a record into t1
whose
col3
value is
'2005-09-15'
, then the partition in which it
is stored is determined as follows:
MySQL 8.0 also supports a variant of
HASH
partitioning known as
linear hashing which
employs a more complex algorithm for determining the placement
of new rows inserted into the partitioned table. See
Section 23.2.4.1, “LINEAR HASH Partitioning”, for a description of
this algorithm.
The user-supplied expression is evaluated each time a record is inserted or updated. It may also—depending on the circumstances—be evaluated when records are deleted.
Nederlandse vertaling
U hebt gevraagd om deze site in het Nederlands te bezoeken. Voor nu wordt alleen de interface vertaald, maar nog niet alle inhoud.Als je me wilt helpen met vertalingen, is je bijdrage welkom. Het enige dat u hoeft te doen, is u op de site registreren en mij een bericht sturen waarin u wordt gevraagd om u toe te voegen aan de groep vertalers, zodat u de gewenste pagina's kunt vertalen. Een link onderaan elke vertaalde pagina geeft aan dat u de vertaler bent en heeft een link naar uw profiel.
Bij voorbaat dank.
Document heeft de 26/06/2006 gemaakt, de laatste keer de 26/10/2018 gewijzigd
Bron van het afgedrukte document:https://www.gaudry.be/nl/mysql-rf-partitioning-hash.html
De infobrol is een persoonlijke site waarvan de inhoud uitsluitend mijn verantwoordelijkheid is. De tekst is beschikbaar onder CreativeCommons-licentie (BY-NC-SA). Meer info op de gebruiksvoorwaarden en de auteur.
Referenties
Deze verwijzingen en links verwijzen naar documenten die geraadpleegd zijn tijdens het schrijven van deze pagina, of die aanvullende informatie kunnen geven, maar de auteurs van deze bronnen kunnen niet verantwoordelijk worden gehouden voor de inhoud van deze pagina.
De auteur Deze site is als enige verantwoordelijk voor de manier waarop de verschillende concepten, en de vrijheden die met de referentiewerken worden genomen, hier worden gepresenteerd. Vergeet niet dat u meerdere broninformatie moet doorgeven om het risico op fouten te verkleinen.