Rechercher dans le manuel MySQL

10.8.5 The binary Collation Compared to _bin Collations

This section describes how the binary collation for binary strings compares to the _bin collations for nonbinary strings.

Binary strings (as stored using the BINARY, VARBINARY, and BLOB data types) have a character set and collation named binary. Binary strings are sequences of bytes and the numeric values of those bytes determine comparison and sort order.

Nonbinary strings (as stored using the CHAR, VARCHAR, and TEXT data types) have a character set and collation other than binary. A given nonbinary character set can have several collations, each of which defines a particular comparison and sort order for the characters in the set. One of these is the binary collation for the character set, indicated by a _bin suffix in the collation name. For example, the binary collations for latin1 and utf8 are named latin1_bin and utf8_bin, respectively.

The binary collation differs from the _bin collations in several respects.

The unit for comparison and sorting.  Binary strings are sequences of bytes. For the binary collation, comparison and sorting are based on numeric byte values. Nonbinary strings are sequences of characters, which might be multibyte. Collations for nonbinary strings define an ordering of the character values for comparison and sorting. For the _bin collation, this ordering is based on numeric character code values, which is similar to ordering for binary strings except that character code values might be multibyte.

Character set conversion.  A nonbinary string has a character set and is automatically converted to another character set in many cases, even when the string has a _bin collation:

  • When assigning column values from another column that has a different character set:

    1. UPDATE t1 SET utf8_bin_column=latin1_column;
    2. INSERT INTO t1 (latin1_column) SELECT utf8_bin_column FROM t2;
  • When assigning column values for INSERT or UPDATE using a string literal:

    1. SET NAMES latin1;
    2. INSERT INTO t1 (utf8_bin_column) VALUES ('string-in-latin1');
  • When sending results from the server to a client:

    1. SET NAMES latin1;
    2. SELECT utf8_bin_column FROM t2;

For binary string columns, no conversion occurs. For the preceding cases, the string value is copied byte-wise.

Lettercase conversion.  Collations for nonbinary character sets provide information about lettercase of characters, so characters in a nonbinary string can be converted from one lettercase to another, even for _bin collations that ignore lettercase for ordering:

  1. mysql> SET NAMES latin1 COLLATE latin1_bin;
  2. mysql> SELECT LOWER('aA'), UPPER('zZ');
  3. +-------------+-------------+
  4. | LOWER('aA') | UPPER('zZ') |
  5. +-------------+-------------+
  6. | aa          | ZZ          |
  7. +-------------+-------------+

The concept of lettercase does not apply to bytes in a binary string. To perform lettercase conversion, the string must be converted to a nonbinary string:

  1. mysql> SET NAMES binary;
  2. mysql> SELECT LOWER('aA'), LOWER(CONVERT('aA' USING latin1));
  3. +-------------+-----------------------------------+
  4. | LOWER('aA') | LOWER(CONVERT('aA' USING latin1)) |
  5. +-------------+-----------------------------------+
  6. | aA          | aa                                |
  7. +-------------+-----------------------------------+

Trailing space handling in comparisons.  Most MySQL collations have a pad attribute of PAD SPACE. The exceptions are Unicode collations based on UCA 9.0.0 and higher, which have a pad attribute of NO PAD. (see Section 10.10.1, “Unicode Character Sets”).

To determine the pad attribute for a collation, use the INFORMATION_SCHEMA COLLATIONS table, which has a PAD_ATTRIBUTE column.

The pad attribute determines how trailing spaces are treated for comparison of nonbinary strings (CHAR, VARCHAR, and TEXT values). NO PAD collations treat spaces at the end of strings like any other character. For PAD SPACE collations, trailing spaces are insignificant in comparisons; strings are compared without regard to any trailing spaces:

  1. mysql> SET NAMES utf8 COLLATE utf8_bin;
  2. mysql> SELECT 'a ' = 'a';
  3. +------------+
  4. | 'a ' = 'a' |
  5. +------------+
  6. |          1 |
  7. +------------+

For binary strings, all characters are significant in comparisons, including trailing spaces:

  1. mysql> SET NAMES binary;
  2. mysql> SELECT 'a ' = 'a';
  3. +------------+
  4. | 'a ' = 'a' |
  5. +------------+
  6. |          0 |
  7. +------------+

Trailing space handling for inserts and retrievals.  CHAR(N) columns store nonbinary strings. Values shorter than N characters are extended with spaces on insertion. For retrieval, trailing spaces are removed.

BINARY(N) columns store binary strings. Values shorter than N bytes are extended with 0x00 bytes on insertion. For retrieval, nothing is removed; a value of the declared length is always returned.

  1. mysql> CREATE TABLE t1 (
  2.          a CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin,
  3.          b BINARY(10)
  4.        );
  5. mysql> INSERT INTO t1 VALUES ('a','a');
  6. mysql> SELECT HEX(a), HEX(b) FROM t1;
  7. +--------+----------------------+
  8. | HEX(a) | HEX(b)               |
  9. +--------+----------------------+
  10. | 61     | 61000000000000000000 |
  11. +--------+----------------------+

Rechercher dans le manuel MySQL

Traduction non disponible

Le manuel MySQL n'est pas encore traduit en français sur l'infobrol. Seule la version anglaise est disponible pour l'instant.

Document créé le 26/06/2006, dernière modification le 26/10/2018
Source du document imprimé : https://www.gaudry.be/mysql-rf-charset-binary-collations.html

L'infobrol est un site personnel dont le contenu n'engage que moi. Le texte est mis à disposition sous licence CreativeCommons(BY-NC-SA). Plus d'info sur les conditions d'utilisation et sur l'auteur.

Références

  1. Consulter le document html Langue du document :en Manuel MySQL : https://dev.mysql.com/

Ces références et liens indiquent des documents consultés lors de la rédaction de cette page, ou qui peuvent apporter un complément d'information, mais les auteurs de ces sources ne peuvent être tenus responsables du contenu de cette page.
L'auteur de ce site est seul responsable de la manière dont sont présentés ici les différents concepts, et des libertés qui sont prises avec les ouvrages de référence. N'oubliez pas que vous devez croiser les informations de sources multiples afin de diminuer les risques d'erreurs.

Table des matières Haut