Rechercher dans le manuel MySQL

10.8.5 The binary Collation Compared to _bin Collations

This section describes how the binary collation for binary strings compares to _bin collations for nonbinary strings.

Binary strings (as stored using the BINARY, VARBINARY, and BLOB data types) have a character set and collation named binary. Binary strings are sequences of bytes and the numeric values of those bytes determine comparison and sort order.

Nonbinary strings (as stored using the CHAR, VARCHAR, and TEXT data types) have a character set and collation other than binary. A given nonbinary character set can have several collations, each of which defines a particular comparison and sort order for the characters in the set. One of these is the binary collation for the character set, indicated by a _bin suffix in the collation name. For example, the binary collations for utf8 and latin1 are named utf8_bin and latin1_bin, respectively. (utf8mb4 is an exception that has two binary collations; see Section 10.10.1, “Unicode Character Sets”.)

The binary collation differs from _bin collations in several respects, discussed in the following sections:

The Unit for Comparison and Sorting

Binary strings are sequences of bytes. For the binary collation, comparison and sorting are based on numeric byte values. Nonbinary strings are sequences of characters, which might be multibyte. Collations for nonbinary strings define an ordering of the character values for comparison and sorting. For _bin collations, this ordering is based on numeric character code values, which is similar to ordering for binary strings except that character code values might be multibyte.

Inhaltsverzeichnis Haut

Character Set Conversion

A nonbinary string has a character set and is automatically converted to another character set in many cases, even when the string has a _bin collation:

  • When assigning column values to another column that has a different character set:

    1. UPDATE t1 SET utf8_bin_column=latin1_column;
    2. INSERT INTO t1 (latin1_column) SELECT utf8_bin_column FROM t2;
  • When assigning column values for INSERT or UPDATE using a string literal:

    1. SET NAMES latin1;
    2. INSERT INTO t1 (utf8_bin_column) VALUES ('string-in-latin1');
  • When sending results from the server to a client:

    1. SET NAMES latin1;
    2. SELECT utf8_bin_column FROM t2;

For binary string columns, no conversion occurs. For cases similar to those preceding, the string value is copied byte-wise.

Inhaltsverzeichnis Haut

Lettercase Conversion

Collations for nonbinary character sets provide information about lettercase of characters, so characters in a nonbinary string can be converted from one lettercase to another, even for _bin collations that ignore lettercase for ordering:

  1. mysql> SET NAMES utf8mb4 COLLATE utf8mb4_bin;
  2. mysql> SELECT LOWER('aA'), UPPER('zZ');
  3. +-------------+-------------+
  4. | LOWER('aA') | UPPER('zZ') |
  5. +-------------+-------------+
  6. | aa          | ZZ          |
  7. +-------------+-------------+

The concept of lettercase does not apply to bytes in a binary string. To perform lettercase conversion, the string must be converted to a nonbinary string:

  1. mysql> SET NAMES binary;
  2. mysql> SELECT LOWER('aA'), LOWER(CONVERT('aA' USING utf8mb4));
  3. +-------------+------------------------------------+
  4. | LOWER('aA') | LOWER(CONVERT('aA' USING utf8mb4)) |
  5. +-------------+------------------------------------+
  6. | aA          | aa                                 |
  7. +-------------+------------------------------------+

Inhaltsverzeichnis Haut

Trailing Space Handling in Comparisons

Many MySQL collations have a pad attribute of PAD SPACE. The Unicode collations based on UCA 9.0.0 and higher have a pad attribute of NO PAD; see Section 10.10.1, “Unicode Character Sets”.

The pad attribute determines how trailing spaces are treated for comparison of nonbinary strings (CHAR, VARCHAR, and TEXT values):

  • For PAD SPACE collations, trailing spaces are insignificant in comparisons; strings are compared without regard to any trailing spaces.

  • NO PAD collations treat spaces at the end of strings like any other character.

The differing behaviors can be demonstrated using the two utf8mb4 binary collations, one of which is PAD SPACE, the other of which is NO PAD. The example also shows how to use the INFORMATION_SCHEMA COLLATIONS table to determine the pad attribute for collations.

  1. mysql> SELECT COLLATION_NAME, PAD_ATTRIBUTE
  2.        FROM INFORMATION_SCHEMA.COLLATIONS
  3.        WHERE COLLATION_NAME LIKE 'utf8mb4%bin';
  4. +------------------+---------------+
  5. | COLLATION_NAME   | PAD_ATTRIBUTE |
  6. +------------------+---------------+
  7. | utf8mb4_bin      | PAD SPACE     |
  8. | utf8mb4_0900_bin | NO PAD        |
  9. +------------------+---------------+
  10. mysql> SET NAMES utf8mb4 COLLATE utf8mb4_bin;
  11. mysql> SELECT 'a ' = 'a';
  12. +------------+
  13. | 'a ' = 'a' |
  14. +------------+
  15. |          1 |
  16. +------------+
  17. mysql> SET NAMES utf8mb4 COLLATE utf8mb4_0900_bin;
  18. mysql> SELECT 'a ' = 'a';
  19. +------------+
  20. | 'a ' = 'a' |
  21. +------------+
  22. |          0 |
  23. +------------+

For binary strings, all characters are significant in comparisons, including trailing spaces:

  1. mysql> SET NAMES binary;
  2. mysql> SELECT 'a ' = 'a';
  3. +------------+
  4. | 'a ' = 'a' |
  5. +------------+
  6. |          0 |
  7. +------------+

Inhaltsverzeichnis Haut

Trailing Space Handling for Inserts and Retrievals

CHAR(N) columns store nonbinary strings. For inserts, values shorter than N characters are extended with spaces. For retrievals, trailing spaces are removed.

BINARY(N) columns store binary strings. For inserts, values shorter than N bytes are extended with 0x00 bytes. For retrievals, nothing is removed; a value of the declared length is always returned.

  1. mysql> CREATE TABLE t1 (
  2.          a CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin,
  3.          b BINARY(10)
  4.        );
  5. mysql> INSERT INTO t1 VALUES ('a','a');
  6. mysql> SELECT HEX(a), HEX(b) FROM t1;
  7. +--------+----------------------+
  8. | HEX(a) | HEX(b)               |
  9. +--------+----------------------+
  10. | 61     | 61000000000000000000 |
  11. +--------+----------------------+

Suchen Sie im MySQL-Handbuch

Deutsche Übersetzung

Sie haben gebeten, diese Seite auf Deutsch zu besuchen. Momentan ist nur die Oberfläche übersetzt, aber noch nicht der gesamte Inhalt.

Wenn Sie mir bei Übersetzungen helfen wollen, ist Ihr Beitrag willkommen. Alles, was Sie tun müssen, ist, sich auf der Website zu registrieren und mir eine Nachricht zu schicken, in der Sie gebeten werden, Sie der Gruppe der Übersetzer hinzuzufügen, die Ihnen die Möglichkeit gibt, die gewünschten Seiten zu übersetzen. Ein Link am Ende jeder übersetzten Seite zeigt an, dass Sie der Übersetzer sind und einen Link zu Ihrem Profil haben.

Vielen Dank im Voraus.

Dokument erstellt 26/06/2006, zuletzt geändert 26/10/2018
Quelle des gedruckten Dokuments:https://www.gaudry.be/de/mysql-rf-charset-binary-collations.html

Die Infobro ist eine persönliche Seite, deren Inhalt in meiner alleinigen Verantwortung liegt. Der Text ist unter der CreativeCommons-Lizenz (BY-NC-SA) verfügbar. Weitere Informationen auf die Nutzungsbedingungen und dem Autor.

Referenzen

  1. Zeigen Sie - html-Dokument Sprache des Dokuments:en Manuel MySQL : https://dev.mysql.com/

Diese Verweise und Links verweisen auf Dokumente, die während des Schreibens dieser Seite konsultiert wurden, oder die zusätzliche Informationen liefern können, aber die Autoren dieser Quellen können nicht für den Inhalt dieser Seite verantwortlich gemacht werden.
Der Autor Diese Website ist allein dafür verantwortlich, wie die verschiedenen Konzepte und Freiheiten, die mit den Nachschlagewerken gemacht werden, hier dargestellt werden. Denken Sie daran, dass Sie mehrere Quellinformationen austauschen müssen, um das Risiko von Fehlern zu reduzieren.

Inhaltsverzeichnis Haut