Rechercher dans le manuel MySQL
6.4.8.4 MySQL Enterprise Data Masking and De-Identification User-Defined Function Reference
The MySQL Enterprise Data Masking and De-Identification plugin library includes several user-defined functions (UDFs), which may be grouped into these categories:
These UDFs treat string arguments as binary strings, which means
they are implicitly case sensitive. In addition, string UDF
return values are binary strings. If a string return value
should be in a different character set, convert it. The
following example shows how to convert the result of
gen_rnd_email()
to the
utf8mb4
character set:
It may also be necessary to convert string arguments, as illustrated in Using Masked Data for Customer Identification.
Data Masking Functions
Each function in this section performs a masking operation on its string argument and returns the masked result.
mask_inner(
str
,margin1
,margin2
[,mask_char
])Masks the interior part of a string, leaving the ends untouched, and returns the result. An optional masking character can be specified.
Arguments:
str
: The string to mask.margin1
: A nonnegative integer that specifies the number of characters on the left end of the string to remain unmasked. If the value is 0, no left end characters remain unmasked.margin2
: A nonnegative integer that specifies the number of characters on the right end of the string to remain unmasked. If the value is 0, no right end characters remain unmasked.mask_char
: (Optional) The single character to use for masking. The default is'X'
ifmask_char
is not given.Because UDF string arguments are treated as binary strings, the masking character must be a single-byte character. Attempts to use a multibyte character produce an error.
Return value:
The masked string, or
NULL
if either margin is negative.If the sum of the margin values is larger than the argument length, no masking occurs and the argument is returned unchanged.
Example:
- +----------------------------+---------------------------+
- | mask_inner('abcdef', 1, 2) | mask_inner('abcdef',0, 5) |
- +----------------------------+---------------------------+
- | aXXXef | Xbcdef |
- +----------------------------+---------------------------+
- +---------------------------------+--------------------------------+
- | mask_inner('abcdef', 1, 2, '*') | mask_inner('abcdef',0, 5, '#') |
- +---------------------------------+--------------------------------+
- | a***ef | #bcdef |
- +---------------------------------+--------------------------------+
mask_outer(
str
,margin1
,margin2
[,mask_char
])Masks the left and right ends of a string, leaving the interior unmasked, and returns the result. An optional masking character can be specified.
Arguments:
str
: The string to mask.margin1
: A nonnegative integer that specifies the number of characters on the left end of the string to mask. If the value is 0, no left end characters are masked.margin2
: A nonnegative integer that specifies the number of characters on the right end of the string to mask. If the value is 0, no right end characters are masked.mask_char
: (Optional) The single character to use for masking. The default is'X'
ifmask_char
is not given.Because UDF string arguments are treated as binary strings, the masking character must be a single-byte character. Attempts to use a multibyte character produce an error.
Return value:
The masked string, or
NULL
if either margin is negative.If the sum of the margin values is larger than the argument length, the entire argument is masked.
Example:
- +----------------------------+---------------------------+
- | mask_outer('abcdef', 1, 2) | mask_outer('abcdef',0, 5) |
- +----------------------------+---------------------------+
- | XbcdXX | aXXXXX |
- +----------------------------+---------------------------+
- +---------------------------------+--------------------------------+
- | mask_outer('abcdef', 1, 2, '*') | mask_outer('abcdef',0, 5, '#') |
- +---------------------------------+--------------------------------+
- | *bcd** | a##### |
- +---------------------------------+--------------------------------+
Masks a payment card Primary Account Number and returns the number with all but the last four digits replaced by
'X'
characters.Arguments:
str
: The string to mask. The string must be a suitable length for the Primary Account Number, but is not otherwise checked.
Return value:
The masked payment number as a string. If the argument is shorter than required, it is returned unchanged.
Example:
- +-------------------------+
- | mask_pan(gen_rnd_pan()) |
- +-------------------------+
- | XXXXXXXXXXXX9102 |
- +-------------------------+
- +---------------------------+
- | mask_pan(gen_rnd_pan(19)) |
- +---------------------------+
- | XXXXXXXXXXXXXXX8268 |
- +---------------------------+
- +-----------------+
- | mask_pan('a*Z') |
- +-----------------+
- | a*Z |
- +-----------------+
Masks a payment card Primary Account Number and returns the number with all but the first six and last four digits replaced by
'X'
characters. The first six digits indicate the payment card issuer.Arguments:
str
: The string to mask. The string must be a suitable length for the Primary Account Number, but is not otherwise checked.
Return value:
The masked payment number as a string. If the argument is shorter than required, it is returned unchanged.
Example:
- +---------------------------------+
- | mask_pan_relaxed(gen_rnd_pan()) |
- +---------------------------------+
- | 551279XXXXXX3108 |
- +---------------------------------+
- +-----------------------------------+
- | mask_pan_relaxed(gen_rnd_pan(19)) |
- +-----------------------------------+
- | 462634XXXXXXXXX6739 |
- +-----------------------------------+
- +-------------------------+
- | mask_pan_relaxed('a*Z') |
- +-------------------------+
- | a*Z |
- +-------------------------+
Masks a U.S. Social Security number and returns the number with all but the last four digits replaced by
'X'
characters.Arguments:
str
: The string to mask. The string must be 11 characters long, but is not otherwise checked.
Return value:
The masked Social Security number as a string, or
NULL
if the argument is not the correct length.Example:
- +-------------------------+-------------------------+
- | mask_ssn('909-63-6922') | mask_ssn('abcdefghijk') |
- +-------------------------+-------------------------+
- | XXX-XX-6922 | XXX-XX-hijk |
- +-------------------------+-------------------------+
- +-----------------+
- | mask_ssn('909') |
- +-----------------+
- +-----------------+
The functions in this section generate random values for
different types of data. When possible, generated values have
characteristics reserved for demonstration or test values, to
avoid having them mistaken for legitimate data. For example,
gen_rnd_us_phone()
returns a
U.S. phone number that uses the 555 area code, which is not
assigned to phone numbers in actual use. Individual function
descriptions describe any exceptions to this principle.
Generates a random number chosen from a specified range.
Arguments:
lower
: An integer that specifies the lower boundary of the range.upper
: An integer that specifies the upper boundary of the range, which must not be less than the lower boundary.
Return value:
A random integer in the range from
lower
toupper
, inclusive, orNULL
if theupper
argument is less thanlower
.Example:
- +---------------------+------------------------+
- | gen_range(100, 200) | gen_range(-1000, -800) |
- +---------------------+------------------------+
- | 177 | -917 |
- +---------------------+------------------------+
- +-----------------+
- | gen_range(1, 0) |
- +-----------------+
- +-----------------+
Generates a random email address in the
example.com
domain.Arguments:
None.
Return value:
A random email address as a string.
Example:
- +---------------------------+
- | gen_rnd_email() |
- +---------------------------+
- | ijocv.mwvhhuf@example.com |
- +---------------------------+
Generates a random payment card Primary Account Number. The number passes the Luhn check (an algorithm that performs a checksum verification against a check digit).
WarningValues returned from
gen_rnd_pan()
should be used only for test purposes, and are not suitable for publication. There is no way to guarantee that a given return value is not assigned to a legitimate payment account. Should it be necessary to publish agen_rnd_pan()
result, consider masking it withmask_pan()
ormask_pan_relaxed()
.Arguments:
size
: (Optional) An integer that specifies the size of the result. The default is 16 ifsize
is not given. If given,size
must be an integer in the range from 12 to 19.
Return value:
A random payment number as a string, or
NULL
if asize
argument outside the permitted range is given.Example:
- +-------------------------+
- | mask_pan(gen_rnd_pan()) |
- +-------------------------+
- | XXXXXXXXXXXX5805 |
- +-------------------------+
- +---------------------------+
- | mask_pan(gen_rnd_pan(19)) |
- +---------------------------+
- | XXXXXXXXXXXXXXX5067 |
- +---------------------------+
- +---------------------------------+
- | mask_pan_relaxed(gen_rnd_pan()) |
- +---------------------------------+
- | 398403XXXXXX9547 |
- +---------------------------------+
- +-----------------------------------+
- | mask_pan_relaxed(gen_rnd_pan(19)) |
- +-----------------------------------+
- | 578416XXXXXXXXX6509 |
- +-----------------------------------+
- +-----------------+-----------------+
- | gen_rnd_pan(11) | gen_rnd_pan(20) |
- +-----------------+-----------------+
- +-----------------+-----------------+
Generates a random U.S. Social Security number in
format. TheAAA
-BB
-CCCC
AAA
part is greater than 900 and theBB
part is less than 70, which are characteristics not used for legitimate Social Security numbers.Arguments:
None.
Return value:
A random Social Security number as a string.
Example:
- +---------------+
- | gen_rnd_ssn() |
- +---------------+
- | 951-26-0058 |
- +---------------+
Generates a random U.S. phone number in
1-555-
format. The 555 area code is not used for legitimate phone numbers.AAA
-BBBB
Arguments:
None.
Return value:
A random U.S. phone number as a string.
Example:
- +--------------------+
- | gen_rnd_us_phone() |
- +--------------------+
- | 1-555-682-5423 |
- +--------------------+
The functions in this section manipulate dictionaries of terms
and perform generation and masking operations based on them.
Some of these functions require the
SUPER
privilege.
When a dictionary is loaded, it becomes part of the dictionary registry and is assigned a name to be used by other dictionary functions. Dictionaries are loaded from plain text files containing one term per line. Empty lines are ignored. To be valid, a dictionary file must contain at least one nonempty line.
gen_blacklist(
str
,dictionary_name
,replacement_dictionary_name
)Replaces a term present in one dictionary with a term from a second dictionary and returns the replacement term. This masks the original term by substitution.
Arguments:
str
: A string that indicates the term to replace.dictionary_name
: A string that names the dictionary containing the term to replace.replacement_dictionary_name
: A string that names the dictionary from which to choose the replacement term.
Return value:
A string randomly chosen from
replacement_dictionary_name
as a replacement forstr
, orstr
if it does not appear indictionary_name
, orNULL
if either dictionary name is not in the dictionary registry.If the term to replace appears in both dictionaries, it is possible for the return value to be the same term.
Example:
- +---------------------------------------------------+
- | gen_blacklist('Berlin', 'DE_Cities', 'US_Cities') |
- +---------------------------------------------------+
- | Phoenix |
- +---------------------------------------------------+
gen_dictionary(
dictionary_name
)Returns a random term from a dictionary.
Arguments:
dictionary_name
: A string that names the dictionary from which to choose the term.
Return value:
A random term from the dictionary as a string, or
NULL
if the dictionary name is not in the dictionary registry.Example:
- +--------------------------+
- | gen_dictionary('mydict') |
- +--------------------------+
- | My term |
- +--------------------------+
- +--------------------------------+
- | gen_dictionary('no-such-dict') |
- +--------------------------------+
- +--------------------------------+
gen_dictionary_drop(
dictionary_name
)Removes a dictionary from the dictionary registry.
This function requires the
SUPER
privilege.Arguments:
dictionary_name
: A string that names the dictionary to remove from the dictionary registry.
Return value:
A string that indicates whether the drop operation succeeded.
Dictionary removed
indicates success.Dictionary removal error
indicates failure.Example:
- +-------------------------------+
- | gen_dictionary_drop('mydict') |
- +-------------------------------+
- | Dictionary removed |
- +-------------------------------+
- +-------------------------------------+
- | gen_dictionary_drop('no-such-dict') |
- +-------------------------------------+
- | Dictionary removal error |
- +-------------------------------------+
gen_dictionary_load(
dictionary_path
,dictionary_name
)Loads a file into the dictionary registry and assigns the dictionary a name to be used with other functions that require a dictionary name argument.
This function requires the
SUPER
privilege.ImportantDictionaries are not persistent. Any dictionary used by applications must be loaded for each server startup.
Once loaded into the registry, a dictionary is used as is, even if the underlying dictionary file changes. To reload a dictionary, first drop it with
gen_dictionary_drop()
, then load it again withgen_dictionary_load()
.Arguments:
dictionary_path
: A string that specifies the path name of the dictionary file.dictionary_name
: A string that provides a name for the dictionary.
Return value:
A string that indicates whether the load operation succeeded.
Dictionary load success
indicates success.Dictionary load error
indicates failure. Dictionary load failure can occur for several reasons, including:A dictionary with the given name is already loaded.
The dictionary file is not found.
The dictionary file contains no terms.
The
secure_file_priv
system variable is set and the dictionary file is not located in the directory named by the variable.
Example:
- +---------------------------------------------------------------------+
- | gen_dictionary_load('/usr/local/mysql/mysql-files/mydict','mydict') |
- +---------------------------------------------------------------------+
- +---------------------------------------------------------------------+
- +-----------------------------------------+
- | gen_dictionary_load('/dev/null','null') |
- +-----------------------------------------+
- +-----------------------------------------+
Document created the 26/06/2006, last modified the 26/10/2018
Source of the printed document:https://www.gaudry.be/en/mysql-rf-data-masking-reference.html
The infobrol is a personal site whose content is my sole responsibility. The text is available under CreativeCommons license (BY-NC-SA). More info on the terms of use and the author.
References
These references and links indicate documents consulted during the writing of this page, or which may provide additional information, but the authors of these sources can not be held responsible for the content of this page.
The author This site is solely responsible for the way in which the various concepts, and the freedoms that are taken with the reference works, are presented here. Remember that you must cross multiple source information to reduce the risk of errors.