levenshtein
(PHP 4 >= 4.0.1, PHP 5, PHP 7)
levenshtein — Calculate Levenshtein distance between two strings
Description
$str1
, string $str2
) : int$str1
, string $str2
, int $cost_ins
, int $cost_rep
, int $cost_del
) : int
The Levenshtein distance is defined as the minimal number of
characters you have to replace, insert or delete to transform
str1
into str2
.
The complexity of the algorithm is O(m*n),
where n and m are the
length of str1
and
str2
(rather good when compared to
similar_text(), which is O(max(n,m)**3),
but still expensive).
In its simplest form the function will take only the two
strings as parameter and will calculate just the number of
insert, replace and delete operations needed to transform
str1
into str2
.
A second variant will take three additional parameters that define the cost of insert, replace and delete operations. This is more general and adaptive than variant one, but not as efficient.
Parameters
-
str1
-
One of the strings being evaluated for Levenshtein distance.
-
str2
-
One of the strings being evaluated for Levenshtein distance.
-
cost_ins
-
Defines the cost of insertion.
-
cost_rep
-
Defines the cost of replacement.
-
cost_del
-
Defines the cost of deletion.
Return Values
This function returns the Levenshtein-Distance between the two argument strings or -1, if one of the argument strings is longer than the limit of 255 characters.
Examples
Example #1 levenshtein() example
<?php
// input misspelled word
$input = 'carrrot';
// array of words to check against
$words = array('apple','pineapple','banana','orange',
'radish','carrot','pea','bean','potato');
// no shortest distance found, yet
$shortest = -1;
// loop through words to find the closest
foreach ($words as $word) {
// calculate the distance between the input word,
// and the current word
$lev = levenshtein($input, $word);
// check for an exact match
if ($lev == 0) {
// closest word is this one (exact match)
$closest = $word;
$shortest = 0;
// break out of the loop; we've found an exact match
break;
}
// if this distance is less than the next found shortest
// distance, OR if a next shortest word has not yet been found
if ($lev <= $shortest || $shortest < 0) {
// set the closest match, and shortest distance
$closest = $word;
$shortest = $lev;
}
}
echo "Input word: $input\n";
if ($shortest == 0) {
echo "Exact match found: $closest\n";
} else {
echo "Did you mean: $closest?\n";
}
?>
The above example will output:
Input word: carrrot Did you mean: carrot?
See Also
- soundex() - Calculate the soundex key of a string
- similar_text() - Calculate the similarity between two strings
- metaphone() - Calculate the metaphone key of a string
English translation
You have asked to visit this site in English. For now, only the interface is translated, but not all the content yet.If you want to help me in translations, your contribution is welcome. All you need to do is register on the site, and send me a message asking me to add you to the group of translators, which will give you the opportunity to translate the pages you want. A link at the bottom of each translated page indicates that you are the translator, and has a link to your profile.
Thank you in advance.
Document created the 30/01/2003, last modified the 26/10/2018
Source of the printed document:https://www.gaudry.be/en/php-rf-levenshtein.html
The infobrol is a personal site whose content is my sole responsibility. The text is available under CreativeCommons license (BY-NC-SA). More info on the terms of use and the author.
References
These references and links indicate documents consulted during the writing of this page, or which may provide additional information, but the authors of these sources can not be held responsible for the content of this page.
The author This site is solely responsible for the way in which the various concepts, and the freedoms that are taken with the reference works, are presented here. Remember that you must cross multiple source information to reduce the risk of errors.