Hi,
I am new to ES and have been doing some simple testing of fuzzy matching. I
have a question about edit distance: does Elasticsearch use Levenshtein
distance or Damerau–Levenshtein distance?
For example, I have the following text stored in an index (analyzer: simple):
AARONS
When I search using 'arosn', the text is not found. The queries that I have
been testing with are as follows:
{
  "size": 50,
  "query": {
    "fuzzy": {
      "surname": {
        "value": "arosn",
        "fuzziness": 2,
        "prefix_length": 1,
        "max_expansions": 100
      }
    }
  }
}
and
{
  "size": 50,
  "query": {
    "match": {
      "surname": {
        "query": "arosn",
        "fuzziness": 2
      }
    }
  }
}
{
  "size": 50,
  "query": {
    "match": {
      "surname": {
        "query": "arosn~",
        "fuzziness": 2
      }
    }
  }
}
{
  "size": 50,
  "query": {
    "query_string": {
      "default_field": "surname",
      "fuzziness": 2,
      "query": "arosn~2"
    }
  }
}
If the Damerau–Levenshtein distance algorithm were in use, then I would
expect this to match with a distance of two:
arosn + (a) -> aarosn + swap (n & s) -> aarons
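To double-check the arithmetic above, here is a minimal Python sketch that computes both distances for this pair. It is only an illustration of the two definitions, not how Elasticsearch or Lucene implements fuzzy matching; the function names `levenshtein` and `osa_distance` are my own, and the second one is the restricted form of Damerau–Levenshtein (optimal string alignment, adjacent swaps only).

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions and substitutions only."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein: as above, plus adjacent transpositions."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            # An adjacent swap (e.g. "sn" <-> "ns") counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(levenshtein("arosn", "aarons"))   # 3
print(osa_distance("arosn", "aarons"))  # 2
```

So with transpositions counted as one edit the pair is within a fuzziness of 2, while under plain Levenshtein it is at distance 3 and would fall outside it.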
I am a little confused, as the documentation does refer to Damerau–Levenshtein:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_fuzziness
So any ideas on how I can get Damerau–Levenshtein to work?
Thanks