Using fuzzy (Levenshtein) in filter

Mir_S · July 17, 2016, 1:48pm

Hi everybody,

I have a short question. Does anybody know how to combine a regexp query with a fuzzy (Levenshtein) filter?

I would first like to run a query using regexp (even though I know it slows things down). The result should then be filtered, or perhaps used as the base for a new query that uses fuzziness.

Is there any way to do this? And if so, could somebody give me a hint on how to do it?

As an example:
I have a word in my documents like "waterresistant".
Say my search term is "waserr*". The first two letters ("wa") must match exactly, the next part ("serr", which is "terr" in the document) should go through the fuzzy algorithm and may differ by a Levenshtein distance of 1, and the rest is a wildcard.

It should still match the given word.

Thanks for your help.

BR
Ralf

nik9000 · July 17, 2016, 2:56pm

Just put both in a bool query and it should work. If you put the regex in
the must clause it'll influence scoring. If you put it in a filter clause
it won't.
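
A minimal sketch of that suggestion, written as a Python dict that mirrors the JSON request body. The field name `name`, the pattern, and the search term are assumptions chosen for illustration, and the helper `bool_regexp_fuzzy` is hypothetical; `bool`, `must`, `filter`, `regexp` and `fuzzy` are the standard query DSL keys.

```python
import json

def bool_regexp_fuzzy(field, pattern, term, score_regexp=False):
    """Build a bool query body that combines a regexp and a fuzzy clause.

    score_regexp=False puts the regexp into `filter` (no effect on scoring).
    score_regexp=True puts it into `must`, so it contributes to the score.
    """
    regexp_clause = {"regexp": {field: pattern}}
    fuzzy_clause = {
        "fuzzy": {field: {"value": term, "fuzziness": 1, "prefix_length": 2}}
    }
    bool_body = {"must": [fuzzy_clause]}
    if score_regexp:
        bool_body["must"].append(regexp_clause)
    else:
        bool_body["filter"] = [regexp_clause]
    return {"query": {"bool": bool_body}}

# Assumed example values; send the printed JSON as the body of a _search request.
body = bool_regexp_fuzzy("name", "wa[a-z]*", "waserr")
print(json.dumps(body, indent=2))
```

Either way, both clauses have to match the same document; the only difference is whether the regexp clause takes part in scoring.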

Mir_S · July 17, 2016, 4:14pm

Hi,

Thanks for your tip, but I did not get the right result.
First I tried the two queries separately:

Query for the wildcard part (I used regexp, but a wildcard query would also work):
GET fuzztest/_search
{
  "query": {
    "regexp": {
      "name": "wa[a-z]*"
    }
  }
}

Second, the query for fuzziness:
GET /fuzztest/_search
{
  "query": {
    "match": {
      "name": {
        "query": "waserr",
        "prefix_length": 2,
        "fuzziness": 1,
        "max_expansions": 6
      }
    }
  }
}

The first query returns the document containing "waterresistant".
The second query does not return any document.

If I combine both as must clauses in a bool query, no document is returned either.
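
A likely explanation, sketched in plain Python: fuzziness is measured against whole indexed terms, not against a prefix of them. The sketch assumes the field is analyzed so that "waterresistant" is stored as a single term, and it uses plain Levenshtein distance (it ignores transpositions, which Elasticsearch can also count as one edit).

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]

query = "waserr"
indexed_term = "waterresistant"  # assumed to be indexed as one term

# Distance to the whole term: well above the allowed fuzziness of 1.
print(levenshtein(query, indexed_term))
# Distance to the six-letter prefix "waterr": within fuzziness 1.
print(levenshtein(query, indexed_term[:6]))
```

Because a bool query with two must clauses requires both to match, an empty fuzzy result makes the combined result empty as well.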

Do you have an idea how to get both working together? It should also match documents like "waterresistantclock" or similar.

Any help would be great.

nik9000 · July 17, 2016, 4:42pm

I see! I wasn't reading closely enough. I don't know of a query that does
this. I believe the completion suggester has some features along these
lines, though I'm not too familiar with them.

You could do a fuzzy query on an edge ngram analysis of the field to get a
kind of fuzzy prefix. That is, analyze the field two ways: once with the edge
ngram and once in your normal way.
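
A rough sketch of that idea in Python. The first part simulates what an edge ngram tokenizer would emit for "waterresistant" and checks which prefixes are one substitution away from "waserr". The second part is an index definition as a dict, using the mapping syntax of recent Elasticsearch versions (releases from the time of this thread used a `string` type and a mapping type name). The names `edge_tok`, `edge_analyzer` and the sub-field `prefix`, as well as the gram sizes, are assumptions.

```python
import json

def edge_ngrams(term, min_gram=2, max_gram=10):
    """Prefixes of `term` that are min_gram to max_gram characters long."""
    return [term[:n] for n in range(min_gram, min(max_gram, len(term)) + 1)]

def substitutions(a, b):
    """Number of differing positions, or None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))

grams = edge_ngrams("waterresistant")
# Prefixes that differ from the search term by at most one substituted letter.
close = [g for g in grams if substitutions(g, "waserr") in (0, 1)]
print(close)

# Hypothetical index body: `name` analyzed the normal way, `name.prefix`
# analyzed with edge ngrams so a fuzzy query can hit the prefixes.
index_body = {
    "settings": {"analysis": {
        "tokenizer": {"edge_tok": {
            "type": "edge_ngram", "min_gram": 2, "max_gram": 10,
            "token_chars": ["letter", "digit"],
        }},
        "analyzer": {"edge_analyzer": {
            "type": "custom", "tokenizer": "edge_tok", "filter": ["lowercase"],
        }},
    }},
    "mappings": {"properties": {"name": {
        "type": "text",
        "fields": {"prefix": {
            "type": "text",
            "analyzer": "edge_analyzer",
            "search_analyzer": "standard",
        }},
    }}},
}
print(json.dumps(index_body, indent=2))
```

With prefixes indexed as their own terms, a fuzzy clause on the `prefix` sub-field compares "waserr" against "waterr" instead of the full word.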

Mir_S · July 17, 2016, 4:54pm

Ya, I found a solution.

I created an ngram analyzer for the field, used at both index and search time, and then used your idea of a bool query combining regexp and fuzzy. This worked as expected.
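
One plausible reading of why this works, simulated in Python. The thread does not state the analyzer settings, so the gram size of 3 is an assumption, as are the helper names. When the same ngram analysis runs at index and search time, both words are compared gram by gram, and each gram of the search term only has to be within one edit of some indexed gram.

```python
def ngrams(term, n=3):
    """All substrings of length n, as a fixed-size ngram tokenizer would emit."""
    return [term[i:i + n] for i in range(len(term) - n + 1)]

def within_one_edit(a, b):
    """True if equal-length grams differ in at most one position.

    For grams of equal length, an edit distance of at most 1 can only be
    zero edits or a single substitution.
    """
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= 1

indexed = ngrams("waterresistant")  # grams stored at index time
searched = ngrams("waserr")         # grams produced at search time

# For each search gram, the indexed grams a fuzziness of 1 would reach.
matches = {q: [t for t in indexed if within_one_edit(q, t)] for q in searched}
print(matches)
```

Under these assumptions every gram of "waserr" finds a counterpart in "waterresistant", which fits the observation that the fuzzy clause started matching once the field was ngram-analyzed.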

Thanks for your help.

BR
