Using fuzzy (Levenshtein) matching in a filter

Hi everybody,

I have a short question. Does anybody know how to combine regex queries with a fuzzy (Levenshtein) filter?

I would first like to run a query using regexp (even though I know it slows things down). The result should then be filtered, or perhaps used as the base for a new query that uses fuzziness.

Is there any way to do this? And if so, could somebody give me a hint on how to do it?

As an example:
I have a word in my documents like "waterresistant".
Say my search term is "waserr*". The first two letters must match exactly, the next part ("serr" in the search term, "terr" in the document) should be handled by the fuzzy algorithm and may differ by a Levenshtein distance of 1, and the rest is a wildcard.

It should still match the given word.

Thanks for your help.

BR
Ralf

Just put both in a bool query and it should work. If you put the regex in
the must clause it'll influence scoring. If you put it in a filter clause
it won't.
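
A minimal sketch of that suggestion, written as a Python dict so the request body can be built and inspected. The field name "name" and the terms are borrowed from later in this thread, and the helper build_bool_query is hypothetical, not part of any client library.

```python
import json

# Assumption: the field is called "name", as in the queries posted later in the thread.
FIELD = "name"


def build_bool_query(regex, fuzzy_term, regex_scores=False):
    """Build a bool query combining a regexp clause with a fuzzy match clause.

    regex_scores=True puts the regexp in `must` (it contributes to the score);
    regex_scores=False puts it in `filter` (it only restricts the result set).
    """
    regexp_clause = {"regexp": {FIELD: regex}}
    fuzzy_clause = {"match": {FIELD: {"query": fuzzy_term, "fuzziness": 1}}}

    bool_body = {"must": [fuzzy_clause]}
    if regex_scores:
        bool_body["must"].append(regexp_clause)
    else:
        bool_body["filter"] = [regexp_clause]
    return {"query": {"bool": bool_body}}


body = build_bool_query("wa[a-z]*", "waserr")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a _search request. Both clauses still have to match on their own for a document to come back.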

Hi,

Thanks for your tip, but I did not get the right result.
First I tried the two queries separately:

Query for the wildcard part (I used regexp, but a wildcard query would also be possible):
GET fuzztest/_search
{
  "query": {
    "regexp": {
      "name": "wa[a-z]*"
    }
  }
}

Second, the query for fuzziness:
GET /fuzztest/_search
{
  "query": {
    "match": {
      "name": {
        "query": "waserr",
        "prefix_length": 2,
        "fuzziness": 1,
        "max_expansions": 6
      }
    }
  }
}

The first query returns the document containing waterresistant.
The second does not return any document.

If I combine both as must clauses in a bool query, no document is returned.

Do you have an idea how to get both working together? It should also match documents like "waterresistantclock" or similar.

Any help would be great.

I see! I wasn't reading closely enough. I don't know of a query that does
this. I believe the completion suggester has some features along these
lines, though I'm not too familiar with them.

You could do a fuzzy query on an edge ngram analysis of the field to get a
kind of fuzzy prefix. That is, analyze the field two ways, once with the edge
ngram and once in your normal way.
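
To see why this can help, here is a rough pure-Python simulation, not Elasticsearch itself. It assumes "waterresistant" is indexed as a single term, uses plain Levenshtein distance (Lucene's fuzzy matching also counts a transposition as one edit by default, which does not change the outcome here), and the helpers edge_ngrams and fuzzy_prefix_hits are made up for illustration.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def edge_ngrams(term, min_gram=2, max_gram=20):
    """All prefixes of `term` with lengths from min_gram up to max_gram."""
    return [term[:n] for n in range(min_gram, min(len(term), max_gram) + 1)]


def fuzzy_prefix_hits(query, term, max_edits=1, prefix_length=2):
    """Edge n-grams of `term` within max_edits of `query`, with a fixed exact prefix."""
    hits = []
    for gram in edge_ngrams(term):
        if gram[:prefix_length] != query[:prefix_length]:
            continue  # the first prefix_length characters must match exactly
        if levenshtein(query, gram) <= max_edits:
            hits.append(gram)
    return hits


# Against the whole term the distance is far above 1, so a fuzzy match finds nothing.
print(levenshtein("waserr", "waterresistant"))        # 9
# Against the edge n-grams there is a prefix exactly one edit away.
print(fuzzy_prefix_hits("waserr", "waterresistant"))  # ['waterr']
```

The same prefix "waterr" is also an edge n-gram of "waterresistantclock", so that word would be found the same way.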

Ya, I found a solution.

I created an ngram analyzer for the field, used for both search and indexing, and then used your idea of a bool query for regex and fuzzy. This worked as expected.
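
The exact settings were not posted, so the following is only a guess at what such a setup could look like, written as Python dicts. The names "prefix_grams", "prefix_analyzer" and the sub-field "name.prefix" are invented for illustration, the gram sizes are arbitrary, and the mapping layout follows recent Elasticsearch versions (older releases use a mapping type level and "string" rather than "text").

```python
import json

# Assumed index definition: "name" is analyzed the normal way, and the
# hypothetical sub-field "name.prefix" is analyzed with edge n-grams.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "prefix_grams": {"type": "edge_ngram", "min_gram": 2, "max_gram": 20}
            },
            "analyzer": {
                "prefix_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "prefix_grams"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "fields": {
                    # No separate search_analyzer, so the n-gram analyzer is
                    # used at both index and search time, as described above.
                    "prefix": {"type": "text", "analyzer": "prefix_analyzer"}
                },
            }
        }
    },
}

# Fuzzy match on the n-gram sub-field, regexp on the normally analyzed field.
search_body = {
    "query": {
        "bool": {
            "must": [
                {"match": {"name.prefix": {"query": "waserr",
                                           "fuzziness": 1,
                                           "prefix_length": 2}}}
            ],
            "filter": [{"regexp": {"name": "wa[a-z]*"}}],
        }
    }
}

print(json.dumps(index_body, indent=2))
print(json.dumps(search_body, indent=2))
```

A common variation is to set a non-n-gram search_analyzer on the sub-field so that the search term itself is not split into prefixes; whether that is wanted here depends on how loose the matching should be.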

Thanks for your help.

BR