Narrowing down search space for fuzzy searches


(Niklas Therning) #1

Hi,

I have an index with about 10 million documents. I'd like to run a fuzzy
search on those but that just takes too much time (10-20 s). Thankfully
I'm able to narrow down the search space to a few hundred documents
first using a query which takes less than a second. How can I run my
fuzzy search only on those documents that match the narrowing query?
I've tried using a filtered query but it doesn't seem to change the
running time of the fuzzy search at all. The filtered query still takes
10-20 s. Here's what I have tried so far:

{
  query: {
    filtered: {
      query: {text: {"b": ...}},
      filter: {query: {fuzzy: {"a": {value: ...}}}}
    }
  }
}

Is there something I'm doing wrong here? I'm using version 0.18.7.

/Niklas


(Ævar Arnfjörð Bjarmason) #2

Fuzzy search inherently entails brute force. Have you looked into whether
you can use a more efficient method, such as ngram indexes, instead?
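
To illustrate the ngram idea in general terms, here is a minimal Python
sketch. It is not Elasticsearch code and not necessarily how any particular
engine implements it: the function names (ngrams, build_ngram_index,
candidates), the '$' padding character, and the 0.4 overlap threshold are all
assumptions made up for this example. The point is only that misspelled terms
can be found by looking up shared character trigrams in an inverted index,
rather than by comparing the query against every term.

```python
from collections import defaultdict


def ngrams(term, n=3):
    """Return the set of character n-grams of a term.

    The term is lowercased and padded with '$' so that prefixes and
    suffixes produce their own grams (padding char is an assumption).
    """
    padded = "$" * (n - 1) + term.lower() + "$" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_ngram_index(terms, n=3):
    """Map each n-gram to the set of terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        for gram in ngrams(term, n):
            index[gram].add(term)
    return index


def candidates(index, query, n=3, min_overlap=0.4):
    """Return terms sharing at least min_overlap of the query's n-grams.

    Only the posting lists of the query's own grams are visited, so the
    cost does not grow with the number of unrelated terms in the index.
    """
    query_grams = ngrams(query, n)
    counts = defaultdict(int)
    for gram in query_grams:
        for term in index.get(gram, ()):
            counts[term] += 1
    threshold = min_overlap * len(query_grams)
    return sorted(t for t, c in counts.items() if c >= threshold)


if __name__ == "__main__":
    idx = build_ngram_index(["search", "serach", "research", "banana"])
    print(candidates(idx, "search"))
```

The candidate list is approximate, so a real setup would probably still rank
or verify the hits (for example with an edit-distance check), but that check
then runs over a handful of candidates instead of the whole term dictionary.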

