Narrowing down search space for fuzzy searches

(Niklas Therning) #1


I have an index with about 10 million documents. I'd like to run a fuzzy
search on those but that just takes too much time (10-20 s). Thankfully
I'm able to narrow down the search space to a few hundred documents
first using a query which takes less than a second. How can I run my
fuzzy search only on those documents that match the narrowing query?
I've tried using a filtered query but it doesn't seem to change the
running time of the fuzzy search at all. The filtered query still takes
10-20 s. Here's what I have tried so far:


Is there something I'm doing wrong here? I'm using version 0.18.7.


(Ævar Arnfjörð Bjarmason) #2

Fuzzy search inherently entails brute force. Have you looked into whether
you can use more efficient methods, like ngram indexes, instead?
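
To illustrate the ngram idea outside of any search engine, here is a minimal Python sketch. It is not Elasticsearch code: the function names (trigrams, build_index, fuzzy_lookup), the "$" padding, and the 0.4 similarity threshold are all made up for this example. The point is only that an inverted index from character trigrams to terms lets a lookup touch just the terms sharing a trigram with the query, rather than comparing the query against every term.

```python
from collections import defaultdict


def trigrams(term):
    """Return the set of character trigrams of a term, padded with '$'."""
    padded = f"$${term.lower()}$$"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def build_index(terms):
    """Map every trigram to the set of terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        for gram in trigrams(term):
            index[gram].add(term)
    return index


def fuzzy_lookup(index, query, min_similarity=0.4):
    """Return (term, score) pairs ranked by Jaccard similarity of trigram sets.

    Only terms sharing at least one trigram with the query are scored,
    so the rest of the vocabulary is never visited.
    """
    query_grams = trigrams(query)

    # Count how many query trigrams each candidate term shares.
    shared = defaultdict(int)
    for gram in query_grams:
        for term in index.get(gram, ()):
            shared[term] += 1

    results = []
    for term, count in shared.items():
        union = len(query_grams) + len(trigrams(term)) - count
        score = count / union
        if score >= min_similarity:
            results.append((term, score))

    # Best score first; ties broken alphabetically for stable output.
    return sorted(results, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    vocabulary = ["elasticsearch", "elastic", "plastic", "lucene"]
    index = build_index(vocabulary)
    for term, score in fuzzy_lookup(index, "elasticsaerch"):
        print(f"{term}: {score:.2f}")
```

In a search engine the same effect would come from analyzing the field into ngrams at index time and running an ordinary term-based query against it. The exact analyzer settings depend on the version in use and are not shown here.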
