We need to allow leading wildcard searches on our Elasticsearch installation and I'm looking for the best way to handle them.
We currently have one large index (~600 GB) and create a filtered alias for each of our clients, with the filter based on a client id.
The leading wildcard searches currently work but take a long time.
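To make the setup concrete, here is a rough sketch of a filtered alias and a leading wildcard query, written as the request bodies you would send to the `_aliases` and `_search` endpoints. The index name `documents`, the alias name `client_42`, and the field names `client_id` and `title` are hypothetical placeholders, not the real ones.

```python
import json


def filtered_alias_action(index, alias, client_id):
    """Build an _aliases "add" action whose term filter limits the alias to one client."""
    return {
        "add": {
            "index": index,
            "alias": alias,
            "filter": {"term": {"client_id": client_id}},
        }
    }


def leading_wildcard_query(field, suffix):
    """Build a wildcard query that matches terms ending in `suffix`."""
    return {"query": {"wildcard": {field: {"value": "*" + suffix}}}}


# Placeholder names: "documents", "client_42", and "title" are illustrative only.
alias_body = {"actions": [filtered_alias_action("documents", "client_42", 42)]}
search_body = leading_wildcard_query("title", "berg")

print(json.dumps(alias_body, indent=2))
print(json.dumps(search_body, indent=2))
```

The search body would be sent to the alias rather than to the underlying index, so the client filter is applied alongside the wildcard query.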
From what I've been able to find, my options are:
1. Break the large index into smaller indices so that each leading wildcard search has less data to scan.
2. Add a reversed copy of every field that needs leading wildcard support and index it with an edge n-gram filter, so the query can be answered from the index instead of scanning terms.
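The reversed-field option can be sketched roughly as follows. This is a minimal illustration, assuming a typeless mapping (Elasticsearch 7 or later) and a hypothetical `title` field with a `title.reversed` sub-field; the analyzer and filter names are made up for the example. The `reverse` and `edge_ngram` token filters are built into Elasticsearch. The two small helper functions only simulate in plain Python what the analyzers would produce, to show why a search for `*berg` turns into an exact token lookup.

```python
def reversed_field_settings(min_gram=1, max_gram=20):
    """Index settings and mapping for a reversed, edge-n-grammed sub-field (hypothetical names)."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "suffix_edge_ngram": {
                        "type": "edge_ngram",
                        "min_gram": min_gram,
                        "max_gram": max_gram,
                    }
                },
                "analyzer": {
                    # Index time: lowercase, reverse, then emit every prefix of the reversed term.
                    "reversed_index": {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase", "reverse", "suffix_edge_ngram"],
                    },
                    # Search time: lowercase and reverse only, no n-grams.
                    "reversed_search": {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase", "reverse"],
                    },
                },
            }
        },
        "mappings": {
            "properties": {
                "title": {
                    "type": "text",
                    "fields": {
                        "reversed": {
                            "type": "text",
                            "analyzer": "reversed_index",
                            "search_analyzer": "reversed_search",
                        }
                    },
                }
            }
        },
    }


def simulate_index_tokens(value, min_gram=1, max_gram=20):
    """Plain-Python stand-in for the index analyzer: prefixes of the reversed, lowercased value."""
    rev = value.lower()[::-1]
    return [rev[:n] for n in range(min_gram, min(max_gram, len(rev)) + 1)]


def simulate_search_token(suffix):
    """Plain-Python stand-in for the search analyzer: the reversed, lowercased suffix."""
    return suffix.lower()[::-1]


def suffix_query(field, pattern):
    """Rewrite a leading wildcard pattern such as "*berg" into a match query on the reversed field."""
    return {"query": {"match": {field + ".reversed": pattern.lstrip("*")}}}


index_body = reversed_field_settings()
query_body = suffix_query("title", "*berg")
matches = simulate_search_token("berg") in simulate_index_tokens("Goldberg")
```

The extra index size comes from the n-grams: each value in the reversed field is stored as up to `max_gram` tokens. In this sketch, a suffix longer than `max_gram` would not match anything, so that limit would need to cover the longest suffix users are expected to search for.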
With option 1, queries will still take some time to run, but hopefully a reasonable amount.
Option 2 means we would basically need to double the size of our index, which we don't want to do unless we have to.
Questions:
- I've tested smaller indices and the searches are faster, but I don't understand why the result differs from an alias with a filter on a single client id. Wouldn't an alias with a filter limit the scope of the index query for the leading wildcard?
- How realistic is option 2? It seems like it adds a lot of data to the index.
- Is there another approach that I could consider?