Leading wildcard search handling

We need to allow leading wildcard searches on our Elasticsearch installation and I'm looking for the best way to handle them.

We currently have one large index (~600 GB) and create a filtered alias for each of our clients, based on a client id.
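For context, the filtered-alias setup described above looks roughly like the sketch below. It only builds the request body for the `_aliases` endpoint. The index name `documents`, the field name `client_id`, and the `client_<id>` alias naming are placeholders for illustration, not our real names.

```python
def filtered_alias_actions(index, client_id, client_field="client_id"):
    """Build an _aliases request body that adds one filtered alias per client.

    All names are hypothetical: `index`, `client_field`, and the
    "client_<id>" alias naming scheme are assumptions for this sketch.
    """
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": f"client_{client_id}",
                    # The term filter is applied to every search sent to the alias.
                    "filter": {"term": {client_field: client_id}},
                }
            }
        ]
    }
```

The resulting dictionary would be sent as the JSON body of a `POST /_aliases` request, for example `filtered_alias_actions("documents", 42)`.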

The leading wildcard searches currently work but take a long time.

From what I've been able to find, my options are:

  1. Break the large index into smaller indices to speed up wildcard searches.
  2. Add a reversed copy of every field that needs leading wildcard support and apply an edge n-gram filter to it, so the query can be answered from the index.

With option 1 the queries will still take some time to run, but hopefully a reasonable amount.

Option 2 means that we would basically need to double the size of our index, which isn't something we want to do unless we have to.
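To make option 2 concrete, here is a minimal sketch of what I have in mind, assuming a single text field called `title` (a placeholder). The analyzer and filter names (`reverse_edge_index`, `reverse_search`, `suffix_edge_ngram`) and the gram sizes are made up for the example. The idea is that the index side stores edge n-grams of each reversed term, while the search side only reverses the term, so `*ing` becomes an ordinary match on `gni`.

```python
# Index settings and mapping (hypothetical names, illustrative gram sizes).
INDEX_BODY = {
    "settings": {
        "analysis": {
            "filter": {
                # Suffixes longer than max_gram would not match with this setup.
                "suffix_edge_ngram": {"type": "edge_ngram", "min_gram": 1, "max_gram": 20}
            },
            "analyzer": {
                "reverse_edge_index": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "reverse", "suffix_edge_ngram"],
                },
                "reverse_search": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "reverse"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "fields": {
                    "reversed": {
                        "type": "text",
                        "analyzer": "reverse_edge_index",
                        "search_analyzer": "reverse_search",
                    }
                },
            }
        }
    },
}


def leading_wildcard_query(field, pattern):
    """Turn a pattern such as '*ing' into a match query on '<field>.reversed'."""
    if not pattern.startswith("*"):
        raise ValueError("expected a pattern with a leading '*'")
    suffix = pattern[1:]
    if not suffix or "*" in suffix or "?" in suffix:
        raise ValueError("only a single leading '*' is handled in this sketch")
    # The search analyzer reverses the suffix, so it is passed through unchanged.
    return {"match": {f"{field}.reversed": {"query": suffix}}}


def simulate_index_tokens(term, min_gram=1, max_gram=20):
    """Pure-Python approximation of lowercase + reverse + edge_ngram for one term."""
    rev = term.lower()[::-1]
    return [rev[:n] for n in range(min_gram, min(max_gram, len(rev)) + 1)]
```

For example, `leading_wildcard_query("title", "*ing")` builds a match query on `title.reversed`, and `simulate_index_tokens("Searching")` includes `gni`, which is the reversed suffix the search analyzer would produce. The extra sub-field is where the index growth mentioned above comes from.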

Questions:

  1. I've tested smaller indices and the searches are faster, but I don't understand why an alias with a filter on a single client id doesn't behave the same way. Wouldn't the alias filter limit the scope of the leading wildcard query?
  2. How realistic is option 2? It seems to add a lot of data to the index.
  3. Is there another approach that I could consider?

What for? Can you explain the use case for it?

Add a reversed field for all the fields we want the leading wildcard to work for and use an edge n-gram filter to run the query using the index.

Definitely, I'd go that way. Have a look at:

Thanks!

We migrated off Ferret search. We currently offer the ability to search on the end of terms, and we don't want to remove that functionality.
