Reverse keyword vs new Wildcard data type

In email wildcard searching:


Would reverse keyword be faster in searching than wildcard given a single wildcard?

For example:
moc.niamod@* on reverse keyword
*@domain.com on wildcard

For this specific use case, reverse keyword is fastest.

It's all about index and query alignment.
If you build your own index and ensure the related queries are treated the same way, then you can use an index with reversed tokens efficiently for the leading-wildcard search (rewritten as moc.niamod@*) and a different index with non-reversed tokens to make a bill.gates@* search efficient. If you use the wrong query on the wrong index, you end up scanning all unique values in the index rather than being able to quickly seek to the relevant parts. There are no safety rails to prevent you from picking the wrong field.
That said, if you bothered to create the appropriate index and the searcher picked the right field, then it should be quick, and there's no need to verify each of the docs that have that term.
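
To illustrate the seek-versus-scan difference, here is a minimal Python sketch. It is not how Lucene stores terms internally; it only assumes a term dictionary behaves like a sorted list of unique values. The function names (`build_sorted_terms`, `prefix_seek`, `suffix_scan`) and the sample emails are made up for the example.

```python
from bisect import bisect_left

def build_sorted_terms(values, reverse=False):
    """Sorted unique terms, optionally reversed (a stand-in for a term dictionary)."""
    return sorted({v[::-1] if reverse else v for v in values})

def prefix_seek(sorted_terms, prefix):
    """Right query on the right index: binary-search to the prefix, read until it stops matching."""
    i = bisect_left(sorted_terms, prefix)
    matches = []
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches

def suffix_scan(sorted_terms, suffix):
    """Wrong query for this index: every unique term has to be examined."""
    return [t for t in sorted_terms if t.endswith(suffix)]

# Hypothetical sample data
emails = ["bill.gates@domain.com", "ann@domain.com", "bob@other.org"]
reversed_terms = build_sorted_terms(emails, reverse=True)
forward_terms = build_sorted_terms(emails)

# *@domain.com becomes the prefix query moc.niamod@* on the reversed index
by_seek = sorted(t[::-1] for t in prefix_seek(reversed_terms, "@domain.com"[::-1]))
# The same question asked of the non-reversed index forces a full scan
by_scan = sorted(suffix_scan(forward_terms, "@domain.com"))
```

Both approaches return the same emails; the difference is that `prefix_seek` touches only the matching range, while `suffix_scan` visits every term.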

The wildcard field is a more general-purpose index. One index handles leading and trailing wildcards (and arbitrarily complex regex) searches. It has ngram index entries for every character position in the string to make it fast. The downside is that a match on this ngram index alone is not sufficient evidence of a match: for each candidate doc, the engine also needs to verify that the search expression really is in that doc with a follow-up test (this happens automatically behind the scenes). If you have many docs containing the search term, we will have to test each of them. Small numbers of matches will be much quicker.
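
The candidate-then-verify idea can be sketched in Python roughly as follows. This is an illustration only, not the actual wildcard field implementation: the n-gram length of 3, the function names, and the sample docs are assumptions, and only `*` wildcards are handled.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

N = 3  # n-gram length; an assumption for this sketch

def ngrams(text, n=N):
    """All n-grams of text, one per character position."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def build_ngram_index(docs):
    """Map each n-gram to the set of doc ids whose value contains it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for gram in ngrams(value):
            index[gram].add(doc_id)
    return index

def search(docs, index, pattern):
    """Return (candidates, verified) for a pattern that uses only '*' wildcards."""
    # Step 1: n-grams from the literal parts of the pattern narrow the candidates.
    required = set()
    for literal in pattern.split("*"):
        required |= ngrams(literal)
    candidates = set(docs)
    for gram in required:
        candidates &= index.get(gram, set())
    # Step 2: n-gram hits are not proof, so each candidate is checked against the full value.
    verified = {d for d in candidates if fnmatchcase(docs[d], pattern)}
    return candidates, verified

# Hypothetical sample data
docs = {1: "bill.gates@domain.com", 2: "ann@domain.com.au", 3: "bob@other.org"}
index = build_ngram_index(docs)
candidates, verified = search(docs, index, "*@domain.com")
```

Here doc 2 contains every n-gram of `@domain.com` and so becomes a candidate, but it fails verification because the value does not end there. The cost of step 2 grows with the number of candidates, which is why large result sets are slower.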

So it depends. It's unlikely to beat a carefully custom-designed combo of index and query but there's more versatility and less need for multiple data structures.

So, to generalize the answer: a reverse keyword search would be faster, given more matching results, for the specific case of searching for emails that belong to a domain?
This assumes the correct field is searched by the program/website/system/user with the appropriate query.

Yes. The wildcard field would not be faster for that specific use case with that specific indexing.
The plus is that the wildcard field would also support fast bill.gates@*, *@domain.com, bill*@domain.*, [b|w]ill.*domain\.(org|com) etc. queries without needing additional indexed fields.
