Best Optimization for wildcard queries

I am working with Elasticsearch on audit log data. I would like to know the best optimizations and tokenizer/analyzer/term usage for speeding up wildcard queries. I have looked at n-grams, which seem promising, but I could not find a proper guide or documentation on optimizing Elasticsearch for my use case.

My use case
We are trying to search for program path strings in log data. For example, our data contains strings like C://user/home/folder1/folder2/folder3/malware, and we want to run a wildcard query such as *malware and have it match the document containing that string.

What would be the best approach (if any) for such a use case? Which tokenizer should we use? I was looking at the wildcard term features. Are they any different from a keyword term?

Any help or guidance would be appreciated.

Have you looked at the wildcard field type?
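
In case it helps, here is a rough sketch of what the mapping and the query could look like, written as Python dicts that serialize to the JSON request bodies. The index name `audit-logs` and the field name `program_path` are made up for illustration, and this assumes a version of Elasticsearch that ships the wildcard field type.

```python
import json

# Hypothetical names, chosen only for illustration.
INDEX_NAME = "audit-logs"
FIELD_NAME = "program_path"


def build_mapping(field):
    """Return an index-creation body that maps `field` as a wildcard field."""
    return {"mappings": {"properties": {field: {"type": "wildcard"}}}}


def build_wildcard_query(field, pattern):
    """Return a search body that runs a wildcard query on `field`."""
    return {"query": {"wildcard": {field: {"value": pattern}}}}


mapping_body = build_mapping(FIELD_NAME)
query_body = build_wildcard_query(FIELD_NAME, "*malware")

# The first body would go to PUT /audit-logs,
# the second to GET /audit-logs/_search.
print(json.dumps(mapping_body, indent=2))
print(json.dumps(query_body, indent=2))
```

The query itself is the same wildcard query you would run against a keyword field; only the field type in the mapping changes.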

I will look into it. Should I pair it with an n-gram tokenizer, or with any other tokenizer that is better suited to wildcard queries?

I believe it uses ngrams behind the scenes, so you do not need to pair it with anything as far as I know.
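
To give an idea of why n-grams help here, below is a toy sketch in Python of the general technique: index every 3-character n-gram of each value, use the n-grams from the literal parts of the pattern as a rough filter, then verify the surviving candidates against the full string. This is only an illustration of the idea, not Elasticsearch's actual implementation; the function names and sample paths are made up.

```python
import fnmatch
import re


def ngrams(text, n=3):
    """Return the set of all n-character substrings of `text`."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(docs, n=3):
    """Map each n-gram to the set of document ids whose value contains it."""
    index = {}
    for doc_id, value in docs.items():
        for gram in ngrams(value, n):
            index.setdefault(gram, set()).add(doc_id)
    return index


def wildcard_search(pattern, docs, index, n=3):
    """Filter candidates by n-grams, then verify against the full value."""
    # Literal pieces of the pattern between the * and ? wildcards.
    literals = [p for p in re.split(r"[*?]", pattern) if len(p) >= n]
    candidates = set(docs)
    for literal in literals:
        for gram in ngrams(literal, n):
            candidates &= index.get(gram, set())
    # The n-gram filter can let false positives through, so check each
    # candidate against the whole pattern. fnmatch also treats [ ] as
    # special characters, which is fine for this toy data.
    return sorted(d for d in candidates if fnmatch.fnmatchcase(docs[d], pattern))


# Made-up sample values.
docs = {
    1: "C://user/home/folder1/folder2/folder3/malware",
    2: "C://user/home/folder1/notepad",
    3: "C://user/malware/readme",
}
index = build_index(docs)
print(wildcard_search("*malware", docs, index))
```

Documents 1 and 3 both contain every n-gram of "malware", so both pass the rough filter, but only document 1 ends with it and survives the verification step.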

Thank you for the help. Appreciate it!
