In PostgreSQL there's a LIKE clause that lets you wildcard a search term, e.g. WHERE content LIKE '%blue%'. This matches any record whose content column contains "blue".
I was thinking that with Elasticsearch I could get away with something less complex if I need to: check whether any word either starts or ends with the non-analyzed search term "blue".
I started looking into edgeNGram, which works well for front-side, autocomplete-style searching. That handles the case where I need to find a word that starts with "blue", but it lacks the ends-with-"blue" logic.
For example, edgeNGram tokenizes the term "buy blueshield" as:
['b', 'bu', 'buy', 'b', 'bl', 'blu', 'blue', 'blues', 'bluesh', 'blueshi', 'blueshie', 'blueshiel', 'blueshield']
So searching for the non-analyzed term "blue" would indeed match here. But what if "blue" were at the end of the word instead, as in "shieldblue"? The term "buy shieldblue" would tokenize to:
['s', 'sh', 'shi', 'shie', 'shiel', 'shield', 'shieldb', 'shieldbl', 'shieldblu', 'shieldblue', 'b', 'bu', 'buy']
This wouldn't be a hit for the non-analyzed search term "blue".
So I'm assuming that to achieve this I'd have to write my own tokenizer? If so, does that have to be done in Java, or is there a way to do it via the _settings API?
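To make the behaviour concrete, here's a rough sketch of what a front-side edgeNGram filter does with min_gram=1 (my own approximation for illustration, not Elasticsearch's actual implementation):

```python
def edge_ngrams(text, min_gram=1, max_gram=20):
    """Approximate a front-side edgeNGram filter over whitespace tokens."""
    grams = []
    for token in text.lower().split():
        # Emit every prefix of the token from min_gram up to max_gram chars.
        for n in range(min_gram, min(max_gram, len(token)) + 1):
            grams.append(token[:n])
    return grams

# "blue" appears as a prefix gram of "blueshield"...
print("blue" in edge_ngrams("buy blueshield"))   # True
# ...but never as a gram of "shieldblue", since only prefixes are emitted.
print("blue" in edge_ngrams("buy shieldblue"))   # False
```

This is exactly the asymmetry in the token lists above: prefixes of each word are indexed, so suffix matches are invisible.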
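For reference, this is the kind of analyzer definition I was hoping could be expressed purely in index settings rather than Java (a sketch only; the filter and analyzer names here are my own placeholders, and I haven't verified this exact shape works for my case):

```json
{
  "analysis": {
    "filter": {
      "front_edge": { "type": "edgeNGram", "min_gram": 1, "max_gram": 20 }
    },
    "analyzer": {
      "edge_analyzer": {
        "tokenizer": "standard",
        "filter": ["lowercase", "front_edge"]
      }
    }
  }
}
```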