So I have a general design question. I have long been indexing documents where all string fields are generally stored as keywords. However, we now want to let users enter a string and search across all fields for a match (case insensitive, stemmed, fuzzy).
I have a REST interface between my client and Elasticsearch so I can intercept the program specific search object and map it to an Elasticsearch specific search.
I think I have two options:
- Begin to index a summary text field on each document, which is a self-built composite of all the fields I would search across. These queries would then execute against that summary field.
- Index each field as both a keyword, to support searching as we have done to date, and as a text field. This new search would then need a list of all the fields to search and would OR the per-field clauses together. For example, searching for bob would turn into something like "name:bob OR employeeid:bob OR nickname:bob".
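To make option 2 concrete, here is a rough sketch of what I imagine my REST layer would generate. It only builds the request bodies as Python dicts and does not talk to a cluster. The field names come from the bob example; the sub-field name "text", the "english" analyzer (for stemming and lowercasing), and the "AUTO" fuzziness are my assumptions, not something I have settled on. It uses multi-fields in the mapping and a multi_match query in place of the hand-written OR.

```python
# Hypothetical field names, taken from the "bob" example above.
SEARCH_FIELDS = ["name", "employeeid", "nickname"]


def keyword_with_text_subfield(analyzer="english"):
    """Mapping for one field: keyword at the top level (existing behaviour),
    plus an analyzed text sub-field (assumed name: "text")."""
    return {
        "type": "keyword",
        "fields": {"text": {"type": "text", "analyzer": analyzer}},
    }


def build_mapping(fields):
    """Index mapping body that applies the keyword + text layout to every field."""
    return {
        "mappings": {
            "properties": {f: keyword_with_text_subfield() for f in fields}
        }
    }


def build_query(user_input, fields):
    """Search body: one multi_match over every <field>.text sub-field,
    which acts like 'name:bob OR employeeid:bob OR nickname:bob'."""
    return {
        "query": {
            "multi_match": {
                "query": user_input,
                "fields": [f"{f}.text" for f in fields],
                "fuzziness": "AUTO",  # assumed setting for fuzzy matching
            }
        }
    }
```

The existing keyword searches would keep targeting name, employeeid, and so on unchanged, while the new search would only touch the .text sub-fields.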
I feel like each has its tradeoffs. (1) allows all documents in all indexes to be searched against the same summary field, even though their other fields may differ, and (2) makes highlighting a bit better because it would show which field matched, rather than needing to glean it from the summary text.
I'm not sure if (2) would perform better, but I would think it would.
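To illustrate the tradeoff, here is a minimal sketch of option 1 next to the highlight clause I would use for option 2. Again these are just request bodies built as Python dicts. The field name "summary", the ".text" sub-field name, and the "AUTO" fuzziness are hypothetical choices of mine.

```python
def build_summary(doc, fields):
    """Option 1: join the values of the chosen fields into one string
    that would be indexed as the (hypothetical) "summary" text field."""
    return " ".join(str(doc[f]) for f in fields if doc.get(f) is not None)


def summary_query(user_input):
    """Option 1: the same query body works for every index, because each
    document carries the same "summary" field regardless of its other fields."""
    return {
        "query": {
            "match": {"summary": {"query": user_input, "fuzziness": "AUTO"}}
        },
        # Highlights can only point into the composite text.
        "highlight": {"fields": {"summary": {}}},
    }


def per_field_highlight(fields):
    """Option 2: request highlights per text sub-field, so each hit's
    highlight section is keyed by the field that actually matched."""
    return {"fields": {f"{f}.text": {} for f in fields}}


doc = {"name": "Bob Smith", "employeeid": 42, "nickname": None}
summary = build_summary(doc, ["name", "employeeid", "nickname"])
```

With option 1 the client would have to work out which original field a highlighted fragment came from, whereas with option 2 the highlight keys (for example name.text) would already say so.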
Any thoughts would be greatly appreciated!