I have a scenario where I want to index content in Elasticsearch with additional information for each word appearing in the content. For example, if the content is:
Hi there, how are you. I miss you.
We want to index each word in the sentence above as well as the context around each word. For example, 'miss' is a feeling, 'you' is a person, etc. Today we use payloads to store the context as well as the position.
But processing payloads is not very fast, and query performance is poor when the number of documents is huge. Is there any alternative we could try?
Also, the context is fixed: we have a set of things we care about (feeling, person, situation, etc.), and we use a number mapping (0 - feeling, 1 - person, etc.) when adding the detail to the ES payload.
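To make the mapping concrete, here is a minimal Python sketch of the per-word annotation described above. It only builds the records (word, position, context code) and does not talk to Elasticsearch. The names `CONTEXT_CODES`, `WORD_CONTEXT` and `annotate` are hypothetical, the code 2 for "situation" is an assumed continuation of the numbering, and the simple regex tokenizer stands in for whatever analyzer is really used.

```python
import re

# Codes 0 and 1 follow the post; 2 for "situation" is an assumption.
CONTEXT_CODES = {"feeling": 0, "person": 1, "situation": 2}

# Hypothetical word -> context lookup; the real one would come from
# whatever tagging step produces the context today.
WORD_CONTEXT = {"miss": "feeling", "you": "person"}


def annotate(text):
    """Return one record per word: the word, its position, and its context code (or None)."""
    records = []
    for position, match in enumerate(re.finditer(r"\w+", text.lower())):
        word = match.group()
        context = WORD_CONTEXT.get(word)
        records.append({
            "word": word,
            "position": position,
            "context": CONTEXT_CODES[context] if context else None,
        })
    return records


records = annotate("Hi there, how are you. I miss you.")
```

Each record carries the same two pieces of information the payload holds today (context and position), just as plain fields.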
Thanks @Jason_Wee. This might work in cases where the query from the customer is like:
sky nature
where 'sky' is the word that appeared in the content and 'nature' is something that sky represents. But the query from the customer will be like:
nature OR environment
and we give back all results that contain words carrying any of those contexts. In addition, we also want to return an aggregation of how many times we found the nature/environment context. In the result highlighting, we want to highlight the word that represented the searched context.
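For illustration, the three requirements above (match on context, count per context, find the word that carried the context) could be expressed as one request body if each word were stored as a nested object instead of a payload. This is only a sketch under assumptions: the nested field `tokens` and its sub-field `tokens.context` are hypothetical names, the helper `build_context_search` is made up for this example, and nothing here has been measured against payloads, so it may or may not be faster for a large index.

```python
def build_context_search(contexts):
    """Build a search body for documents whose words carry any of the given contexts.

    Assumes a hypothetical mapping where each document has a nested field
    "tokens" and each token has a "context" sub-field (keyword or numeric code).
    """
    nested_path = "tokens"            # assumed nested field name
    context_field = "tokens.context"  # assumed per-token context field
    match_contexts = {"terms": {context_field: contexts}}
    return {
        "query": {
            "nested": {
                "path": nested_path,
                "query": match_contexts,
                # inner_hits returns the matching tokens themselves,
                # which identifies the words to highlight.
                "inner_hits": {},
            }
        },
        "aggs": {
            "tokens": {
                "nested": {"path": nested_path},
                "aggs": {
                    "wanted": {
                        # Count only tokens with a searched context.
                        "filter": match_contexts,
                        "aggs": {
                            "per_context": {"terms": {"field": context_field}}
                        },
                    }
                },
            }
        },
    }


body = build_context_search(["nature", "environment"])
```

The `body` dict would be sent as the search request. The nested query, `inner_hits`, and the nested, filter and terms aggregations are standard query DSL features, while the field layout is the assumed part.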