We use a custom analyzer/token-filter on an index to produce some payload data on a specific term that gets added to the inverted index of the field. This payload supports a custom query, but sometimes it would be useful to access this payload for other external purposes.
Is it possible to retrieve the payload of an indexed term as part of the document data retrieved via a search, or a multi-get, or some other way?
Or, alternatively, is it possible to define a plugin that can manipulate the document data added to an index before any analyzer starts to analyze the fields so that we can add the computed data to the document and pass it as payload too?
You can use the term vectors API to retrieve the payloads associated with the analyzed terms of a specific document.
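As a rough illustration, here is a minimal Python sketch of what such a request and its result handling might look like. It assumes the request body and response shape described on the term vectors documentation page, where each token's payload is returned base64-encoded. The field name `body`, the helper names, and the sample response are made up for the example; no cluster is contacted.

```python
import base64
import json


def build_termvectors_request(field):
    """Body for GET /<index>/_termvectors/<id>, asking for payloads on one field."""
    return {
        "fields": [field],
        "positions": True,
        "offsets": True,
        "payloads": True,
        "term_statistics": False,
        "field_statistics": False,
    }


def extract_payloads(response, field):
    """Map each term of `field` to the decoded payloads (bytes) of its tokens."""
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    result = {}
    for term, info in terms.items():
        payloads = [
            base64.b64decode(token["payload"])
            for token in info.get("tokens", [])
            if "payload" in token
        ]
        if payloads:
            result[term] = payloads
    return result


# Hand-written response in the documented shape (hypothetical data).
sample_response = {
    "term_vectors": {
        "body": {
            "terms": {
                "test": {
                    "term_freq": 1,
                    "tokens": [
                        {
                            "position": 0,
                            "start_offset": 0,
                            "end_offset": 4,
                            "payload": "d29yZA==",
                        }
                    ],
                }
            }
        }
    }
}

print(json.dumps(build_termvectors_request("body")))
print(extract_payloads(sample_response, "body"))
```

The payload bytes are whatever your token filter wrote, so how you decode them further depends on your own encoding.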
If it's only for debugging, you can generate them on the fly. If speed matters, you'll need to store them, and keep in mind that storing them comes with a cost.
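For the stored variant, a mapping along these lines should work. This is only a sketch: the field name `body` and analyzer name `my_payload_analyzer` are placeholders, the field type is `text` in recent versions (older releases use `string`), and older releases also nest the mapping under a type name.

```python
import json


def payload_field_mapping(field, analyzer):
    """Mapping fragment that stores term vectors, including payloads, for one field."""
    return {
        "properties": {
            field: {
                "type": "text",  # "string" on older releases
                "term_vector": "with_positions_offsets_payloads",
                "analyzer": analyzer,  # your custom analyzer that emits payloads
            }
        }
    }


# Placeholder names, not taken from the original question.
mapping = payload_field_mapping("body", "my_payload_analyzer")
print(json.dumps(mapping, indent=2))
```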
Check this page for a full description: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-termvectors.html
Thanks! This should do what I need. I appreciate the fast reply!
For completeness, do you know of a way to do the alternative I mentioned, i.e., define a plugin that can manipulate the document data added to an index before any analyzer starts to analyze the fields, so that we can add the computed data to the document and pass it as payload too?
For the purpose of enriching documents before analysis, there's also current "work in progress" on a new type of node, currently called the Ingest Node, which might be worth looking at once it is released. I'm not sure when it is going to be out, but maybe you'd like to take a sneak peek already.
The MapperSizePlugin is interesting as it adds a new _size field to the document. I need to add a new field whose value is an object with a document-dependent set of keys.
Do you happen to know if this is possible/allowed?
I would like to pursue this approach, but it would be nice to know in advance if this cannot work.