Can I retrieve the payload of an indexed term in _search or _mget?


#1

We use a custom analyzer/token-filter on an index to produce some payload data on a specific term that gets added to the inverted index of the field. This payload supports a custom query, but sometimes it would be useful to access this payload for other external purposes.

Is it possible to retrieve the payload of an indexed term as part of the document data retrieved via a search, or a multi-get, or some other way?

Or, alternatively, is it possible to define a plugin that can manipulate the document data added to an index before any analyzer starts to analyze the fields so that we can add the computed data to the document and pass it as payload too?

Thanks!


(Jimferenczi) #2

You can use the term vectors API to retrieve the payloads associated with the analyzed terms of a specific document.
If it's for debugging, you can generate them on the fly; if speed matters, you'll need to store them in the index, and don't forget that storing them increases the index size :wink:
Check this page for a full description:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-termvectors.html
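
To make the suggestion above concrete, here is a minimal sketch of what a term vectors request with payloads enabled could look like, and how the payloads might be pulled out of the response. It only builds the request and parses a response-shaped dictionary; it does not talk to a cluster. The index name, document id, and field name are made up for illustration, and the URL layout assumes a recent Elasticsearch version (older versions also include the document type in the path). Payloads come back base64-encoded, so the sketch decodes them.

```python
import base64


def build_termvectors_request(index, doc_id, fields):
    """Return (path, body) for a term vectors request that asks for payloads.

    Assumption: typeless URL layout of recent Elasticsearch versions.
    """
    path = f"/{index}/_termvectors/{doc_id}"
    body = {
        "fields": list(fields),
        "payloads": True,          # include the per-token payloads
        "positions": True,
        "offsets": False,
        "term_statistics": False,
        "field_statistics": False,
    }
    return path, body


def extract_payloads(response, field):
    """Map each term of `field` to the list of its decoded payloads (bytes)."""
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    result = {}
    for term, info in terms.items():
        payloads = [
            base64.b64decode(token["payload"])
            for token in info.get("tokens", [])
            if "payload" in token
        ]
        if payloads:
            result[term] = payloads
    return result


# Hypothetical response fragment, shaped like a term vectors reply.
sample_response = {
    "term_vectors": {
        "body": {
            "terms": {
                "quick": {
                    "term_freq": 1,
                    "tokens": [{"position": 0, "payload": "d29yZA=="}],
                },
                "fox": {"term_freq": 1, "tokens": [{"position": 1}]},
            }
        }
    }
}

path, body = build_termvectors_request("my-index", "1", ["body"])
payloads = extract_payloads(sample_response, "body")
```

The `path` and `body` can be sent with any HTTP client as a GET or POST. Terms whose tokens carry no payload (like `fox` in the sample) are simply left out of the result.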


#3

Thanks! This should do what I need. I appreciate the fast reply!

For completeness, do you know of a way to do the alternative I mentioned, i.e., define a plugin that can manipulate the document data added to an index before any analyzer starts to analyze the fields, so that we can add the computed data to the document and pass it as payload too?


(Jimferenczi) #4

You can check the mapper plugins:
https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper.html
If you define your own mapper then you should be able to manipulate your field and act on it before analysis.


(Christoph) #5

Hi,

for the purpose of enriching documents before analysis, there's also work in progress on a new type of node, currently called the Ingest Node, which might be worth looking at once it is released. I'm not sure when it is going to be out, but maybe you'd like to take a sneak peek already.
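
As a rough illustration of the enrich-before-analysis idea, the sketch below builds a pipeline definition and a simulate request in the shape the ingest API took when it was later released (in Elasticsearch 5.0). Nothing here was available at the time of this post, so treat it as an assumption about the eventual API rather than something from this thread. The field name and value are hypothetical, and the `set` processor only writes a fixed value; a value computed from the document would need a different processor, such as a script.

```python
def build_pipeline(target_field, value):
    """Return a pipeline definition that adds one field to each document."""
    return {
        "description": "Add a precomputed field before the document is analyzed",
        "processors": [
            # `set` writes a value into the document source; field name is hypothetical.
            {"set": {"field": target_field, "value": value}}
        ],
    }


def build_simulate_request(pipeline, source):
    """Return (path, body) for trying a pipeline on one sample document."""
    path = "/_ingest/pipeline/_simulate"
    body = {"pipeline": pipeline, "docs": [{"_source": source}]}
    return path, body


pipeline = build_pipeline("computed_data", "example-value")
sim_path, sim_body = build_simulate_request(pipeline, {"title": "quick fox"})
```

Because the pipeline runs before the document is indexed, the added field goes through mapping and analysis like any other field in the source.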


#6

Thanks for the pointer!

The MapperSizePlugin is interesting, as it adds a new _size field to the document. In my case, I need to add a new field whose value is an object with a document-dependent set of keys.

Do you happen to know if this is possible/allowed?

I would like to pursue this approach, but it would be nice to know in advance if this cannot work.


#7

Thanks for the reference Christoph. That will be very useful once it is ready.

