Extra processing during every indexing action

Sachin_Shaju · April 22, 2017, 10:49am

I would like to add custom behavior that runs some action or processing before or after every index request to ES, and then updates a field of the inserted document once that processing is done. I've researched the Watcher component in ES and custom plugin development for this, but I couldn't find a suitable option or method. Any help would be appreciated.

shanec · April 22, 2017, 3:00pm

What you're asking is a bit too generic to answer outright. Can you give some more insight into what specifically you're trying to do?

There are a number of potential options that may fit, depending on what that custom action or processing is. One option may be ingest node (https://www.elastic.co/guide/en/elasticsearch/reference/5.3/ingest.html) which has a number of processors already built in (https://www.elastic.co/guide/en/elasticsearch/reference/5.3/ingest-processors.html). If it's not one of the existing processors, it's possible we may be adding one in the future that does what you're looking for, and it's nice to be able to hear what people are trying to do. If not, it may be possible to build your own ingest processor (see https://www.elastic.co/blog/writing-your-own-ingest-processor-for-elasticsearch for information on that). It also may be possible to use a component before indexing into elasticsearch, e.g. Logstash, to do the processing.
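As a rough illustration of the ingest node option, here is a minimal Python sketch that builds (but does not send) the two HTTP requests involved: one that registers a pipeline with the built-in `set` processor, and one that indexes a document through that pipeline via the `pipeline` query parameter. The node address, the pipeline id `mark-unprocessed`, the field name `processed`, and the index/type names are all hypothetical and chosen only for the example.

```python
import json
from urllib.request import Request

# Assumption: a single local node; adjust to your cluster.
ES_URL = "http://localhost:9200"


def put_pipeline_request(pipeline_id, field, value):
    """Build a PUT request that registers an ingest pipeline with one `set` processor."""
    body = {
        "description": "Stamp a default field on every document (illustrative)",
        "processors": [{"set": {"field": field, "value": value}}],
    }
    return Request(
        url=f"{ES_URL}/_ingest/pipeline/{pipeline_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def index_request(index, doc_type, doc_id, doc, pipeline_id):
    """Build a PUT request that indexes `doc` through the named pipeline."""
    return Request(
        url=f"{ES_URL}/{index}/{doc_type}/{doc_id}?pipeline={pipeline_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical names, for illustration only.
pipeline_req = put_pipeline_request("mark-unprocessed", "processed", False)
doc_req = index_request("files", "doc", "1", {"name": "a.txt"}, "mark-unprocessed")
```

Passing either request to `urllib.request.urlopen` would send it to the node. Note that the pipeline only runs for index requests that name it in the `pipeline` parameter.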

Sachin_Shaju · April 24, 2017, 4:33am

@shanec : We would like to have a field such as isProcessed that is initially set to false. After indexing, we would like to save the original file that was indexed to remote storage, do some other processing, and then update that field to true. We would like to do this for every index request to the cluster. Is ingest node the right option here, or are there other suitable options?

Sachin_Shaju · April 24, 2017, 8:03am

@shanec : Is there any way we can set an ingest processor/pipeline config for all indices? Or at least set it as a setting during index creation so that it is used on every index request?

Sachin_Shaju · April 24, 2017, 8:33am

@dadoonet : Any help or suggestions on this ?

shanec · April 24, 2017, 4:04pm

If you're trying to actually do something like move a file around, I'd recommend you do this outside of Elasticsearch, in an upstream process. Otherwise you have to give every Elasticsearch node access to wherever the files come from and go to, including dealing with Java security manager issues, etc. Elasticsearch isn't really meant for that. So I'd do that business logic in an upstream process and then use the _update API to set the isProcessed field once you've finished doing whatever you need to do.
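The upstream flow described above could look roughly like the following Python sketch: index the document with isProcessed set to false, run the business logic outside Elasticsearch, then send a partial update through the _update API. The node address, the index/type/id values, and the `process` callback are hypothetical; the URL layout assumes the 5.x style `/{index}/{type}/{id}/_update` used by the documentation linked in this thread.

```python
import json
from urllib.request import Request, urlopen

# Assumption: a single local node; adjust to your cluster.
ES_URL = "http://localhost:9200"
JSON_HEADERS = {"Content-Type": "application/json"}


def index_request(index, doc_type, doc_id, doc):
    """Build a request that indexes `doc` with isProcessed initially false."""
    body = dict(doc, isProcessed=False)
    return Request(
        url=f"{ES_URL}/{index}/{doc_type}/{doc_id}",
        data=json.dumps(body).encode("utf-8"),
        headers=JSON_HEADERS,
        method="PUT",
    )


def mark_processed_request(index, doc_type, doc_id):
    """Build a partial-update request that flips isProcessed to true."""
    body = {"doc": {"isProcessed": True}}
    return Request(
        url=f"{ES_URL}/{index}/{doc_type}/{doc_id}/_update",
        data=json.dumps(body).encode("utf-8"),
        headers=JSON_HEADERS,
        method="POST",
    )


def send_to_es(request):
    """Send a request to the node and return the raw response body."""
    with urlopen(request) as response:
        return response.read()


def index_and_process(index, doc_type, doc_id, doc, process, send=send_to_es):
    """Index, run the upstream business logic, then mark the document processed."""
    send(index_request(index, doc_type, doc_id, doc))
    process(doc)  # hypothetical callback, e.g. copy the original file to remote storage
    send(mark_processed_request(index, doc_type, doc_id))
```

The `send` parameter is only there so the flow can be exercised without a running cluster. If `process` raises, the update is never sent and the document keeps isProcessed set to false, which makes unfinished work easy to find with a query on that field.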

Sachin_Shaju · April 25, 2017, 6:05am

OK. Thank you for your suggestion.

system · May 23, 2017, 6:13am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
