I am using Logstash to pull data from an Oracle database, run some validation, and index the results into Elasticsearch.
I have to update the existing document in the index if the "ID" already exists. For that, I am using the Elasticsearch filter to check whether the ID exists and to pull back only one document if it does.
Everything works as expected, but because of bulk requests we are missing the updates for some events.
Does Logstash use different threads to process each event? Or
how should I handle these events with the Elasticsearch filter plugin?
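I'm not sure if the approach below is the best option, but it could work.
You could use your ID field as the document's _id. You can then use the Logstash Elasticsearch output with action set to update. When indexing, Elasticsearch checks whether a document with that _id exists: if it does, it is updated; if it doesn't, it can be created through an upsert. As far as I know, that last part requires doc_as_upsert to be enabled on the output, since a plain update against a missing document fails.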
Like I said, I'm not 100% sure this is the best option; someone else might have a better solution.
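For illustration, here is a rough sketch of what such an output could look like. It is only a sketch under assumptions: the host URL and the index name my_index are placeholders, and it assumes the event carries the Oracle key in a field called id; none of these come from the original pipeline.

```
output {
  elasticsearch {
    hosts         => ["http://localhost:9200"]  # assumption: local single-node cluster
    index         => "my_index"                 # hypothetical index name
    document_id   => "%{id}"                    # assumption: the event field holding the ID is called "id"
    action        => "update"                   # update the document with this _id
    doc_as_upsert => true                       # if no such document exists, create it from the event
  }
}
```

With the _id derived from the ID field, the existence check happens in Elasticsearch at write time rather than in a filter lookup beforehand, so it should no longer depend on whether an earlier event for the same ID has already been flushed in a bulk request.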
I think you're overcomplicating your output. You can just have a single output with action set to update. With upsert enabled, this will also index any new documents.