Handle concurrency while updating different records of the same ID from a JSON file

If there are multiple records that contain a unique record identifier, then you can combine them using an aggregate filter. This requires that every event go through a single pipeline worker thread, so that they are all processed by a single filter instance; see the sketch below.
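As a minimal sketch, assuming the unique identifier is in a field called `employee_id` (the field name and timeout value are assumptions for illustration, not from the original question):

```
filter {
  aggregate {
    # Group events that share the same identifier into one map.
    # "employee_id" is an assumed field name.
    task_id => "%{employee_id}"
    code => "
      map.merge!(event.to_hash)
      event.cancel
    "
    # Flush the combined map as a single event once no new records
    # for this id have arrived within the timeout (value assumed).
    push_map_as_event_on_timeout => true
    timeout => 30
    timeout_task_id_field => "employee_id"
  }
}
```

Remember to run with `pipeline.workers: 1` (or `-w 1` on the command line); otherwise events with the same id can be routed to different worker threads and the aggregation will produce incomplete documents.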

If the records for an employee all contain disparate data, then you could have elasticsearch combine them. That is, if there are records for languages, addresses, shoe size, age, etc., but no single record ever contains both shoe size and address, then you can have logstash write out a file of updates that use doc_as_upsert and feed that file to elasticsearch using curl (or Invoke-WebRequest on Windows), as sketched below. A little more detail here.
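For example, the file logstash writes could contain newline-delimited bulk update actions like the following (the index name, document ids, and field values are made up for illustration; `doc_as_upsert` tells elasticsearch to create the document if it does not exist, and to merge the partial doc into it if it does):

```
{ "update": { "_index": "employees", "_id": "1001" } }
{ "doc": { "shoe_size": 10 }, "doc_as_upsert": true }
{ "update": { "_index": "employees", "_id": "1001" } }
{ "doc": { "address": "221B Baker Street" }, "doc_as_upsert": true }
```

Then feed the file to the _bulk endpoint (note that the bulk body must end with a newline):

```
curl -s -H 'Content-Type: application/x-ndjson' \
     -XPOST 'http://localhost:9200/_bulk' --data-binary @updates.ndjson
```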

Given that I implemented that using a file output and curl, I have to assume that having an elasticsearch output do the equivalent doc_as_upsert did not work the way I wanted it to.
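For reference, the in-pipeline equivalent would be something like the following sketch (again assuming a hypothetical `employee_id` field; per the note above, this route may not behave the way you want):

```
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "employees"
    # Use the record identifier as the document id so that partial
    # records update the same document ("employee_id" is assumed).
    document_id => "%{employee_id}"
    action => "update"
    doc_as_upsert => true
  }
}
```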
