Handle concurrency while updating different records of the same ID from a JSON file

If there are multiple records that contain a unique record identifier, then you can combine them using an aggregate filter. This requires that every event go through a single pipeline worker thread, so that they are all processed by a single filter instance; see the sketch below.
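As a minimal sketch, assuming the unique identifier is in a field called `employee_id` (the field name and timeout value are assumptions for illustration, not from the original question):

```
filter {
  aggregate {
    # Group events that share the same identifier into one map.
    # "employee_id" is an assumed field name.
    task_id => "%{employee_id}"
    code => "
      map.merge!(event.to_hash)
      event.cancel
    "
    # Flush the combined map as a single event once no new records
    # for this id have arrived within the timeout (value assumed).
    push_map_as_event_on_timeout => true
    timeout => 30
    timeout_task_id_field => "employee_id"
  }
}
```

Remember to run with `pipeline.workers: 1` (or `-w 1` on the command line); otherwise events with the same id can be routed to different worker threads and the aggregation will produce incomplete documents.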

If the records for an employee all contain disparate data, then you could have elasticsearch combine them. That is, if there are records for languages, addresses, shoe size, age, etc., but no single record ever contains both shoe size and address, then you can have logstash write out a file of updates that use doc_as_upsert and feed that file to elasticsearch using curl (or Invoke-WebRequest on Windows), as sketched below. A little more detail here.
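For example, the file logstash writes could contain newline-delimited bulk update actions like the following (the index name, document ids, and field values are made up for illustration; `doc_as_upsert` tells elasticsearch to create the document if it does not exist, and to merge the partial doc into it if it does):

```
{ "update": { "_index": "employees", "_id": "1001" } }
{ "doc": { "shoe_size": 10 }, "doc_as_upsert": true }
{ "update": { "_index": "employees", "_id": "1001" } }
{ "doc": { "address": "221B Baker Street" }, "doc_as_upsert": true }
```

Then feed the file to the _bulk endpoint (note that the bulk body must end with a newline):

```
curl -s -H 'Content-Type: application/x-ndjson' \
     -XPOST 'http://localhost:9200/_bulk' --data-binary @updates.ndjson
```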

Given that I implemented that using a file output and curl, I have to assume that having an elasticsearch output do the equivalent doc_as_upsert did not work the way I wanted it to.
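For reference, the in-pipeline equivalent would be something like the following sketch (again assuming a hypothetical `employee_id` field; per the note above, this route may not behave the way you want):

```
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "employees"
    # Use the record identifier as the document id so that partial
    # records update the same document ("employee_id" is assumed).
    document_id => "%{employee_id}"
    action => "update"
    doc_as_upsert => true
  }
}
```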
