Reconcile data in Elasticsearch using a Logstash filter or output plugin

vvavad · July 18, 2024, 6:00am

I have an Elasticsearch index (index1) whose documents have this structure:

{
"id": "id1", 
"addresses": [
{"address": "a11","address_2": "a21","city": "c1","state": "s1","zip": "12345","phone": "9191919191"}
], 
"field1": "f1",  "field2": "f2", "field3": "f3" 
}

I have a Logstash job/pipeline that ingests data from a JSON file and pushes it to another Elasticsearch index (index2). The data in the JSON file looks like this:

{
"id": "a1", 
"addresses": [
{"address": "a11","address_2": "a21","city": "c1","state": "s1","zip": "12345","phone": "9191919191"},
{"address": "a21","address_2": "a22","city": "c2","state": "s2","zip": "12346","phone": "9191919192"}
], 
"field5": [{"f51", "f52",}], "field6": "f6" 
}

I want to know whether a Logstash pipeline can read the JSON file and update the data in the existing Elasticsearch index (index1), instead of writing the data to a new index (index2).
My second question: is there a way to reconcile data inside the Logstash job/pipeline? The reconciliation would be between the addresses already in the Elasticsearch index (index1) and the addresses coming from the JSON file, matched on the unique field "id", which is also the document ID.

strawgate · August 8, 2024, 5:30pm

Setting the action on the Elasticsearch output to index will update the document (by id) in Elasticsearch if it finds a document already exists and will create a new document if it finds that it does not already exist.

vvavad · August 9, 2024, 12:27pm

Thanks for the response.
The challenge is that the documents in the existing Elasticsearch index (index1) and the data in the incoming JSON file have different structures. I only want to add or update a few fields in the index.
What happens now is that the document for a given id is replaced with the data from the JSON file, so the original data in index1 for that id is lost.
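
To make the difference concrete, here is a minimal Python sketch of the two behaviours, using an in-memory dictionary as a stand-in for index1. The function names `index_action` and `partial_update` are hypothetical and only model the semantics; they are not Logstash or Elasticsearch calls. In the Logstash Elasticsearch output, the second behaviour roughly corresponds to setting action to update together with doc_as_upsert, assuming document_id is set to the event's id. Note that this sketch merges only top-level fields.

```python
# In-memory stand-in for index1, keyed by document id (illustrative data only).
index1 = {
    "id1": {"id": "id1", "field1": "f1", "field2": "f2", "field3": "f3"},
}


def index_action(store, doc_id, incoming):
    """Model a full replace: the stored document becomes the incoming one."""
    store[doc_id] = dict(incoming)


def partial_update(store, doc_id, incoming):
    """Model a partial update with upsert: merge top-level fields into the
    stored document, creating the document if it does not exist yet."""
    merged = dict(store.get(doc_id, {}))
    merged.update(incoming)
    store[doc_id] = merged


incoming = {"id": "id1", "field6": "f6"}

# Work on copies so both behaviours start from the same stored state.
replaced = {k: dict(v) for k, v in index1.items()}
index_action(replaced, "id1", incoming)

updated = {k: dict(v) for k, v in index1.items()}
partial_update(updated, "id1", incoming)

print(replaced["id1"])  # field1..field3 are gone
print(updated["id1"])   # field1..field3 are kept, field6 is added
```

With a partial update of this kind, a field such as "addresses" that appears in the incoming data would still overwrite the stored list as a whole, so reconciling the two address lists needs a separate merge step.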

strawgate · August 9, 2024, 1:46pm

You could use the Elasticsearch filter plugin to query for the existing document and add fields to it before sending it to the output. See "Elasticsearch filter plugin" in the Logstash Reference [8.15].

i.e.

1. Read the JSON object.
2. Query Elasticsearch for the existing document by ID.
3. Add a couple of fields from the JSON object to the document returned by Elasticsearch.
4. Output the updated document to Elasticsearch.
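
The merge in the middle of those steps can be sketched in Python as follows. This is only an illustration of the reconciliation logic, not pipeline code: the function names are hypothetical, two addresses are assumed to be duplicates only when every field matches exactly, and the example assumes the incoming record carries the same id as the stored document.

```python
def address_key(addr):
    """Build a hashable key from all fields of an address (assumption:
    two addresses are the same only if every field is identical)."""
    return tuple(sorted(addr.items()))


def reconcile_addresses(existing, incoming):
    """Union of both address lists, existing entries first, duplicates dropped."""
    seen = set()
    merged = []
    for addr in existing + incoming:
        key = address_key(addr)
        if key not in seen:
            seen.add(key)
            merged.append(addr)
    return merged


def reconcile(existing_doc, incoming_doc, fields_to_copy):
    """Start from the stored document, merge addresses, and copy only the
    selected fields from the incoming record."""
    result = dict(existing_doc or {})  # no stored document: start empty
    result["addresses"] = reconcile_addresses(
        result.get("addresses", []), incoming_doc.get("addresses", [])
    )
    for name in fields_to_copy:
        if name in incoming_doc:
            result[name] = incoming_doc[name]
    return result


existing_doc = {
    "id": "id1",
    "addresses": [
        {"address": "a11", "address_2": "a21", "city": "c1",
         "state": "s1", "zip": "12345", "phone": "9191919191"},
    ],
    "field1": "f1", "field2": "f2", "field3": "f3",
}
incoming_doc = {
    "id": "id1",
    "addresses": [
        {"address": "a11", "address_2": "a21", "city": "c1",
         "state": "s1", "zip": "12345", "phone": "9191919191"},
        {"address": "a21", "address_2": "a22", "city": "c2",
         "state": "s2", "zip": "12346", "phone": "9191919192"},
    ],
    "field6": "f6",
}

merged_doc = reconcile(existing_doc, incoming_doc, fields_to_copy=["field6"])
print(len(merged_doc["addresses"]))  # the shared address appears once
```

Inside a Logstash pipeline, the same logic would most likely live in a ruby filter placed after the Elasticsearch filter, operating on the fields that filter copied onto the event.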

vvavad · August 10, 2024, 11:13am

Thanks. This is what I am also trying. Will keep this thread updated once I achieve this.
