How to merge JSON documents when ingesting?

colinphillips · October 24, 2017, 4:13pm

I'm new to Logstash and Beats, and I'm wondering how best to go about transforming and merging two JSON documents. Here's an example of what I'm trying to do:

Document 1:

{
  "id": 1234,
  "things": [
    {
      "id": "1234",
      "lat": 45.12344,
      "long": 0.333232
    },
    {
      "id": "1234",
      "lat": 45.3322,
      "long": 0.456543
    }
  ]
}

Document 2:

{
  "id": 1234,
  "newField": 4321
}

What I want to go into Elasticsearch:

{
  "id": 1234,
  "newField": 4321,
  "things": [
    {
      "lat": 45.12344,
      "long": 0.333232
    },
    {
      "lat": 45.3322,
      "long": 0.456543
    }
  ]
}

So I have two documents at the source (a REST API) that I want to merge when their "id" fields match, ending up with a single document that contains elements of both. In addition, I want to drop the duplicate (and redundant) "id" fields from the first document.

Question:
Architecturally, what is the best way to go about this? Should I pre-process the documents before they hit Logstash (maybe with a custom Beat?), or can I handle this case with the aggregate and json plugins directly in Logstash? Or should I build a custom (specialized) plugin for Logstash?

Jack_Judge · October 24, 2017, 5:58pm

The Elasticsearch output plugin has a couple of options that will help you: "upsert" and "document_id".
You're using the "id" field as the basis for merging, so use that field as the "document_id" in Logstash.

Post your updates to Elasticsearch as upserts. If no document exists with the id you're using, one will be created; if a document with that id does exist, the upsert will update it. You choose which fields to post, so you can leave out or overwrite extraneous data.
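
To make the upsert behaviour described above concrete, here is a minimal in-memory sketch in JavaScript. It is not Logstash or Elasticsearch code: `upsertById` and `store` are hypothetical names, and the model only merges top-level fields, which is a simplifying assumption.

```javascript
// Hypothetical in-memory model of "update the document, or create it if
// missing", keyed by the document's id.
// Assumption: only top-level fields are merged; a field that is posted
// again simply replaces the stored value.
function upsertById(store, doc) {
  const key = String(doc.id);
  const existing = store.get(key) || {}; // empty object means "create"
  const updated = { ...existing, ...doc }; // later fields win
  store.set(key, updated);
  return updated;
}

// Two partial documents with the same id end up as one stored document.
const store = new Map();
upsertById(store, { id: 1234, things: [{ lat: 45.12344, long: 0.333232 }] });
const stored = upsertById(store, { id: 1234, newField: 4321 });
console.log(JSON.stringify(stored, null, 2));
```

The order in which the two documents arrive does not matter here, because each one contributes different fields under the same id.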

colinphillips · October 25, 2017, 6:33pm

Thanks, Jack_Judge. I can see how to make that work, but I'm probably going to go another way: my API is OAuth2-authenticated, and I can't see a way to make that work directly with Logstash. I'm planning to write a Node.js app to authenticate with the API, and once I'm doing that, the JSON transformation and aggregation I'm after will be trivial in JavaScript.
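
For illustration, the JavaScript transformation mentioned above might look roughly like the sketch below. `mergeDocuments` is a hypothetical helper, not code from the original poster, and it assumes the two documents have already been fetched from the API and parsed.

```javascript
// Sketch: merge two documents that share an "id" and drop the redundant
// nested "id" from each entry in "things".
// Assumption: ids are compared as strings, since the example mixes the
// number 1234 with the string "1234".
function mergeDocuments(doc1, doc2) {
  if (String(doc1.id) !== String(doc2.id)) {
    throw new Error("documents do not share the same id");
  }
  // Copy every entry in "things" without its "id" field.
  const things = (doc1.things || []).map(({ id, ...rest }) => rest);
  // Fields from doc2 are added on top of doc1; "things" is the cleaned copy.
  return { ...doc1, ...doc2, things };
}

// Usage with the two documents from the question.
const document1 = {
  id: 1234,
  things: [
    { id: "1234", lat: 45.12344, long: 0.333232 },
    { id: "1234", lat: 45.3322, long: 0.456543 },
  ],
};
const document2 = { id: 1234, newField: 4321 };
console.log(JSON.stringify(mergeDocuments(document1, document2), null, 2));
```

The function returns a new object and leaves both inputs untouched, which keeps it easy to test separately from the authentication code.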

