Hello all, I am trying to get data from MongoDB to OpenSearch, and this is our pipeline:
MongoDB ==> Kafka source connector ==> Kafka topic ==> Logstash ==> OpenSearch
The problem is that when MongoDB documents are written into the Kafka topic, they arrive in extended JSON format: an ISODate field gets written as a nested structure with a $date key, a Long as a nested structure with $numberLong, and so on.
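For illustration, a document with an ObjectId, a date, and a long might land in the topic looking like this (field names here are invented; the wrapper shapes follow MongoDB's canonical extended JSON):

```json
{
  "_id": { "$oid": "507f1f77bcf86cd799439011" },
  "createdAt": { "$date": { "$numberLong": "1700000000000" } },
  "viewCount": { "$numberLong": "42" }
}
```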
As a result, OpenSearch / Elasticsearch now also stores the data in this extended JSON format, which makes consuming it very difficult.
Is there a way Logstash or OpenSearch can handle this internally through some configuration, without impacting performance too much? The data we are processing is generally huge, with hundreds of lines and multiple layers of nested JSON, so running a separate script to remove or flatten these wrappers does not seem like an optimal solution.
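One option I have been considering is doing the unwrapping inside the pipeline itself, e.g. in a Logstash `ruby` filter. Below is a minimal standalone Ruby sketch of that idea; the set of wrappers handled and the type conversions are my own assumptions, not a complete extended JSON mapping:

```ruby
require 'time'

# Recursively unwrap MongoDB extended JSON wrappers such as
# {"$numberLong" => "42"} or {"$date" => {"$numberLong" => "..."}}
# into plain JSON values. Sketch only: covers a few common wrappers,
# not the full extended JSON type set.
def unwrap_extended_json(value)
  case value
  when Hash
    if value.size == 1 && value.keys.first.start_with?('$')
      k, v = value.first
      case k
      when '$numberLong', '$numberInt'
        v.to_i
      when '$numberDouble', '$numberDecimal'
        v.to_f
      when '$oid'
        v
      when '$date'
        # Canonical form nests epoch millis in $numberLong;
        # relaxed form is already an ISO-8601 string.
        if v.is_a?(Hash) && v.key?('$numberLong')
          Time.at(v['$numberLong'].to_i / 1000.0).utc.iso8601
        else
          v
        end
      else
        # Unknown wrapper: leave the key, recurse into the value.
        value.transform_values { |x| unwrap_extended_json(x) }
      end
    else
      value.transform_values { |x| unwrap_extended_json(x) }
    end
  when Array
    value.map { |x| unwrap_extended_json(x) }
  else
    value
  end
end
```

The same logic could live in a Logstash `ruby` filter's `code =>` block, applied to the parsed event. Alternatively, if the connector version supports it, the MongoDB Kafka source connector's `output.json.formatter` setting (e.g. `com.mongodb.kafka.connect.source.json.formatter.SimplifiedJson`) may be able to emit plain JSON upstream, so the topic never contains the wrappers at all.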