Hello all, I am trying to get data from MongoDB to OpenSearch, and this is our pipeline:
MongoDB ==> Kafka source connector ==> Kafka topic ==> Logstash ==> OpenSearch
The problem is that when MongoDB documents are written into the Kafka topic, they arrive in extended JSON format: an ISODate field gets written as a nested structure with a $date key, a Long as a nested structure with $numberLong, and so on.
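For illustration, a document with an ObjectId, a date, and a long might land in the topic looking like this (field names here are invented; the wrapper shapes follow MongoDB's canonical extended JSON):

```json
{
  "_id": { "$oid": "507f1f77bcf86cd799439011" },
  "createdAt": { "$date": { "$numberLong": "1700000000000" } },
  "viewCount": { "$numberLong": "42" }
}
```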
As a result, OpenSearch / Elasticsearch now also stores the data in this extended JSON format, which makes consuming it very difficult.
Is there a way Logstash or OpenSearch can handle this internally through some configuration, without impacting performance too much? The data we are processing is generally huge, with hundreds of lines and multiple layers of nested JSON, so running a separate script to remove or flatten these wrappers does not seem like an optimal solution.
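One option I have been considering is doing the unwrapping inside the pipeline itself, e.g. in a Logstash `ruby` filter. Below is a minimal standalone Ruby sketch of that idea; the set of wrappers handled and the type conversions are my own assumptions, not a complete extended JSON mapping:

```ruby
require 'time'

# Recursively unwrap MongoDB extended JSON wrappers such as
# {"$numberLong" => "42"} or {"$date" => {"$numberLong" => "..."}}
# into plain JSON values. Sketch only: covers a few common wrappers,
# not the full extended JSON type set.
def unwrap_extended_json(value)
  case value
  when Hash
    if value.size == 1 && value.keys.first.start_with?('$')
      k, v = value.first
      case k
      when '$numberLong', '$numberInt'
        v.to_i
      when '$numberDouble', '$numberDecimal'
        v.to_f
      when '$oid'
        v
      when '$date'
        # Canonical form nests epoch millis in $numberLong;
        # relaxed form is already an ISO-8601 string.
        if v.is_a?(Hash) && v.key?('$numberLong')
          Time.at(v['$numberLong'].to_i / 1000.0).utc.iso8601
        else
          v
        end
      else
        # Unknown wrapper: leave the key, recurse into the value.
        value.transform_values { |x| unwrap_extended_json(x) }
      end
    else
      value.transform_values { |x| unwrap_extended_json(x) }
    end
  when Array
    value.map { |x| unwrap_extended_json(x) }
  else
    value
  end
end
```

The same logic could live in a Logstash `ruby` filter's `code =>` block, applied to the parsed event. Alternatively, if the connector version supports it, the MongoDB Kafka source connector's `output.json.formatter` setting (e.g. `com.mongodb.kafka.connect.source.json.formatter.SimplifiedJson`) may be able to emit plain JSON upstream, so the topic never contains the wrappers at all.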