Hi there,
I'm using Logstash to read documents from a specific MongoDB collection and save them to Elasticsearch.
The nested fields end up in "log_entry" as one JSON string, starting with "BSON" or "ID", depending on the manipulations I do in the filter.
Here is an example of "log_entry" (there is a lot of text in it, so I'm only showing the structure, not the full content):
"log_entry": "{\"_id\": \"122ghgh1111\", \"msg_body\": {\"text_one\": 2, \"text_data\": [{\"position\": 1}, {....}, {...}]}}"
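To be explicit about the goal, this is roughly what I'd like the indexed document to look like instead of one string (field names are taken from the example above; "mongo_id" is the "_id" renamed by my filter, and the id value is just a placeholder):
"mongo_id": "122ghgh1111",
"msg_body": {
  "text_one": 2,
  "text_data": [
    { "position": 1 },
    { ... },
    { ... }
  ]
}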
Below is my config (I've tried different approaches, so I'll share them all; none of them does what I'd like to achieve):
logstashPipeline:
logstash.conf:
input {
  mongodb {
    uri => 'mongodb://user:password@host:port/<db_name>?directConnection=true'
    placeholder_db_dir => '/opt/logstash-mongodb'
    placeholder_db_name => 'logstash_sqlite.db'
    collection => 'my_collection'
    codec => "json"
  }
}
# First try - still saves the nested JSON as one string
filter {
  mutate {
    gsub => [ "log_entry", "=>", ": " ]
    rename => { "_id" => "mongo_id" }
    remove_field => ["_id"]
  }
  mutate {
    gsub => [ "log_entry", "BSON::ObjectId\('([0-9a-z]+)'\)", '"\1"' ]
    rename => { "_id" => "mongo_id" }
  }
}
# Second try - still saves the nested JSON as one string
filter {
  mutate {
    gsub => [ "log_entry", "=>", ": " ]
    rename => { "_id" => "mongo_id" }
    remove_field => ["_id"]
  }
  mutate {
    gsub => [ "log_entry", "BSON::ObjectId\('([0-9a-z]+)'\)", '"\1"' ]
    rename => { "_id" => "mongo_id" }
  }
  grok {
    match => { "log_entry" => "%{DATA:log_entry}" }
  }
  json {
    source => "log_entry"
    remove_field => ["log_entry"]
  }
}
output {
  elasticsearch {
    action => "index"
    index => "mongo_log_data"
    hosts => ["https://<host>:9200"]
    ssl => false
    ssl_certificate_verification => false
    user => "elastic"
    password => "some_password"
  }
}
Can you please help me build a correct, working filter that achieves what I need?
Thanks in advance.