How can I parse an array of objects using Logstash?

Hello everybody!

Does anybody know how I can parse an array of objects of this type: [{name:Cezar, age:23}, {name:Leon, age:22}, {name:Steven, age:33}]? Every object should be a separate event in the Discovery Section. Please take into consideration that this array is written on a single line (it comes from some logs extracted from Elasticsearch). I have read about and tried a lot of configurations, but I didn't solve the problem.

Thanks in advance!

Use a json filter with the target option set, then use a split filter.
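As a rough sketch of that suggestion, assuming the whole array arrives as one line in the [message] field (the default for a plain file input) and using "items" as a purely hypothetical target name:

```
filter {
  # Parse the JSON text in [message]. A top-level array cannot be merged
  # into the event root, which is why a target is set.
  # "items" is a made-up field name for illustration.
  json {
    source => "message"
    target => "items"
  }

  # Emit one event per element of the [items] array.
  split {
    field => "items"
  }
}
```

With this layout, each resulting event would carry one object under [items], for example [items][name] and [items][age].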


Seems to be an array of JSON objects.

Without more of the schema it's kind of hard to say, but it'll be either

filter {
  json {
    source => "[name_of_field]"
  }
}

or

filter {
  split {
    field => "[name_of_field]"
  }
}

Thank you guys for your responses! I don't understand what "[name_of_field]" refers to. I don't have anything before the [. My file contains a large JSON array of 30 MB which begins with [ and ends with ], with nothing before or after. If I want to edit this big file I use Notepad, but it works very, very slowly, so adding a field in front of the array would not be a good solution. Therefore, taking my example above, [{name:Cezar, age:23}, {name:Leon, age:22}, {name:Steven, age:33}], I want to use Logstash to get an index with 3 docs in Elasticsearch. Every doc should have these 2 columns: name and age. The first one with (Cezar, 23), the second one with (Leon, 22), etc.
Do you know more exactly how I could do this? Thanks in advance!

[name_of_field] is a reference to a field called name_of_field. You did not tell us what your field name is, and we have no way of knowing. If your field is called message then you would use [message] in those filters.
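To illustrate the field reference syntax a little further: nested fields are addressed by chaining brackets. The sketch below assumes the parsed objects ended up under a hypothetical [items] field (that name is not from this thread) and moves name and age to the top level of each event:

```
filter {
  mutate {
    # "[items][name]" refers to the field "name" nested inside "items".
    # "items" is a hypothetical field name used only for illustration.
    rename => {
      "[items][name]" => "name"
      "[items][age]"  => "age"
    }
  }
}
```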


Hi @Badger!

Thank you for your precious replies! I have read numerous topics on discuss.elastic.co and I saw you gave very important pieces of advice. I have succeeded in sending data to Elasticsearch for the example above where I had an array with 3 objects. In order to do that, I have used the following configuration in Logstash:

input {
  file {
    path => ["/home/..../json-file-name"]
    start_position => "beginning"
    sincedb_path => ["/home/..../sincedb"]
    codec => "json"
  }
}

filter {
  if [message] {
    drop { }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "setlogs-%{+YYYY.MM.dd}"
  }
  
  stdout {
    codec => rubydebug
  }
}

It works for my example with 3 objects. But when I try to do this for my JSON array, which is 30 MB and has almost 4300 objects, Logstash doesn't print anything. It is as if it is waiting for some input. I left Logstash running for 50 min., but when I saw there was no output I closed it. Do you have any idea what happened?

Interesting. If you have a json codec and it successfully parses the event, it does not set the [message] field. So this filter is dropping any events that were not successfully parsed. It is a really, really obscure way to achieve that, and will confuse a lot of people. I suggest you remove that and see what you get on stdout.
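If dropping unparsed events is still wanted later, a more explicit way to write it might look like the sketch below. It relies on the json codec tagging events it cannot parse with _jsonparsefailure, and it is only an approximation of the [message] test, not a tested replacement:

```
filter {
  # Drop only events the json codec failed to parse.
  # While debugging, leave this out so the failures show up on stdout.
  if "_jsonparsefailure" in [tags] {
    drop { }
  }
}
```

Unlike the [message] test, this would not discard successfully parsed objects that happen to contain a field named message.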

