Split Json Array Objects

Hi,
I want to split my JSON array into multiple events in Logstash. I tried the split filter with its field option, but it doesn't seem to work that way for objects nested inside arrays. Which plugin should I use in Logstash to process this data? The data format is given below:

{ "problems": [{
"Diabetes": [{
"medications": [{
"medicationsClasses": [{
"className": [{
"associatedDrug": [{
"name": "asprin",
"dose": "",
"strength": "500 mg"
}],
"associatedDrug#2": [{
"name": "somethingElse",
"dose": "",
"strength": "500 mg"
}]
}],
"className2": [{
"associatedDrug": [{
"name": "asprin",
"dose": "",
"strength": "500 mg"
}],
"associatedDrug#2": [{
"name": "somethingElse",
"dose": "",
"strength": "500 mg"
}]
}]
}]
}]
}]
}]
}

Expected result: 4 separate documents, containing respectively:

{ "problems": [{
"Diabetes": [{
"medications": [{
"medicationsClasses": [{
"className": [{
"associatedDrug": [{
"name": "asprin",
"dose": "",
"strength": "500 mg"
}]
}]
}]
}]
}]
}]
}

{ "problems": [{
"Diabetes": [{
"medications": [{
"medicationsClasses": [{
"className": [{
"associatedDrug#2": [{
"name": "somethingElse",
"dose": "",
"strength": "500 mg"
}]
}]
}]
}]
}]
}]
}

{ "problems": [{
"Diabetes": [{
"medications": [{
"medicationsClasses": [{
"className2": [{
"associatedDrug": [{
"name": "asprin",
"dose": "",
"strength": "500 mg"
}]
}]
}]
}]
}]
}]
}

{ "problems": [{
"Diabetes": [{
"medications": [{
"medicationsClasses": [{
"className2": [{
"associatedDrug#2": [{
"name": "somethingElse",
"dose": "",
"strength": "500 mg"
}]
}]
}]
}]
}]
}]
}

In other words, each associatedDrug should end up in its own document.
Thanks!

Hi

Which field do you want to split? What exactly do you want to get? Could you please provide a sample of the expected result for the data sample you posted?

Thanks.


I updated the expected outcome.

Please use the Preformatted text tool to format your content, or it will be unreadable.

{
  "this_is": "a readable json"
}

{
"this_is":"garbage"
}

Hi

It is not clear to me that you can easily get what you want, but, off the top of my head, I think you could split by the medicationsClasses field and then use an if statement on the existence of either the className or the className2 field to split again by that field. Something similar to this:

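    split {
      field => "[problems][Diabetes][medications][medicationsClasses]"
    }
    # Conditionals need the full field path, since split keeps the parent fields
    if [problems][Diabetes][medications][medicationsClasses][className] {
      split {
        field => "[problems][Diabetes][medications][medicationsClasses][className]"
      }
    }
    else if [problems][Diabetes][medications][medicationsClasses][className2] {
      split {
        field => "[problems][Diabetes][medications][medicationsClasses][className2]"
      }
    }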

You might have to first split by problems, and then, for each possible value of problems (Diabetes, etc...), split again for medicationsClasses as I suggested above.

Maybe you don't have to split so many times, only once for className and once for className2, but you'll need the if statements to decide when to split by one and when by the other.

Hope this helps

Mhmh I have a feeling you cannot do the first split since it'll find nothing at first in [problems][Diabetes][medications][medicationsClasses].

To do what you suggested, he would have to first split [problems]; then, if [problems][Diabetes] exists (not just [Diabetes], because when you split a field with Logstash it keeps the root field and nests the split values inside it), split [problems][Diabetes]; then, if [problems][Diabetes][medications] exists, split [problems][Diabetes][medications]; and so on.
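To illustrate why the splits have to go level by level, here is a rough plain-Ruby sketch of that behaviour. It is not the split filter's actual code: `lookup` and `split_field` are hypothetical helpers, the event is a plain Hash, and the sketch assumes a field path cannot pass through an array without an index. The real filter also handles strings and tags failures, which is left out here.

```ruby
# Resolve a field path such as ["problems", "Diabetes"] against a Hash event.
# Assumption: a path cannot pass through an array, so it resolves to nil there.
def lookup(event, path)
  path.reduce(event) { |node, key| node.is_a?(Hash) ? node[key] : nil }
end

# Simplified stand-in for a split: one copy of the event per array element,
# with the element stored back under the same full path.
def split_field(event, path)
  values = lookup(event, path)
  return [event] unless values.is_a?(Array)   # nothing to split at this path

  values.map do |value|
    copy = Marshal.load(Marshal.dump(event))  # deep copy of the whole event
    parent = lookup(copy, path[0...-1])
    parent[path.last] = value
    copy
  end
end

event = { "problems" => [{ "Diabetes" => [{ "id" => 1 }, { "id" => 2 }] }] }

# Splitting the nested path directly does nothing: [problems] is still an array.
untouched = split_field(event, ["problems", "Diabetes"])

# Splitting the outer field first makes the inner path reachable.
by_problem  = split_field(event, ["problems"])
by_diabetes = by_problem.flat_map { |e| split_field(e, ["problems", "Diabetes"]) }
```

After the second step each resulting event still carries the full `problems` / `Diabetes` nesting, which is why any conditional has to test the full path.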

That of course is not a scalable approach. If you want to apply that solution to an event with a slightly different internal structure, it'll break.

I'd probably go with a recursive solution on the top event using the ruby filter.
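As a rough idea of what such a recursive approach could look like, here is a plain-Ruby sketch that works on a Hash rather than on a Logstash event. `expand_branches` and `container?` are names made up for this example, and the rule that a hash with several nested keys is split into one document per key is an assumption based on the expected output posted above.

```ruby
require 'json'

# True for non-empty arrays and hashes, i.e. values worth descending into.
def container?(value)
  (value.is_a?(Array) || value.is_a?(Hash)) && !value.empty?
end

# Returns every single-branch variant of a nested Hash/Array structure.
def expand_branches(node)
  case node
  when Array
    return [node] if node.empty?
    # One variant per element variant, re-wrapped in a one-element array
    node.flat_map { |element| expand_branches(element).map { |variant| [variant] } }
  when Hash
    nested, scalars = node.partition { |_key, value| container?(value) }.map(&:to_h)
    return [node] if nested.empty?            # leaf object: keep as it is
    # One variant per nested key; scalar fields are copied into each variant
    nested.flat_map do |key, value|
      expand_branches(value).map { |variant| scalars.merge(key => variant) }
    end
  else
    [node]
  end
end

sample = { "a" => [{ "x" => [{ "n" => 1 }], "y" => [{ "n" => 2 }] }] }
expand_branches(sample).each { |doc| puts JSON.generate(doc) }
# prints:
# {"a":[{"x":[{"n":1}]}]}
# {"a":[{"y":[{"n":2}]}]}
```

Inside a ruby filter, the idea would be to run something like this on `event.to_hash`, emit one new event per returned document (for instance through `new_event_block` in inline code) and cancel the original. That wiring is untested here, so treat it as a starting point only.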

Hi @Fabio-sama

You are absolutely right. I was also thinking that, with the proper if statements in place and assuming he receives one problem per event, @Atul_Gunjal might be able to split only once, by either className or className2. Of course this is not a general solution, and it requires that he knows in advance all the possible problems.

The ruby{} filter option did cross my mind, but my ruby skills, on a scale from 1 to 10, are close to -5 :wink: If he is good at ruby he might want to try that approach, but I believe it can be done with just native Logstash for this particular case. If a more general solution is desired, I agree with you that a ruby{} filter is most likely the way to go.

my ruby skills, on a scale from 1 to 10, are close to -5 :wink:

:rofl: :rofl:

Well, I usually try to adopt solutions that are as general as possible, since I don't like being tied to the shape of one specific log.

Though, if he only needs it for this very specific case, he could do something very similar to what you wrote, like:

split {
  field => "[problems]"
}

split {
  field => "[problems][Diabetes]"
}

split {
  field => "[problems][Diabetes][medications]"
}

split {
  field => "[problems][Diabetes][medications][medicationsClasses]"
}

...and so on ...

Anyway, should I find 10 minutes today, I'll write a general ruby filter to accomplish such a task. Also because it might be useful to future readers as well.


This method works for me...Thanks