Getting nested object with json filter

Hi all,

I've got problem when trying to get some nested fields in a json object with json filter.

My field look like this

{
"total_count" : 3,
"orders" : [
{
"marketplace_order_id" : "malikilledme3",
"order_lines" : [
{
"order_line_status_pending" : "PENDING SHIPMENT",
"order_line_status_current" : "TO SHIP",
"marketplace_order_line_id" : "malikilledme3-1"
},
{
"order_line_status_pending" : "PENDING SHIPMENT",
"order_line_status_current" : "TO SHIP",
"marketplace_order_line_id" : "malikilledme3-0"
}
]
},

Original field is [http][response][body][content], I want to get value from [http][response][body][content][orders][marketplace_order_id]

I've extracted top level field , but "orders" value is a nested json, so I tried to parse it ith json filter too

json {
        source => "[http][response][body][content]"
        target => "[http][response][json]"
        }

json {
        source => "[http][response][json][orders]"
        target => "[http][response][json][orders_json]"
        }

But when I do this, I got a java class error exception.

2020-01-21T17:23:37,932][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:api_ws], :non_running_pipelines=>}
[2020-01-21T17:23:43,428][WARN ][logstash.filters.json ] Error parsing json {:source=>"[http][response][json][orders]", :raw=>[{"marketplace_order_id"=>"malikilledme3", "order_lines"=>[{"marketplace_order_line_id"=>"malikilledme3-1", "order_line_status_pending"=>"PENDING SHIPMENT", "order_line_status_current"=>"TO SHIP"}, {"marketplace_order_line_id"=>"malikilledme3-0", "order_line_status_pending"=>"PENDING SHIPMENT", "order_line_status_current"=>"TO SHIP"}]}, {"marketplace_order_id"=>"malikilledme_again", "order_lines"=>[{"marketplace_order_line_id"=>"malikilledme_again-1", "order_line_status_pending"=>"PENDING SHIPMENT", "order_line_status_current"=>"TO SHIP"}, {"marketplace_order_line_id"=>"malikilledme_again-0", "order_line_status_pending"=>"PENDING SHIPMENT", "order_line_status_current"=>"TO SHIP"}]}, {"marketplace_order_id"=>"ws1-3_test_validation1", "order_lines"=>[{"marketplace_order_line_id"=>"ws1-3_test_validation1-0", "order_line_status_pending"=>"PENDING SHIPMENT", "order_line_status_current"=>"SHIPPED"}, {"marketplace_order_line_id"=>"ws1-3_test_validation1-1", "order_line_status_pending"=>"PENDING REFUND", "order_line_status_current"=>"SHIPPED"}]}], :exception=>java.lang.ClassCastException}

Any idea ?
regards
thomas

The first json filter will have parsed all the nested JSON, so the second json filter is never going to work. Note that [http][response][json][orders] is an array, so you will need to access [http][response][json][orders][0].

Oh, ok thanks badger !
I couldn' t find how to deal with arrays in situation I want, how can I do to get all values of fields marketplace_order_line_id for example ?

Something like [http][response][json][orders][0][order_lines][all_entries][marketplace_order_line_id] ?

As @Badger said that is an array, that's why you're receiving that error (you're trying to parse and array as a json).

I'm afraid I didn't really understand what you want in output. You wanna get all values of fields marketplace_order_line_id and store them where, precisely?
You'll most likely need a ruby filter and we can help you write it, but we first need to know what you're trying to achieve.

How would you like your output doc to be structured? Paste here the desired output and it'll help us help you :wink:

hi @Fabio-sama,
thanks for answering

Let's take this simplified doc

{
"total_count" : 3,
"offset" : 0,
"limit" : 200,
"orders" : [
{
"marketplace_order_id" : "A"
},
{
"marketplace_order_id" : "B"
},
"marketplace_order_id" : "C"
}
]
}

What I want to extract in ES is something like this

orders.marketplace_order_id: [ "A", "B", "C" ]

or if not possible in the same field,

orders.marketplace_order_id.0 : "A"
orders.marketplace_order_id.1 : "B"
orders.marketplace_order_id.2 : "C"

(I was able to get all values with split filter, but split duplicates all document, and I just want to get all values of 1 field in the same document, so it is probably not goog way to do it for me.)

And what about order_lines_status_pending and order_lines_status_current?

I'm sorry but I gotta insist in asking you to post something like:

Logstash input:
...your FORMATTED input..

Desired output
...your FORMATTED output...

It makes everything easier for you to better understand what you want, for us to better understand what you want and help you achieve it and for future readers to better understand if they have your same problems.

Thanks :wink:

Sorry, I've edited previous post to have something more simple.

Ok so, admitting the following is your situation:

Logstash input:

{
  "total_count": 3,
  "offset": 0,
  "limit": 200,
  "orders": [
    {
      "marketplace_order_id": "A"
    },
    {
      "marketplace_order_id": "B"
    },
    {
      "marketplace_order_id": "C"
    }
  ]
}

Desired output:

{
  "total_count": 3,
  "offset": 0,
  "limit": 200,
  "orders": {
    "marketplace_order_id": [ "A", "B", "C" ]
  }
}

This should do what you want:

  ruby {
    code => "
      unless (orders = event.get('orders')).nil?
        orders = { 'marketplace_order_id': orders.map{ |o| o['marketplace_order_id'] }.compact }
        event.set('orders', orders)
      end
    "
  }

Thanks so much, It works like a charm !

1 Like

No problem!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.