Ruby help with array of json objects

hi

Sorry if this is a repeat question. In my document there are a few fields that are arrays of JSON objects. I want to extract a value from one of the objects in such an array and assign it to a new field. To do that I'm using Ruby code to convert the array of JSON objects into flat JSON, copying it to a temp field, and then trying to extract the value from the temp field, but it's not working.

Here is my logstash conf.

input {
  beats {
    port => 5044
  }
}
filter {
  json {
    source => "message"
  }
  ruby {
    code => '
      event.set("request_http_headers_tmp", event.get("request_http_headers").to_json)
    '
  }
  if "transaction-id" in [request_http_headers_tmp] {
    mutate {
      add_field => { "transaction-id" => "%{request_http_headers_tmp.transaction-id}" }
    }
  }
  mutate {
    remove_field => [ "@version", "fields", "tags", "host", "agent", "log", "message", "version", "request_http_headers" ]
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

Here is my sample input

{"request_http_headers":[{"transaction-id": "1234" },{"raj":"test"}]}
{"request_http_headers":[{"TraceId":"9d912b9aedf0f0d6"},{"Request-Id":"a58db2fb"},{"Sampled":"0"},{"SpanId":"109996b844d56118"},{"ParentSpanId":"afd752b39e"},{"Via":"1.1 AgAAALgHxKA-"},{"X-Client-IP":"1.1.1.1"},{"transaction-id":"ecbc4be05f0c88ff00c11f61"}]}

Here is my output

"request_http_headers_tmp" => "[{\"transaction-id\":\"1234\"},{\"raj\":\"test\"}]",
                     "ecs" => {
    "version" => "1.0.1"
},
          "transaction-id" => "%{request_http_headers_tmp.transaction-id}",
              "@timestamp" => 2020-07-13T17:56:31.709Z
}
{
"request_http_headers_tmp" => "[{\"TraceId\":\"9d912b9aedf0f0d6\"},{\"Request-Id\":\"a58db2fb\"},{\"Sampled\":\"0\"},{\"SpanId\":\"109996b844d56118\"},{\"ParentSpanId\":\"afd752b39e\"},{\"Via\":\"1.1 AgAAALgHxKA-\"},{\"X-Client-IP\":\"1.1.1.1\"},{\"transaction-id\":\"ecbc4be05f0c88ff00c11f61\"}]",
                     "ecs" => {
    "version" => "1.0.1"
},
          "transaction-id" => "%{request_http_headers_tmp.transaction-id}",
              "@timestamp" => 2020-07-13T17:56:31.709Z
}

Any clue what I am doing wrong?

I have tried %{[request_http_headers_tmp][transaction-id]}, %[request_http_headers_tmp][transaction-id], and %[request_http_headers_tmp]%[transaction-id], but I just don't know how to make this work :disappointed:

%{[request_http_headers_tmp][transaction-id]} would be the correct syntax. But you converted request_http_headers_tmp to JSON. So it is a string, not a hash and you can not access any fields in it.
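To illustrate the point, here is a minimal plain-Ruby sketch run outside Logstash. The `headers` variable is an assumption standing in for what event.get("request_http_headers") would return for the first sample event.

```ruby
require 'json'

# Stand-in for event.get("request_http_headers") on the first sample event.
headers = [{ "transaction-id" => "1234" }, { "raj" => "test" }]

# to_json only serializes the structure; the result is a String.
encoded = headers.to_json

puts headers.class   # Array
puts encoded.class   # String

# The original array of hashes can still be navigated by index and key.
puts headers[0]["transaction-id"]   # 1234
```

Once the value is a String, a field reference like [request_http_headers_tmp][transaction-id] has nothing to descend into.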

So what would be a better approach here?

Either do not call to_json or do it after extracting the information you needed. You could also get these fields from the original request_http_headers. I'm not sure what exactly your goal is and why you are copying that field.

For your first sample event, the desired value is in [request_http_headers][0][transaction-id].

If it isn't always in the first entry, you have to loop over the array with Ruby to find it.
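A rough sketch of such a loop in plain Ruby follows. The helper name `find_header` and the shortened `sample` array are hypothetical, introduced only for illustration; they are not part of Logstash.

```ruby
# Hypothetical helper: returns the value stored under `key` in the first
# hash of `headers` that contains it, or nil if no entry has that key.
def find_header(headers, key)
  entry = Array(headers).find { |h| h.is_a?(Hash) && h.key?(key) }
  entry && entry[key]
end

# Shortened version of the second sample event, where the ID is the last entry.
sample = [
  { "TraceId" => "9d912b9aedf0f0d6" },
  { "Sampled" => "0" },
  { "transaction-id" => "ecbc4be05f0c88ff00c11f61" }
]

puts find_header(sample, "transaction-id")   # ecbc4be05f0c88ff00c11f61
```

Inside a ruby filter, the same lookup could presumably be applied to event.get("request_http_headers"), followed by an event.set for the new field only when the result is not nil.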

OK, let me try that approach. Thanks. I will update in a few.

But that won't work?!

I think what you wanted to do is merge an array of hashes in ruby. Then you'd have the desired structure. But that's too uncomfortable to google on a smartphone. So I'll leave that up to you :slight_smile:

Back at a computer for a minute. Sorry that I didn't give you a clear solution from the start. It took me a moment to understand your thought process. You wanted to change the structure of the header field to be able to access your ID that is in one of the array entries. to_json doesn't do that. It just encodes that very same structure as a JSON string. What you really want to do is this:

event.set('request_http_headers', event.get('request_http_headers').reduce({}, :merge!))

Then your ID should be accessible as [request_http_headers][transaction-id].

mutate{
  copy => { "[request_http_headers][transaction-id]" => "transaction-id" }
}

I haven't tried it. But this way it sounds logical to me :slight_smile:
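For anyone who wants to check the merge outside Logstash first, here is a small plain-Ruby sketch of what reduce({}, :merge!) does to the first sample. The `headers` and `merged` variables are assumptions standing in for the event field before and after the change.

```ruby
# Stand-in for event.get('request_http_headers') on the first sample event.
headers = [{ "transaction-id" => "1234" }, { "raj" => "test" }]

# Fold every single-key hash into one accumulator hash.
# merge! mutates the accumulator ({}), not the original array entries.
merged = headers.reduce({}, :merge!)

puts merged.inspect
puts merged["transaction-id"]   # 1234
```

If two entries ever share a key, the later one overwrites the earlier one, since that is how merge! resolves duplicates.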

The other thing is that transaction-id is not 100% guaranteed to be in the document. So will the following do the trick?

if [request_http_headers][transaction-id] {

mutate{
  copy => { "[request_http_headers][transaction-id]" => "transaction-id" }
}

}

Yes. You can build a condition. But copy can't do anything anyway if the field that should be copied does not exist :upside_down_face:
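A related case is an event where request_http_headers itself is missing: calling reduce on nil raises a NoMethodError in Ruby. A defensive variant of the merge could look like the sketch below; the helper name `merge_headers` is hypothetical and not part of Logstash.

```ruby
# Hypothetical helper: merges an array of hashes into one hash, returning
# an empty hash instead of raising when the input is nil or not an array.
def merge_headers(headers)
  return {} unless headers.is_a?(Array)
  headers.select { |h| h.is_a?(Hash) }.reduce({}, :merge!)
end

puts merge_headers(nil).inspect
puts merge_headers([{ "transaction-id" => "1234" }, { "raj" => "test" }])["transaction-id"]   # 1234
```

In the ruby filter this would mean checking the result of event.get before merging, rather than relying on the field always being present.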

Logstash is failing on this. I'm not an expert at Ruby :slight_smile:

event.set('request_http_headers', event.get('request_http_headers').reduce({}, :merge!))

So what does Logstash say? (Remember to use double quotes to wrap the code if you take it the way it is because I used single quotes in the code. Or change my code accordingly. I know that you did it the opposite way around in your original configuration.)

ruby {
  code => "event.set('request_http_headers', event.get('request_http_headers').reduce({}, :merge!))"
}

I'll need your error message because I don't currently have a test system at hand.

:exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, {, } at line 14, column 35 (byte 330) after filter {\n\n

Here is the config I'm using:

input {
  beats {
    port => 5044
  }
} filter {

             json {
                   source => "message"
                   }

             ruby {
               code => '
                       event.set("request_http_headers_tmp",event.get("request_http_headers").to_json)
                       event.set('request_http_headers1', event.get('request_http_headers').reduce({}, :merge))
                     '
            }
            if "transaction-id" in [request_http_headers_tmp]
           {

                 mutate{
                 #add_field => { "transaction-id" => "%{request_http_headers_tmp.transaction-id}" }
                 copy => { "[request_http_headers_tmp][transaction-id]" => "transaction-id" }
                }
           }

             mutate {
              remove_field => [ "@version","fields","tags","host","agent","log","message","version","request_http_headers","ecs" ]
               }
             }
output {
      stdout {
            codec => rubydebug
             }
}

Read my previous answer. That's exactly the quote problem I had predicted. It causes a syntax error. And remember for the future: the code extract in those syntax error messages in the log usually ends exactly at the position of the problem. So the message in your log was probably longer than what you posted and ended somewhere in the Ruby code (right?!)

Crap, sorry, I missed your previous reply. Yes, it's the quote issue. How genius of me, lol.

Thanks a lot!
