_grokparsefailure occurring even though the grok pattern succeeds in the grok debugger

Hi Team,

I'm using Logstash 6.8.3 and I'm trying to parse ES slow logs. My sample field is an ES source_query, which looks like this:

"source_query":{"from":0,"size":0,"post_filter":{"bool":{"must":[{"term":{" **someId** ":{"value":1234,"boost":1.0}}},{"bool":{"must":[{"term":{"indexedAttributes.some_id.long":{"value":1234,"boost":1.0}}},{"term":{"deleted":{"value":"false","boost":1.0}}},{"term":{"someGroupIds":{"value":121221,"boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"version":true,"_source":{"includes":["orderId"],"excludes":},"sort":[{"sortedAttributes.lastUpdatedOn.date":{"order":"desc","missing":"_last","unmapped_type":"keyword"}}]}

and the grok pattern I've used is below:

^{\"from\":%{INT},\"size\":(%{DATA:totalSize}),%{DATA}{\"someId\":{\"value\":%{INT:someId},%{DATA}}$

and it works perfectly in https://grokdebug.herokuapp.com/ (see the screenshot below), but in Logstash it returns

"tags" => [
    [0] "_grokparsefailure"
]

and I don't see the extracted fields totalSize and someId.

Thanks,
Vaseem

This part doesn't match your grok pattern.
Replace

^{\"from\":%{INT},\"size\":(%{DATA:totalSize}),%{DATA}{\"someId\":{\"value\":%{INT:someId},%{DATA}}$

With

^{%{QUOTEDSTRING}:%{INT},%{QUOTEDSTRING}:%{DATA:totalSize},%{DATA}{%{QUOTEDSTRING}:{%{QUOTEDSTRING}:%{INT:someId},%{DATA}}$

I used QUOTEDSTRING and removed the unnecessary parentheses.
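
For reference, here is a sketch of how that pattern would sit in a grok filter (assuming the JSON string is in a field named source_query, as in your later posts):

grok {
    match => { "source_query" => "^{%{QUOTEDSTRING}:%{INT},%{QUOTEDSTRING}:%{DATA:totalSize},%{DATA}{%{QUOTEDSTRING}:{%{QUOTEDSTRING}:%{INT:someId},%{DATA}}$" }
}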

Your input respects the JSON format; using the json plugin should be the easiest way to index your values.
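
For example, a minimal sketch of the json filter (assuming the JSON string is in a field named source_query):

filter {
    json {
        source => "source_query"
        target => "parsedJson"
    }
}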


Hey @Cad ,

Awesome, it worked, a big shout-out to you :slight_smile:

Thank you so much for helping me out!

Hey @Cad ,

Yes, you are right, I think I should use the json plugin. But from the entire JSON I'm interested in only one key's value, and that JSON path is not static in nature. How can I solve this, any thoughts?

Thanks in Advance!
Vaseem.

Hey @Cad ,

I've used the json filter, and I'm trying to extract another JSON object field value within it; it's basically like below:


However, I got an exception like the one below:

[2021-08-10T11:36:03,605][WARN ][logstash.filters.json    ] Error parsing json {:source=>"query", :raw=>{"bool"=>{"must"=>[{"term"=>{"someId"=>{"value"=>12345, "boost"=>0.1e1}}}, {"term"=>{"active"=>{"value"=>true, "boost"=>0.1e1}}}, {"term"=>{"deleted"=>{"value"=>false, "boost"=>0.1e1}}}], "adjust_pure_negative"=>true, "boost"=>0.1e1}}, :exception=>java.lang.ClassCastException: org.jruby.RubyHash cannot be cast to org.jruby.RubyIO}

My configuration look like this

if "_grokparsefailure" in [tags]     {
        json {
            source => "source_query"
            target => "parsedJson"
        }
        json {
           source => "query"
            #add_field => { "json_someId" => "%{[query][bool][must[0]][term][busId][value]}" }
        }
}

I would be grateful if you could help me with this.

Thanks,
Vaseem

Hey @Cad ,
Finally, I found the fix for the exception above. I used the below codec block in the input and the issue got fixed.

codec => json {
    charset => "UTF-8"
}
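
For completeness, here is a sketch of how the codec sits on the input (the file path here is hypothetical):

input {
    file {
        # hypothetical path to the ES slow log
        path => "/var/log/elasticsearch/*_index_search_slowlog.log"
        codec => json {
            charset => "UTF-8"
        }
    }
}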

However, I'm unable to extract the key's value from the parsed JSON. This is how I'm trying to extract it; can you please put me on the right path?

json {
    source => "query"
    add_field => { "json_someId" => "%{[query][bool][must[0]][term][busId][value]}" }
}

Hi,

According to the data you showed us, term has a nested field named someId, not busId.
And to access a specific array position, you have to use [must][0] instead of [must[0]].

Here is the final line:
add_field => { "json_someId" => "%{[query][bool][must][0][term][someId][value]}" }

Hey @Cad ,

I tried the above code, but it doesn't throw any error and it doesn't add the requested field. Any thoughts, please?

Can you show us the complete conf file?


Hey @Cad ,

It's fixed. I used the code below and it worked. Thank you so much for your kind and quick response :slight_smile:

if "_grokparsefailure" in [tags]     {
        json {
            source => "source_query"
            target => "parsedJson"
            add_field => { "someId" => "%{[parsedJson][query][bool][must][0][term][someId][value]}" 
        }
    }

Hey @Cad ,

I'm trying to calculate the time taken by the json filter to parse, but the given field is not being added to the output. Can you please see if there is anything wrong with my code?

ruby {
    code => "event.set('before_json_processed_at', Time.now().nsec)"
}
json {
    source => "source_query"
    target => "parsedJson"
    add_field => { "someBusid" => "%{[parsedJson][query][bool][must][0][term][someBusid][value]}" }
}
ruby {
    code => 'event.set("after_json_processed_at", Time.now().nsec)
             duration_in_n = event.get("after_json_processed_at") - event.get("before_json_processed_at")
             event.set("duration_in_nanos", duration_in_n)'
}
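
Note that Time.now().nsec returns only the nanosecond fraction within the current second, so the subtraction can go negative when the two calls straddle a second boundary. A sketch of the same timing using a monotonic clock instead (same field names assumed):

ruby {
    # the monotonic clock is unaffected by wall-clock jumps; value is seconds as a Float
    code => "event.set('before_json_processed_at', Process.clock_gettime(Process::CLOCK_MONOTONIC))"
}
json {
    source => "source_query"
    target => "parsedJson"
}
ruby {
    code => 'elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - event.get("before_json_processed_at")
             event.set("duration_in_nanos", (elapsed * 1_000_000_000).round)'
}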

Hi,

Which one is not added?
Do you see the other two?

Hey @Cad ,

No, I don't see any of them.

Thank you,
Vaseem

Hey @Cad ,

I found the solution myself, but not with the above code. Instead, I'm using the awesome endpoint provided by the Elasticsearch Node Stats API | Logstash Reference [8.11] | Elastic

And I've used:

localhost:9600/_node/stats/pipelines?pretty
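
For example, with curl:

curl -XGET 'localhost:9600/_node/stats/pipelines?pretty'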

And I got the output like this:

{
                        "id": "a1756021d535d74b0f29412d959bb091cbddf78bce667bba8c3a4bd82b169cc2",
                        "failures": 2,
                        "events": {
                            "out": 927,
                            "duration_in_millis": 1274,
                            "in": 927
                        },
                        "name": "grok",
                        "matches": 925,
                        "patterns_per_field": {
                            "source_query": 1
                        }
                    },
                    {
                        "id": "84f54948542022bfb6ea79821ff56800a92d34848b7f69577928773286794507",
                        "events": {
                            "out": 2,
                            "duration_in_millis": 39,
                            "in": 2
                        },
                        "name": "json"
                    }

From the above output, my understanding is that the json filter took in a total of 2 events and parsed/transformed/processed them in 39 milliseconds, which is basically 39/2 = 19.5 milliseconds per document. Can you please correct me if my understanding is not correct?

Also, can you please tell me the difference between these

duration_in_millis

vs

queue_push_duration_in_millis

Thanks,
Vaseem

Hi vaseemQA,

I think so; just to be more precise, it's an average per event.

I have never used this API, but according to the documentation, queue_push_duration_in_millis seems to be related to persistent queues.

This answer from a previous topic seems pretty accurate about what they mean, and it matches my understanding of the documentation.

Sorry for not being able to be more helpful on this subject.

Cad.


Hey @Cad ,

I need help with the json filter plugin. The thing is, I have to capture a value from one JSON path, but sometimes the JSON structure changes. How can we achieve that, do we have any fallback for that?

For example, see the below code, please.

json {
            source => "source_query"
            target => "parsedJson"
            add_field => { "es_someId" => "%{[parsedJson][query][bool][must][0][term][someId][value]}"}
          if "es_someId" =~ "parsedJson" {
                update => {"es_someId" => "%{[parsedJson][query][bool][filter][0][bool][must][3][bool][should][0][term][someId][value]}"}
            }
            remove_field => [tags]
        }

but it is giving a failure exception like the one below:

Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, => at line 69, column 16 (byte 2013) after filter

And at line number 69 there is an if statement, which is if "es_someId" =~ "parsedJson" { as shown in the above code example.

Can you please help me?

Thanks in Advance,
Vaseem

I think here you want to check the content of es_someId. In a condition, to access a field value you have to use this syntax:
if [es_someId] =~ "parsedJson" {
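
Also note that a conditional cannot be placed inside a plugin block, and update belongs to the mutate filter, not json. A sketch of the overall structure (keeping your field references) would be:

json {
    source => "source_query"
    target => "parsedJson"
    add_field => { "es_someId" => "%{[parsedJson][query][bool][must][0][term][someId][value]}" }
}
# if the first path did not resolve, the literal "%{[parsedJson]..." string remains in the field
if [es_someId] =~ "parsedJson" {
    mutate {
        update => { "es_someId" => "%{[parsedJson][query][bool][filter][0][bool][must][3][bool][should][0][term][someId][value]}" }
    }
}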

Hey @Cad,

Thanks for that. Can you look at the below problem, please?

Problem: I'm trying to extract a key-value pair from a JSON document, BUT the JSON is not static in nature; it may come in a number of shapes, i.e. the expected key-value pair's JSON path changes dynamically. So in grok we use

break_on_match => true

It lets us check the match against a number of patterns, and if one of the patterns matches, it won't check the others. Similarly, do we have anything to match a value against different JSON paths?
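
For reference, a sketch of how we use it in grok (the patterns here are placeholders):

grok {
    # patterns are tried in order; with break_on_match => true (the default),
    # grok stops at the first pattern that matches
    break_on_match => true
    match => { "source_query" => [
        "PATTERN_ONE_%{INT:someId}",
        "PATTERN_TWO_%{INT:someId}"
    ] }
}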

Thanks in Advance,

Vaseem

I'm not aware of an option that can do that.
But you can try to use the ruby filter to browse your fields and find the data you want.
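
For example, a minimal sketch of such a ruby filter, assuming the document was already parsed into a parsedJson field and that you want the first someId term found at any depth:

ruby {
    code => '
        # iterative depth-first search for the first "someId" object anywhere in the parsed JSON
        stack = [event.get("parsedJson")]
        until stack.empty?
            node = stack.pop
            if node.is_a?(Hash)
                if node["someId"].is_a?(Hash)
                    event.set("es_someId", node["someId"]["value"])
                    break
                end
                stack.concat(node.values)
            elsif node.is_a?(Array)
                stack.concat(node)
            end
        end
    '
}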

Hey @Cad ,

Yeah, finally I've moved to ruby instead of the json plugin, as the JSON is not static in nature. Below is the logic I used; let me know if there is a better way.

if "_grokparsefailure" in [tags]{
         
            ruby{
                 code => 'event.get("message")
                 if event.get("message").include? "someId"
                     someIdIndexStarts=event.get("message").index("someId")
                    someIdValueIndexStart=someIdIndexStarts+19
                     lengthOfTotalString=event.get("message").length
                     for i in someIdValueIndexStart..lengthOfTotalString
                         if event.get("message").slice(i) == ","
                             endOfSomeIdValueIndex=i
                             event.set("es_someId",event.get("message").slice(someIdValueIndexStart..endOfSomeIdValueIndex-1));
                             break;
                         end
                     end
                 end'
            }
    }

Thanks for your support all the time, @Cad.

Vaseem.