Using drop to filter messages

Hi All,

I am using the grok filter to parse messages coming into Logstash from Filebeat. We are on the ELK 7.6.2 stack.

I need to keep and process only messages like the following one from the GC log, and drop everything else. Note that pretty much every value is variable:

[2022-06-01T16:47:10.415+0000][info][gc             ] GC(15217) Pause Full (Diagnostic Command) 2178M->1036M(2560M) 1134.387ms

Please advise on what regex I should use to make the following work. So far I have this:

    filter {
        if [type] == "tv_gclog_analysis" {
            grok {
                match => { "message" => '\[%{TIMESTAMP_ISO8601:createdTime}\]\[%{WORD:logLevel}\]+%{GREEDYDATA:message} %{GREEDYDATA:heapDrop}\(%{DATA:maxHeap:int}\) %{GREEDYDATA:timeTaken:int}' }
            }

            if ([message] !~ "Full") {
                drop { }
            }
        }
    }

This does not work.

If I take out the drop part above then I do see all messages coming in and getting indexed.

Thanks

Indeed. If you remove the drop {} and look at the event, you will see that [message] is an array, because you take a field called [message] and use grok to extract a field called [message] from it, so you end up with

    "message" => [
    [0] "[2022-05-24T02:15:20.979+0000][info][gc             ] GC(187) Pause Full (G1 Evacuation Pause) 2559M->1698M(2560M) 724.899ms",
    [1] "[gc             ] GC(187) Pause Full (G1 Evacuation Pause)"
    ]

It strikes me as unlikely that that is useful to you. Perhaps rename the grok field to gcmessage, or else set the overwrite option on the grok filter.
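
A minimal sketch of both options. The pattern here is shortened for illustration and is an assumption, not the original pattern; gcmessage is the field name suggested above, and overwrite is the grok filter's option for replacing an existing field instead of appending to it:

```
filter {
    # Option 1: capture the remainder into a differently named field,
    # so that [message] stays a plain string.
    grok {
        match => { "message" => '\[%{TIMESTAMP_ISO8601:createdTime}\]\[%{WORD:logLevel}\]%{GREEDYDATA:gcmessage}' }
    }

    # Option 2 (use instead of option 1, not in addition to it):
    # keep capturing into [message], but replace the original value
    # rather than turning the field into an array.
    # grok {
    #     match     => { "message" => '\[%{TIMESTAMP_ISO8601:createdTime}\]\[%{WORD:logLevel}\]%{GREEDYDATA:message}' }
    #     overwrite => [ "message" ]
    # }
}
```

With either option, a later conditional on [message] sees a single string instead of a two-element array.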

In either case, why do the grok if you are going to throw away the results? Move the drop before the grok.

    if [message] !~ "Full" { drop {} }
    grok { ...
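
As a fuller sketch of that ordering, assuming the sample line from the question is representative. The pattern and the field names gcTags, gcId, cause, heapBefore, heapAfter and timeTakenMs are hypothetical and untested against the real log:

```
filter {
    if [type] == "tv_gclog_analysis" {
        # Discard anything that is not a full-GC line before doing any parsing.
        if [message] !~ "Full" { drop {} }

        grok {
            # No capture is named "message", so [message] is left untouched.
            match => { "message" => '\[%{TIMESTAMP_ISO8601:createdTime}\]\[%{WORD:logLevel}\]\[%{DATA:gcTags}\s*\] GC\(%{INT:gcId:int}\) Pause Full \(%{DATA:cause}\) %{INT:heapBefore:int}M->%{INT:heapAfter:int}M\(%{INT:maxHeap:int}M\) %{NUMBER:timeTakenMs:float}ms' }
        }
    }
}
```

Lines that contain "Full" but carry no heap figures would not match this pattern, so grok would tag them with _grokparsefailure rather than populate the fields.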

Thanks @Badger

I tried placing the drop part before the grok and saw no message(s) coming in. This is how it looks now:

    filter {
        if [type] == "tv_gclog_analysis" {
            if [message] !~ "Full" { drop {} }

            grok {
                match => { "message" => '\[%{TIMESTAMP_ISO8601:createdTime}\]\[%{WORD:logLevel}\]+%{GREEDYDATA:message} %{GREEDYDATA:heapDrop}\(%{DATA:maxHeap:int}\) %{GREEDYDATA:timeTaken:int}' }
            }
        }
    }

I have set up rubydebug and noticed nothing coming in.

How about doing the reverse, i.e. only process logs that have the word Full in them:

    if ("Full" in [message]) {
        grok {   }
    }
    else {
        drop {}
    }

Thanks. I tried the following and still don't see messages coming in:

    filter {
        if [type] == "tv_gclog_analysis" {
            if "Full" in [message] {
                grok {
                    match => { "message" => '\[%{TIMESTAMP_ISO8601:createdTime}\]\[%{WORD:logLevel}\]+%{GREEDYDATA:message} %{GREEDYDATA:heapDrop}\(%{DATA:maxHeap:int}\) %{GREEDYDATA:timeTaken:int}' }
                }
            }
            else {
                drop {}
            }
        }
    }

Try this. It is possible that the event is getting inside the if, but your grok pattern does not match and so produces nothing:

    filter {
        if [type] == "tv_gclog_analysis" {
            if "Full" in [message] {
                mutate { add_field => { "zaeemmasooddddddd" => "AAAAAAAAAAAAAAAAAA" } }
                #grok {
                #    match => { "message" => '\[%{TIMESTAMP_ISO8601:createdTime}\]\[%{WORD:logLevel}\]+%{GREEDYDATA:message} %{GREEDYDATA:heapDrop}\(%{DATA:maxHeap:int}\) %{GREEDYDATA:timeTaken:int}' }
                #}
            }
            else {
                drop {}
            }
        }
    }

If this works, you will see a lot of AAAAAAAAAAAAAAAAAA on your screen. Then you can fix your grok.

Thanks. With the following I see no messages coming in:

    filter {
        if [type] == "tv_gclog_analysis" {
            if "Full" in [message] {
                mutate { add_field => { "zaeemmasooddddddd" => "AAAAAAAAAAAAAAAAAA" } }
                #grok {
                #    match => { "message" => '\[%{TIMESTAMP_ISO8601:createdTime}\]\[%{WORD:logLevel}\]+%{GREEDYDATA:message} %{GREEDYDATA:heapDrop}\(%{DATA:maxHeap:int}\) %{GREEDYDATA:timeTaken:int}' }
                #}
            }
            else {
                drop {}
            }
        }
    }

The rubydebug log file does not get populated.

I can for sure see messages containing "Full" in the gc log as follows:

[2022-06-02T19:37:05.443+0000][info][gc             ] GC(23145) Pause Full (Diagnostic Command) 1967M->1114M(2560M) 643.338ms

You have to debug this. If the event is not even getting into the inner if, then it may not be getting past if [type] == "tv_gclog_analysis" either. Start from there, one step at a time: print everything after if [type] == ..., then after if "Full" in [message], and so on.
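
That step-by-step check could be sketched as follows, with no drop anywhere so that nothing is discarded while debugging. The tag names are hypothetical; the tags on each printed event show how far it got:

```
filter {
    # Step 1: every event that reaches this filter block gets this tag.
    mutate { add_tag => [ "reached_filter" ] }

    if [type] == "tv_gclog_analysis" {
        # Step 2: the type condition matched.
        mutate { add_tag => [ "type_matched" ] }

        if "Full" in [message] {
            # Step 3: the substring test matched.
            mutate { add_tag => [ "full_matched" ] }
        }
    }
}

output {
    # Print every event, including its tags, to the console.
    stdout { codec => rubydebug }
}
```

If even reached_filter never shows up, the problem lies outside these conditionals, for example in how the filter block is placed in the pipeline configuration.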

Thank you so much!

The problem was in how I had set up the filter: it was nested inside another one.

I can see only "Full" messages coming in as follows:

$ grep Full 6044_rubydebug.txt 
        [0] "[2022-06-02T20:39:43.996+0000][info][gc             ] GC(23418) Pause Full (Diagnostic Command) 1946M->1066M(2560M) 831.805ms",
        [1] "[gc             ] GC(23418) Pause Full (Diagnostic Command)"
        [0] "[2022-06-02T20:39:43.165+0000][info][gc,start       ] GC(23418) Pause Full (Diagnostic Command)",

Now the issue is I only want to see the line carrying the important stuff which is:

[2022-06-02T20:45:20.349+0000][info][gc             ] GC(23462) Pause Full (Diagnostic Command) 1328M->1088M(2560M) 471.188ms

I guess another regex is needed? I need to filter out the following two:

 "[gc             ] GC(23462) Pause Full (Diagnostic Command)"
 "[2022-06-02T20:45:19.878+0000][info][gc,start       ] GC(23462) Pause Full (Diagnostic Command)",

Then you can just do this:

    if "gc             " in [message] {
        #dosomething
    }
    else { drop {} }

I don't know whether the spaces are significant in this comparison. If this does not work, you can do the reverse and drop the unwanted line:

if "gc,start" in [message] { drop {} }
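
Another possibility is to test for what only the summary line has. Both the summary line and the gc,start line contain "Pause Full (Diagnostic Command)", so that phrase cannot tell them apart; the heap transition and the duration can. The sketch below is untested and assumes that [message] is still a plain string when the conditional runs, i.e. that it comes before grok or that grok does not write into [message]:

```
filter {
    if [type] == "tv_gclog_analysis" {
        # Keep only lines shaped like
        # "... Pause Full (...) 1328M->1088M(2560M) 471.188ms"; drop the rest.
        if [message] !~ /Pause Full .*\d+M->\d+M\(\d+M\) [\d.]+ms/ {
            drop {}
        }
    }
}
```

The shorter "[gc             ] GC(23462) ..." entry appears to be the [1] element of the [message] array described earlier in the thread rather than a separate event, so it should go away once grok stops capturing into [message].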

Thank You @elasticforme
