Grok (or any alternative) to search for keywords in logs

Hello,

Is it possible to create keywords in Logstash by searching for them in the message?

The logs are formatted in the following way; however, the key-value data is not always in the same place, because it can be embedded in other messages. For example, from the following I would like to create keyword-value pairs:
protocol=HTTP/1.1
uname=test-camel-svc
method=PUT
registert=02384827
etc.

INFO Test-Info:[protocol=HTTP/1.1, uname=test-camel-svc, method=PUT, registert=02384827, server=TESTVM9:8083, tracker_id=84d9d0231-2231-4a11-5e2d-88afa5ee12c6, bda_id=4352, testbda_id=?/?, cause=incoming_request, calling_server=XX.XXX.XXX.XXX, request_time=N/A, reception_date=2020-01-00T01:00:00.000+0000, time_elapsed_ms=37] 3123344 --- [XNIO-1 task-1] g.u.m.commons.logging.MDCLoggingFilter   : processing_end, processing_end

Note that the logs could be something completely irrelevant (for example a java error) or (and this is the problem): the whole info could be included in other messages with slightly different format:

WARN -- extra-characters .blahblah Test-Info:[protocol=HTTP/1.1, uname=test-camel-svc, method=PUT, registert=02384827, server=TESTVM9:8083, tracker_id=84d9d0231-2231-4a11-5e2d-88afa5ee12c6, bda_id=4352, testbda_id=?/?, cause=incoming_request, calling_server=XX.XXX.XXX.XXX, request_time=N/A, reception_date=2020-01-00T01:00:00.000+0000, time_elapsed_ms=37] 3123344 --- [XNIO-1 task-1] g.u.m.commons.logging.MDCLoggingFilter   : processing_end, processing_end some more-characters-here

Is there any way to do this with grok (or any alternative to grok)?

You want what is inside the square brackets after Test-Info:, right?

For example, in this message:

INFO Test-Info:[protocol=HTTP/1.1, uname=test-camel-svc, method=PUT, registert=02384827, server=TESTVM9:8083, tracker_id=84d9d0231-2231-4a11-5e2d-88afa5ee12c6, bda_id=4352, testbda_id=?/?, cause=incoming_request, calling_server=XX.XXX.XXX.XXX, request_time=N/A, reception_date=2020-01-00T01:00:00.000+0000, time_elapsed_ms=37] 3123344 --- [XNIO-1 task-1] g.u.m.commons.logging.MDCLoggingFilter   : processing_end, processing_end

You want these fields:

protocol=HTTP/1.1, uname=test-camel-svc, method=PUT, registert=02384827, server=TESTVM9:8083, tracker_id=84d9d0231-2231-4a11-5e2d-88afa5ee12c6, bda_id=4352, testbda_id=?/?, cause=incoming_request, calling_server=XX.XXX.XXX.XXX, request_time=N/A, reception_date=2020-01-00T01:00:00.000+0000, time_elapsed_ms=37

If so, you can use a combination of dissect and kv.

filter {
    dissect {
        mapping => {
            "message" => "%{}Test-Info:[%{kvMsg}]%{}"
        }
    }
    kv {
        source => "kvMsg"
        value_split => "="
        field_split => ", "
    }
}

The dissect filter above puts everything between the square brackets after Test-Info: into a field called kvMsg. The kv filter then parses that field and creates the individual fields.


Interesting! The problem is that one extra space can break this. Is it possible to allow for an extra space or other slight variations somehow?

Extra space where? Can you share an example?

I haven't fully tested this, but I'm curious whether it will work for you. It assumes Test-Info is in your log message.

filter {
  if "Test-Info" in [message] {
    kv {
      source => "message"
      field_split => ", "
      value_split => "="
      trim_key => " "
      trim_value => " "
      prefix => ""
    }
  }
}

It should ignore log messages that do not match your desired pattern and handle extra spaces.

Very interesting!
But I only want kv to process the part of the message that comes after "Test-Info:[" and before "]", because the message can contain more data before and after it.
Is there a way to limit kv to this part only and continue processing the rest with something else (e.g. grok)?

There could be a) extra spaces inside the Test-Info part and b) more data before and after Test-Info that kv cannot handle, so after kv I need to continue processing with something else (e.g. grok). Kv must process only the part inside Test-Info:[kv-data].
For example, notice the extra space after protocol (in the other reply Sunile_Manjee suggested using trim_value and trim_key, but that doesn't limit the processing to the first "[kv-data]"):

INFO ----Further.data-with-different-format ----WARN Test-Info:[protocol=HTTP/1.1  , uname=test-camel-svc, method=PUT, registert=02384827 , server=TESTVM9:8083, tracker_id=84d9d0231-2231-4a11-5e2d-88afa5ee12c6, bda_id=4352, testbda_id=?/?, cause=incoming_request, calling_server=XX.XXX.XXX.XXX, request_time=N/A, reception_date=2020-01-00T01:00:00.000+0000, time_elapsed_ms=37] 3123344 --- [XNIO-1 task-1] g.u.m.commons.logging.MDCLoggingFilter   : processing_end, processing_end - more extra data;separatedwithother-symbols

If you have variable fields, then you have to use Grok:

grok {
    match => { "message" => "%{DATA}:%{SPACE}\[%{DATA:msg}\]%{SPACE}%{POSINT}%{SPACE}%{GREEDYDATA}" }
}

Yes, but in my case perhaps I could first use an if statement to check whether Test-Info is included. If it is, use kv to get the key-value pairs; if not, continue with grok.
Is this OK with Logstash?

Whatever you want.
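
A minimal sketch of that branching, assuming the dissect and kv settings from the earlier reply. The grok pattern in the else branch and the field names level and rest are hypothetical placeholders; replace them with whatever fits your other log lines.

```conf
filter {
  if "Test-Info" in [message] {
    # Lines that carry the Test-Info block: cut out the bracketed part, then split it.
    dissect {
      mapping => {
        "message" => "%{}Test-Info:[%{kvMsg}]%{}"
      }
    }
    kv {
      source => "kvMsg"
      value_split => "="
      field_split => ", "
    }
  } else {
    # Everything else: hypothetical pattern, shown only to illustrate the branch.
    grok {
      match => { "message" => "%{LOGLEVEL:level}%{SPACE}%{GREEDYDATA:rest}" }
    }
  }
}
```

Filters run in the order they are written, so you can also place further grok or mutate filters after the conditional to process the rest of the message.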


If the extra space only appears in those places, you can apply a mutate filter to the kvMsg field before the kv filter.

Something like this:

mutate {
    gsub => ["kvMsg"," , ",", "]
}

This will turn all the instances of <SPACE>,<SPACE> into ,<SPACE> and avoid the need to trim the fields individually.
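
Note that your sample has two spaces before the comma after protocol=HTTP/1.1, and the literal pattern above removes only one of them. Since gsub treats the pattern as a regular expression, a slightly more tolerant variant could look like the sketch below. It assumes the separators are always a comma surrounded by zero or more plain spaces.

```conf
mutate {
    # " *, *" matches a comma with any number of spaces on either side
    # and rewrites it to the ", " form.
    gsub => ["kvMsg", " *, *", ", "]
}
```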

Can you try this? It should only fetch the key-value pairs between Test-Info:[ and ].

filter {
  if "Test-Info" in [message] {
    grok {
      match => {
        "message" => "Test-Info:\[%{GREEDYDATA:test_info}\]"
      }
    }
    kv {
      source => "test_info"
      field_split => ", "
      value_split => "="
      trim_key => " "
      trim_value => " "
      prefix => ""
    }
  }
}
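
One thing worth checking in the snippet above: GREEDYDATA is greedy, so the capture runs up to the last ] in the line. In your sample that is the bracket closing [XNIO-1 task-1], not the one closing the Test-Info block. A sketch of the same grok with the non-greedy DATA pattern, which stops at the first ], would be:

```conf
grok {
  match => {
    # DATA is non-greedy, so test_info ends at the first closing bracket.
    "message" => "Test-Info:\[%{DATA:test_info}\]"
  }
}
```

This assumes none of the values inside the Test-Info block contains a ] character.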

Thank you for the answers, but for some reason it does not seem to work. I don't get any error messages. The test_info field is created and I can see it in Kibana, but kv does not seem to do anything. The other solution, the one that uses dissect, seems to be working.

Thank you! This works perfectly!!
One last question: is it possible to avoid adding kvMsg to the final event (i.e. use it temporarily and then delete it)?

You can add remove_field => ["kvMsg"] to the kv filter; the field is then removed if the filter is successful.

    kv {
        source => "kvMsg"
        value_split => "="
        field_split => ", "
        remove_field => ["kvMsg"]
    }

Or you could use a [@metadata] field, which will not be present in the output. Just replace kvMsg with [@metadata][kvMsg], for example.
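
As a sketch, the [@metadata] variant of the earlier dissect and kv pair might look like this. I haven't run it against your data, so treat the nested field reference inside the dissect mapping as something to verify on your Logstash version.

```conf
filter {
    dissect {
        mapping => {
            # Store the bracketed part under @metadata so it never reaches the output.
            "message" => "%{}Test-Info:[%{[@metadata][kvMsg]}]%{}"
        }
    }
    kv {
        source => "[@metadata][kvMsg]"
        value_split => "="
        field_split => ", "
    }
}
```

With this approach no remove_field is needed, because fields under [@metadata] are not sent to outputs.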
