Parsing message with plain text and json

(Sunil Chaudhari) #1

I have to filter a log line that is a mix of plain text and a JSON message.
A sample line is:
[ERROR] ABCD posting failed! { StatusCodeError: 422 - {"error":{"message":"Duplicated ticket","status_code":422}}

Can anyone guide me on which filters I can use for this case, and how?
One way is to write a grok pattern for the whole message, but I don't think that's a good solution.



You could possibly use a grok or a dissect filter for that.

I don't know dissect very well so I will give a grok example

Using the following grok pattern

    \[%{DATA:level}\]%{SPACE}%{DATA:msg}\{%{SPACE}%{DATA:status}%{SPACE}-%{SPACE}%{DATA:json}\}$

Would result in

  "msg": "ABCD posting failed! ",
  "level": "ERROR",
  "json": "{\"error\":{\"message\":\"Duplicated ticket\",\"status_code\":422}",
  "status": "StatusCodeError: 422"

Then you can use a json filter on the json field above to further parse it.
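
The two steps (split the line, then parse the JSON part) can be sketched outside Logstash in Python. This is only an illustration, not the Logstash implementation: the regular expression and the `parse_line` helper are assumptions made for this example. Note that the `json` value shown above has lost its final closing brace, so this sketch captures the JSON up to the end of the line instead.

```python
import json
import re

LINE = '[ERROR] ABCD posting failed! { StatusCodeError: 422 - {"error":{"message":"Duplicated ticket","status_code":422}}'

# Hypothetical pattern for this one line format: level, free text,
# status text, then the JSON payload running to the end of the line.
PATTERN = re.compile(
    r"^\[(?P<level>[^\]]+)\]\s*"
    r"(?P<msg>.*?)\s*\{\s*"
    r"(?P<status>.*?)\s+-\s+"
    r"(?P<json>\{.*\})\s*$"
)


def parse_line(line):
    """Split the line into fields and decode the trailing JSON payload."""
    m = PATTERN.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Equivalent in spirit to running a json filter on the captured field.
    fields["payload"] = json.loads(fields.pop("json"))
    return fields


event = parse_line(LINE)
print(event)
```

Keeping the complete `{...}` text in the captured field matters, because a JSON parser rejects a payload with unbalanced braces.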

An example using dissect would be

    dissect { mapping => { "message" => "[%{level}] %{errorMessage} { %{} - %{json}" } }
    json { source => "json" }
(Sunil Chaudhari) #4

Thanks @A_B,
The above sample worked fine with this.
However, I have another sample which is a bit different.

[ERROR] @verifyFavourite: { status: 403, code: 'LIMIT_REACHED', title: 'Maximum numbers of favourites reached' }

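The grok pattern I wrote for this:

    \[%{DATA:level}\]%{SPACE}%{DATA:errorDescription}\:%{SPACE}%{GREEDYDATA:errorJsonMessage}\}$

Somehow the escaping backslashes were removed from the post when I quoted the text, I don't know why, so I added extra backslashes.

The above grok pattern gives me the JSON below.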

  "level": [
  "SPACE": [
      " ",
      " "
  "errorDescription": [
  "errorJsonMessage": [
      "{ status: 403, code: 'LIMIT_REACHED', title: 'Maximum numbers of favourites reached' "

Note the single quotes around the value 'LIMIT_REACHED' in the JSON.

However, when I feed the same data to my Logstash config on standard input using the echo command, the output is different, and I get a jsonparsefailure for the json field.


echo '[ERROR] @verifyFavourite: { status: 403, code: 'LIMIT_REACHED', title: 'Maximum numbers of favourites reached' }' |bin/logstash -f logstash_ec2_grok.conf /esarchive/test_environment/


In this case the value comes out without the single quotes: LIMIT_REACHED
Why is there a difference between the grok debugger output and the console output?
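
A likely explanation, offered as an assumption since the thread does not confirm it: the whole line is wrapped in single quotes on the command line, so each inner single quote closes and reopens the shell's quoting and is removed before echo ever sees the text. The sketch below uses Python's `shlex` module, which follows POSIX shell quoting rules, to show what echo would receive. `simulate_echo` is a hypothetical helper written for this example.

```python
import shlex

COMMAND = (
    "echo '[ERROR] @verifyFavourite: { status: 403, code: 'LIMIT_REACHED', "
    "title: 'Maximum numbers of favourites reached' }'"
)


def simulate_echo(command):
    """Tokenize like a POSIX shell, then join the arguments as echo would."""
    argv = shlex.split(command)
    return " ".join(argv[1:])


print(simulate_echo(COMMAND))
```

If this is the cause, wrapping the line in double quotes instead, or reading it from a file, should keep the single quotes intact.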



I don't think that is valid JSON. E.g. jq on my machine does not like it.
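
A quick check along the same lines as jq, sketched in Python. `is_valid_json` is a hypothetical helper, and the `STRICT` string is a hand-written double-quoted version of the same data, added only for comparison.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# The object as it appears in the log: unquoted keys, single-quoted strings.
LOGGED = "{ status: 403, code: 'LIMIT_REACHED', title: 'Maximum numbers of favourites reached' }"
# Hand-written strict JSON equivalent (an assumption, not from the log).
STRICT = '{"status": 403, "code": "LIMIT_REACHED", "title": "Maximum numbers of favourites reached"}'

print(is_valid_json(LOGGED))
print(is_valid_json(STRICT))
```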

That aside, if you have more than one line format in a log file you have to handle that in some way... Staying with grok, you can make one über pattern that handles all scenarios (with conditional and/or optional fields), or make one named pattern per line type and match each line against any of them.

To elaborate a bit:
Make a pattern file, e.g. /etc/logstash/patterns/my_log_patterns, with the following content (just grabbing the patterns from above)

MY_PATTERN_01 \[%{DATA:level}\]%{SPACE}%{DATA:msg}\{%{SPACE}%{DATA:status}%{SPACE}-%{SPACE}%{DATA:json}\}$
MY_PATTERN_02 \[%{DATA:level}\]%{SPACE}%{DATA:errorDescription}\:%{SPACE}%{GREEDYDATA:errorJsonMessage}\}$


Then you can do

match          => { "message" => "%{COMBINED_PATTERNS}" }
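
The definition of COMBINED_PATTERNS did not survive in the text above; presumably it is an entry in the same pattern file that matches either MY_PATTERN_01 or MY_PATTERN_02. The idea of "one named pattern per line type" can be sketched in Python as follows. The regular expressions are a hand translation of the two grok patterns (DATA as a lazy match, GREEDYDATA as a greedy match, SPACE as optional whitespace), and `match_line` is a hypothetical helper, so treat this as an illustration rather than what grok does internally.

```python
import re

# Hand translation of the two grok patterns: DATA -> .*?, GREEDYDATA -> .*,
# SPACE -> \s*. Order matters: the more specific pattern is tried first.
PATTERNS = [
    ("MY_PATTERN_01", re.compile(
        r"\[(?P<level>.*?)\]\s*(?P<msg>.*?)\{\s*(?P<status>.*?)\s*-\s*(?P<json>.*?)\}$")),
    ("MY_PATTERN_02", re.compile(
        r"\[(?P<level>.*?)\]\s*(?P<errorDescription>.*?):\s*(?P<errorJsonMessage>.*)\}$")),
]


def match_line(line):
    """Return (pattern name, captured fields) for the first pattern that matches."""
    for name, pattern in PATTERNS:
        m = pattern.search(line)
        if m:
            return name, m.groupdict()
    return None


line1 = '[ERROR] ABCD posting failed! { StatusCodeError: 422 - {"error":{"message":"Duplicated ticket","status_code":422}}'
line2 = "[ERROR] @verifyFavourite: { status: 403, code: 'LIMIT_REACHED', title: 'Maximum numbers of favourites reached' }"

print(match_line(line1))
print(match_line(line2))
```

The second pattern is loose enough to match the first line format as well, which is why the more specific pattern is listed first.
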
(Sunil Chaudhari) #6

Hi @A_B,
Thanks for this approach.
I am struggling with a JSON parse failure.

This is my input:

{ expedia: '{"status":{"code":500,"description":"Internal server error"},"data":"Cannot read property 'toFixed' of undefined"}',status: 'PAYMENT_FAILED' }

Grok Pattern:


When I use the json filter on expediaDetails, it fails.

Logstash output for expediaDetails:

"expediaDetails" => "{ expedia: status:code:500,status: PAYMENT_FAILED } { expedia: status:description:Internal server error,status: PAYMENT_FAILED } { expedia: data:Cannot read property \'toFixed\' of undefined,status: PAYMENT_FAILED } "

The input is nested JSON, and the Logstash output is very different from the output in the grok debugger.
From the Logstash output, I am not able to put the values into new fields. For example, I tried the approaches below. Neither works.

ruby {
code => 'event.set("statusCode1", event.get("[expediaDetails][expedia][status][code]"))'
}

mutate { add_field => {
"statusCode" => "%{[expediaDetails][expedia][status][code]}"
} }

I don't know how to put these values into a particular field.
As I understand it, the Logstash output prints the message like an array of values.
So first it prints expedia:status:code:500, then expedia:status:description, and so on. But the actual input structure is different.
Please help with this.



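Your pattern goes to the end of the line, but the last part is not JSON. Only the part between the single ticks is, i.e.

    {"status":{"code":500,"description":"Internal server error"},"data":"Cannot read property 'toFixed' of undefined"}
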
So, I would expect JSON parse failures...

(Sunil Chaudhari) #8

Sorry, can you please explain a bit more? I couldn't understand.
Also, please suggest a solution.



JSON requires double quotes for its strings.

Which means that the start and end of the following are not valid JSON:

{ expedia: '{"status":{"code":500,"description":"Internal server error"},"data":"Cannot read property 'toFixed' of undefined"}',status: 'PAYMENT_FAILED' }

Only the middle part, the text within the single quotes, is valid JSON. I have learned this the hard way :slight_smile:
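
One way to act on this, sketched in Python rather than as a Logstash config: capture only the text between the single quotes after `expedia:` and parse that as JSON. The regular expression and the `extract` helper are assumptions made for this example; in Logstash the same idea would mean narrowing the capture so the field handed to the json filter holds only the embedded JSON.

```python
import json
import re

LINE = """{ expedia: '{"status":{"code":500,"description":"Internal server error"},"data":"Cannot read property 'toFixed' of undefined"}',status: 'PAYMENT_FAILED' }"""

# Hypothetical pattern: the JSON object inside the first pair of single quotes,
# followed by the single-quoted status value of the outer wrapper.
PATTERN = re.compile(
    r"expedia:\s*'(?P<json>\{.*\})'\s*,\s*status:\s*'(?P<status>[^']*)'"
)


def extract(line):
    """Parse the embedded JSON and return it alongside the outer status."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {
        "expediaDetails": json.loads(m.group("json")),
        "status": m.group("status"),
    }


result = extract(LINE)
print(result["expediaDetails"]["status"]["code"])
print(result["status"])
```

The single quotes around 'toFixed' are not a problem here, because they sit inside a double-quoted JSON string.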
