Logstash not able to parse logs with spaces between key-value pairs in a JSON object

I have a log containing a JSON object. The log gets parsed if the JSON object has no spaces; if it has spaces between the key and value, it is not getting parsed.

Configuration file used:

input {
    syslog {
        port => 3011
    }
}

filter {
    grok {
        match => { "message" =>
            [
                "%{SYSLOGTIMESTAMP:timestamp4} %{DATA:time_ms}|%{DATA:field1}|%{DATA:field2}|99|%{DATA:field3}|%{DATA:field4}|%{DATA:field5}|%{DATA:field6}|%{DATA:field7}|%{DATA:field8}|%{DATA:field9}|%{DATA:field10}|%{DATA:field11}|%{DATA:field12}|%{GREEDYDATA:field13}"
            ]
        }
    }
    date {
        match => ["timestamp4", "MMM dd HH:mm:ss"]
    }
    if [field13] {
        mutate {
            add_field => {"log_type" => "my-logs"}
        }
    }
}

output {
    if [log_type] == "my-logs" {
        stdout { codec => rubydebug }
        elasticsearch {
            hosts => ["ES_HOST:9200"]
            index => "my-logs-000001"
        }
    }
}

Logs getting parsed:
echo "Mar 21 13:27:11 11:11.366293|dataadwhw1|ebsmp4713user5_@maiator|99|4064|22|SUCCESS|data|19|UA101|10.1.1.70|https|data.com|{"wrg_id":"200000337"}|200" | nc localhost 3011

echo "Mar 21 13:27:11 11:11.366293|dataadwhw1|ebsmp4713user5_@maiator|99|4064|22|SUCCESS|data|19|UA101|10.1.1.70|https|data.com|{"wrg_id" :"200000337"}|200" | nc localhost 3011

Log not getting parsed:
echo "Mar 21 13:27:11 11:11.366293|dataadwhw1|ebsmp4713user5_@maiator|99|4064|22|SUCCESS|data|19|UA101|10.1.1.70|https|data.com|{"wrg_id": "200000337"}|200" | nc localhost 3011
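A side note on the test commands themselves: the inner double quotes in these echo strings are consumed by the shell before nc ever sees the data, so the JSON key and value are actually sent without quotes; only the colon and space placement differ between the three variants. A quick check of just the JSON fragment (assuming a POSIX shell):

```shell
# The inner double quotes toggle shell quoting on and off and are removed,
# so only the colon/space placement survives in the payload sent to nc.
echo "{"wrg_id": "200000337"}"   # prints {wrg_id: 200000337}
echo "{"wrg_id":"200000337"}"    # prints {wrg_id:200000337}
```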

Any workaround to parse this log?

    input { generator { count => 1 lines => [
 'Mar 21 13:27:11 11:11.366293|dataadwhw1|ebsmp4713user5_@maiator|99|4064|22|SUCCESS|data|19|UA101|10.1.1.70|https|data.com|{"wrg_id":"200000337"}|200',
 'Mar 21 13:27:11 11:11.366293|dataadwhw1|ebsmp4713user5_@maiator|99|4064|22|SUCCESS|data|19|UA101|10.1.1.70|https|data.com|{"wrg_id" :"200000337"}|200',
 'Mar 21 13:27:11 11:11.366293|dataadwhw1|ebsmp4713user5_@maiator|99|4064|22|SUCCESS|data|19|UA101|10.1.1.70|https|data.com|{"wrg_id": "200000337"}|200'
] } }
output { stdout { codec => rubydebug { metadata => false } } }
filter {
    mutate { remove_field => [ "event", "host", "log" ] }

    grok { match => { "message" => [ "%{SYSLOGTIMESTAMP:timestamp4} %{DATA:time_ms}\|%{DATA:field1}\|%{DATA:field2}\|99\|%{DATA:field3}\|%{DATA:field4}\|%{DATA:field5}\|%{DATA:field6}\|%{DATA:field7}\|%{DATA:field8}\|%{DATA:field9}\|%{DATA:field10}\|%{DATA:field11}\|%{DATA:field12}\|%{GREEDYDATA:field13}" ] } }
    date { match => ["timestamp4", "MMM dd HH:mm:ss"] }
    if [field13] { mutate { add_field => {"log_type" => "my-logs"} } }
}

parses all three lines. You need to give us a reproducible example of what fails.

Do your logs always have this format? This can be parsed with the csv filter using | as the separator; there is no need to use grok.


These 3 logs do get parsed when I use the Grok Debugger tool in Elastic. But when I use the pattern in a setup where Logstash is sending logs to Kibana, the 3rd log doesn't reach Kibana for some reason, and no error is being logged either.

I am not using csv because the timestamp4 and time_ms fields are not separated by a delimiter.

This is not an issue; you can combine a grok filter to extract that part with a csv filter.

Your main issue with the csv filter would be the fact that you have both unquoted and quoted strings in the same line. To make this work you need to trick the csv filter into not treating double quotes as quote characters, which can easily be done with the quote_char option.

The following pipeline will parse your messages:

filter {
    grok {
        match => {
            "message" => "%{SYSLOGTIMESTAMP:timestamp4} %{DATA:time_ms}\|%{GREEDYDATA:csv_message}"
        }
        remove_field => ["message"]
    }
    csv {
        source => "csv_message"
        separator => "|"
        columns => ["[field1]","[field2]","[@metadata][not_used]","[field3]","[field4]","[field5]","[field6]","[field7]","[field8]","[field9]","[field10]","[field11]","[field12]","[field13]"]
        quote_char => "'"
        skip_empty_columns => true
        remove_field => ["csv_message"]
    }
    date {
        match => ["timestamp4", "MMM dd HH:mm:ss"]
    }
    if [field13] {
        mutate {
            add_field => {
                "log_type" => "my-logs"
            }
        }
    }
}

The result is something like this:

{
       "field11" => "data.com",
        "field6" => "data",
       "time_ms" => "11:11.366293",
        "field7" => "19",
        "field2" => "ebsmp4713user5_@maiator",
        "field5" => "SUCCESS",
        "field4" => "22",
        "field9" => "10.1.1.70",
       "field13" => "200",
        "field3" => "4064",
      "log_type" => "my-logs",
        "field8" => "UA101",
    "@timestamp" => 2024-03-21T16:27:11.000Z,
    "timestamp4" => "Mar 21 13:27:11",
        "field1" => "dataadwhw1",
       "field10" => "https",
       "field12" => "{\"wrg_id\": \"200000337\"}"
}
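For what it's worth, the quote_char trick can also be illustrated outside Logstash with Python's standard-library csv module, which has an equivalent quotechar parameter. The line below is the csv_message left over after the grok filter strips the timestamp (a rough analogue, not the csv filter itself):

```python
import csv
import io

# What the grok filter hands to the csv filter: everything after
# "Mar 21 13:27:11 11:11.366293|", including the JSON column with a space.
line = ('dataadwhw1|ebsmp4713user5_@maiator|99|4064|22|SUCCESS|data|19|'
        'UA101|10.1.1.70|https|data.com|{"wrg_id": "200000337"}|200')

# quotechar="'" mimics quote_char => "'": the double quotes inside the
# JSON column are treated as ordinary characters, not field quoting.
row = next(csv.reader(io.StringIO(line), delimiter='|', quotechar="'"))
print(row[12])  # {"wrg_id": "200000337"}
```

The JSON column survives intact, spaces and all, because the splitter only cares about the | separator.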

Thanks for the workaround. But field12 still only gets parsed if it is of the form {"wrg_id":"200000337"} or {"wrg_id" :"200000337"}; the whole log remains unparsed if field12 is {"wrg_id": "200000337"}.

I am also not able to figure out much from the logs; I am just getting a connection closed message:
[logstash.inputs.syslog ][main][d7e74be29670dab531986f0a3c5e7079c4452996e98bbec29bc3c11efe6f59c1] connection closed {:client=>"0:0:0:0:0:0:0:1:49428"}

Not sure what you mean by this; the values you mentioned are all the same. Please share an example message where the log is not working and the output you are getting.

The values are not exactly the same. The two logs which get parsed have either no space between the key and value [Log 1 => field12 is {"wrg_id":"200000337"}] or a space between the key and value before the colon [Log 2 => field12 is {"wrg_id" :"200000337"}].
The log which is not getting parsed has a space between the key and value after the colon [Log 3 => field12 is {"wrg_id": "200000337"}].
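Note that all three variants are valid JSON; whitespace on either side of the colon is insignificant, so any downstream JSON parser treats them identically. A quick check:

```python
import json

# All three field12 variants from the thread parse to the same object.
variants = (
    '{"wrg_id":"200000337"}',    # no space
    '{"wrg_id" :"200000337"}',   # space before the colon
    '{"wrg_id": "200000337"}',   # space after the colon
)
for s in variants:
    print(json.loads(s))  # {'wrg_id': '200000337'} each time
```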

This makes no difference if you are using the filter that I shared; the csv filter will parse the csv message, and the value you have between the separators makes no difference to the csv filter.

I just tested it here, putting in a lot of spaces, and got this output:

{
        "field2" => "ebsmp4713user5_@maiator",
        "field8" => "UA101",
        "field9" => "10.1.1.70",
      "log_type" => "my-logs",
       "field11" => "data.com",
       "field13" => "200",
        "field3" => "4064",
        "field5" => "SUCCESS",
        "field6" => "data",
        "field1" => "dataadwhw1",
        "field4" => "22",
    "timestamp4" => "Mar 21 13:27:11",
       "time_ms" => "11:11.366293",
       "field10" => "https",
    "@timestamp" => 2024-03-21T16:27:11.000Z,
        "field7" => "19",
       "field12" => "{\"wrg_id\"                  :                    \"200000337\"}"
}

So it is not clear what your issue is; you need to share the message that is failing and the output you are getting.

Not sure why, but the log below is still not getting parsed in my setup:
echo "Mar 21 13:27:11 11:11.366293|dataadwhw1|ebsmp4713user5_@maiator|99|4064|22|SUCCESS|data|19|UA101|10.1.1.70|https|data.com|{"wrg_id": "200000337"}|200" | nc localhost 3011

@leandrojmp Can you let me know which versions of Elasticsearch and Logstash you are using? And are you using Docker to set up the ELK stack, or a binary installation?