Logstash cannot parse JSON file correctly

Hello there team, how are you?

I am pretty new to the ELK stack, and right now I have an issue with parsing logs.
We have ~15 machines shipping logs via Filebeat to our Logstash instance.
We are parsing NGINX logs and our own application logs, which are written as JSON, one object per line.
Since there are too many logs, we split the files by hour.

Logstash configuration:

input {
  beats {
    port => 5044
  }
}
filter {
  if "beats_input_codec_plain_applied" in [tags] {
    mutate {
      remove_tag => ["beats_input_codec_plain_applied"]
    }
  }
  if "json" in [tags] {
    mutate {
      replace => [ "message", "%{message}" ]
      gsub => [ "message", "\n", "" ]
    }
    if [message] =~ /^{.*}$/ {
      json { source => "message" }
    }
  } else if [fileset][module] == "nginx" {
    if [fileset][name] == "access" {
      grok {
        match => { "message" => ["%{IPORHOST:[nginx][access][remote_ip]} - %{DATA:[nginx][access][user_name]} \[%{HTTPDATE:[nginx][access][time]}\] \"%{WORD:[nginx][access][method]} %{DATA:[nginx][access][url]} HTTP/%{NUMBER:[nginx][access][http_version]}\" %{NUMBER:[nginx][access][response_code]} %{NUMBER:[nginx][access][body_sent][bytes]} \"%{DATA:[nginx][access][referrer]}\" \"%{DATA:[nginx][access][agent]}\""] }
        remove_field => "message"
      }
      mutate {
        add_field => { "read_timestamp" => "%{@timestamp}" }
      }
      date {
        match => [ "[nginx][access][time]", "dd/MMM/YYYY:H:m:s Z" ]
        remove_field => "[nginx][access][time]"
      }
      useragent {
        source => "[nginx][access][agent]"
        target => "[nginx][access][user_agent]"
        remove_field => "[nginx][access][agent]"
      }
      geoip {
        source => "[nginx][access][remote_ip]"
        target => "[nginx][access][geoip]"
      }
    } else if [fileset][name] == "error" {
      grok {
        match => { "message" => ["%{DATA:[nginx][error][time]} \[%{DATA:[nginx][error][level]}\] %{NUMBER:[nginx][error][pid]}#%{NUMBER:[nginx][error][tid]}: (\*%{NUMBER:[nginx][error][connection_id]} )?%{GREEDYDATA:[nginx][error][message]}"] }
        remove_field => "message"
      }
      mutate {
        rename => { "@timestamp" => "read_timestamp" }
      }
      date {
        match => [ "[nginx][error][time]", "YYYY/MM/dd H:m:s" ]
        remove_field => "[nginx][error][time]"
      }
    }
  }
}
output {
  if "json" in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "api-%{+YYYY.MM.dd}"
      document_type => "api"
      sniffing => true
      template_name => "api-template"
      template => "/opt/bitnami/logstash/newdata/apizendesk.json"
    }
  } else {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "weblogs-%{+YYYY.MM.dd}"
      document_type => "nginx_logs"
    }
  }
  #stdout { codec => rubydebug }
}
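While testing, I sometimes uncomment the stdout output above and feed a small sample file straight into Logstash. Roughly like this (just a sketch; the path is a made-up test file, not our real setup):

input {
  file {
    path => "/tmp/sample.json"        # hypothetical file holding a few of our JSON lines
    start_position => "beginning"     # read the file from the start
    sincedb_path => "/dev/null"       # forget the read position between test runs
  }
}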

And here are a few lines from the JSON file:

{"context":"console-1555926841","location":"xx1","dt":"2019-04-22 10:00:00","ts":1555927200,"ident":"request","mtuser":19163,"zuser":-1,"pid":18127,"action":"LoadedScript:get","page":"api/v2/users/show_many","args":{"query":{"ids":"369978220634"}},"response":,"bt":}

But sometimes we have an error in our code, and it is stored like this:

{"context":"web-51230891bb17712a9470918156dd73362785420123124552527","location":"xx2","dt":"2019-04-20 23:13:32","ts":1555976012,"ident":"Unhandled_error","mtuser":0,"zuser":0,"pid":2216,"error":{"code":500,"type":"DbException","errorCode":2002,"message":"DbConnection failed to open the DB connection.","file":"/home/user/protected/components/filename.php","line":13,"trace":"#0}...

So how do I split this error object into separate fields?
Should I use split, grok, or json_lines?
What would be the best way to do this?
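For reference, here is what I was considering: a minimal sketch that assumes the json filter has already parsed the line, so error is a nested object (the field names are taken from the sample above):

filter {
  if [error] {
    # flatten the nested error object into top-level fields
    mutate {
      rename => {
        "[error][code]"    => "error_code"
        "[error][type]"    => "error_type"
        "[error][message]" => "error_message"
        "[error][file]"    => "error_file"
        "[error][line]"    => "error_line"
      }
    }
  }
}

That way the error fields would show up flat in Kibana, but I am not sure whether flattening is even necessary, or if I should just keep the nested object.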

Thank you, people.

Edit 1: Just to be clear: I have searched this forum, Stack Overflow, Google, and DuckDuckGo, but in the end I think the best option is to ask the community itself :slight_smile:
