Multiple IF conditions - Logstash

Hi,

I am fixing a bigger Logstash config file where I have custom grok patterns, but that is just the tip of the iceberg regarding my problems.

In Filebeat I have multiple log files, and some of them (their log events) show up fine in Kibana, while others do not because their content ends up unparsed in the message field. Some log events are not visible in Kibana at all when I try to filter by tags.

I can fix things when there is one IF statement, but when multiple IF statements are involved I am not sure what the culprit is. Maybe Filebeat is slow, maybe my custom grok patterns are not good enough, or both?

I have no problem fixing Logstash grok patterns, but multiple IF statements I do have a problem with.

You can see the simple apache example I was resolving with the help of Larry on the following link: Kibana not showing dictionary output of log events

I suspect that it takes time for Filebeat to send data to Elasticsearch through Logstash, but I just want to make sure the following is legitimate in a Logstash config file.

if "something-1" in [tags] {
  grok {
  }
}
.
.
.
if "something-2" in [tags] {
  grok {
  }
}
.
.
.
if "something-N" in [tags] {
  grok {
  }
}

I have this set multiple times, i.e. I have multiple if conditions like that. So I was wondering whether that is OK for Logstash, or whether I should use else if instead?

I am not sure how Logstash evaluates multiple if statements: does it stop at the first match, like in programming, or does it check them all and apply every one that matches?

What I want to achieve is that all of those if statements execute, because I have to see all log files (i.e. their log events) in Kibana. If there is a smarter way to do it without IF, I would appreciate some direction.

If an event can contain more than one of the tags then you have to do it that way. If it will only ever contain one of the tags then it will be very slightly more efficient to use else if.
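As a sketch of that else-if form (tag names are placeholders, as in the example above):

```
filter {
  if "something-1" in [tags] {
    grok { }
  } else if "something-2" in [tags] {
    grok { }
  } else if "something-N" in [tags] {
    grok { }
  }
}
```

With else if, Logstash stops evaluating the chain at the first matching tag; with independent ifs, every matching block runs.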

Is the output conditional as well? If not, and you never drop {} events, then every event should be appearing in Kibana.

Thanks Badger for a fast response.

Yes, an event can contain more than one tag; it depends on how many I put on it. The thing is that I have multiple servers with the same services (e.g. syslog) and app logs (e.g. app1).

For system stuff like syslog I have just set tag "syslog" in Filebeat.

By output, do you mean the entire log event in Kibana, or something else?

When I asked about the output being conditional I was asking if you use similar conditions in the output. For example

output {
    if "oneThing" in [tags] {
        elasticsearch { index => "foo" }
    }
    if "anotherThing" in [tags] {
        elasticsearch { index => "bar" }
    }
}

Some folks do things like that and do not realize that if an event has neither oneThing nor anotherThing in tags it is not output anywhere, and is effectively dropped.
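One way to guard against that is a final else that catches everything (the index names here are only examples):

```
output {
    if "oneThing" in [tags] {
        elasticsearch { index => "foo" }
    } else if "anotherThing" in [tags] {
        elasticsearch { index => "bar" }
    } else {
        elasticsearch { index => "everything-else" }
    }
}
```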

Oh, I see. No, I don't use conditionals in the output section, only inside the filter.

I will have one more problem to write up here, but I will do it either later today or tomorrow. It is related to multiple grok patterns inside one pair of square brackets.

Till next message

So, my question or doubt is related to multiple grok patterns under one IF condition.

The thing is that I have a log file containing multiple different log events, but the differences among them are not big; they differ in maybe one field. So is it better to set multiple grok patterns under one IF condition, or should I break it into pieces, i.e. put each pattern under its own IF condition so it doesn't look so crowded?

See example below:

if "app1" in [tags] {
  grok {
    match => ["message", "Pattern1",
                         "Pattern2",
                         "Pattern3"]
  }
}

If the first part of the message is consistent and the differences are towards the end then it might be better to split things into two groks. For example if your messages always start with a timestamp, a program name, a thread id, a log level, and then have variable messages, grok the first 4 fields and then use a second grok to process the variable messages. But you are making me guess what your data looks like. This would be easier if you showed us the logs.
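A sketch of that two-stage approach (the field names and prefix pattern here are guesses, not your actual format):

```
filter {
  # First grok: the consistent prefix, leaving the variable tail in restOfLine.
  grok {
    match => ["message", "%{TIMESTAMP_ISO8601:ts} %{WORD:program} %{NUMBER:thread_id} %{LOGLEVEL:log_level} %{GREEDYDATA:restOfLine}"]
  }
  # Second grok: an array of patterns tried in order against the variable part.
  grok {
    match => ["restOfLine", "Pattern1", "Pattern2", "Pattern3"]
  }
}
```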

Here are examples of log events (slightly modified so I don't expose too much data of the company I work for):

<DEVEL> <app1> 2019-01-30T16:09:12+0000 <debug> GET params: { tid: 'internal_test' }

<DEVEL> <app1> 2019-01-30T16:09:12+0000 <debug> [validator] tid: internal_test

<DEVEL> <app1> 2019-01-30T16:09:12+0000 <debug> [process_data] 0mq message: {"wid":"internal_test","cookie":"154856321077857463500","device_cookie":"154856321077857463500","uid":null,"email":null,"ip":"186.40.130.90","timestamp":16455637391234,"query":{"nuv":"1.3.5","mobile":"iphone","a0":"Boots"},"is_mobile":true,"source":"app1.tracker","sub":{"ip":"186.40.130.90"}}

I would start using dissect

dissect { mapping => { "message" => "<%{env}> <%{appname}> %{ts} <%{loglevel}> %{restOfLine}" } }

Then attack restOfLine with an array of grok patterns.
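For the sample events above, that follow-up grok on restOfLine might look something like this (the patterns are a rough sketch, not tested against your full data):

```
grok {
  match => ["restOfLine",
            "%{WORD:method} params: %{GREEDYDATA:params}",
            "\[%{WORD:action}\] %{GREEDYDATA:msg}"]
}
```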

Ok, thanks

I will use multiple groks under multiple IF conditions for a start. Later, once I adapt to dissect, I'll enhance my config file. I am used to grok since it has the grok debugger.

Also, thanks much for your help! Much appreciated. :slight_smile:

Today another thing jumped out. I was changing the haproxy patterns and put them back to normal, but Logstash started reporting the following problem, which I don't know how to resolve:

logstash | [2019-01-31T15:21:47,818][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-2019.01.31", :_type=>"doc", :routing=>nil}, #<LogStash::Event:0x1008fc91>], :response=>{"index"=>{"_index"=>"logstash-2019.01.31", "_type"=>"doc", "_id"=>"umKApGgBXLYk4dTRcZq2", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [timestamp] of type [date]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"Jan 31 15:21:45\""}}}}}

I have updated the Docker ELK stack to 6.6.0 and tried deleting the index from Kibana, but nothing changes. I don't quite understand which part of my config file is triggering this issue. Can anyone take a look and point me in the right direction?

My Logstash config file looks like the one below:

input {
  beats {
    port => 5044
  }
}

filter {
  if "app0" in [tags] {
    grok {
      match => ["message", "<%{WORD:stack}> <%{WORD:service}> %{TIMESTAMP_ISO8601:timestamp} <%{LOGLEVEL:log_level}> %{WORD:method} %{GREEDYDATA:msg}"]
    }
  }

  if "app1" in [tags] {
    grok {
      match => ["message", "<%{WORD:stack}> <%{WORD:service}> %{TIMESTAMP_ISO8601:timestamp} <%{LOGLEVEL:log_level}> \[%{WORD:action}\] %{GREEDYDATA:message}"]
    }
  }

  if "haproxy" in [tags] {
    grok {
      match => ["message", "%{HAPROXYHTTP}"]
    }
  }

  if "syslog" in [tags] {
    grok {
      match => ["message", "%{SYSLOGLINE}"]
    }
  }

  if "auth" in [tags] {
    grok {
      match => ["message", "%{SYSLOGTIMESTAMP} %{IPORHOST:hostname} %{GREEDYDATA:msg}"]
    }
  }

  if "app3" in [tags] {
    grok {
      match => ["message", "\[%{DATA:stack}\]\[%{DATA:program}\]\[%{DATA:log_level}\] %{GREEDYDATA}"]
    }
  }

  if "app4" in [tags] {
    grok {
      match => ["message", "\[%{DATA:stack}\]\[%{DATA:program}\]\[%{DATA:log_level}\] %{GREEDYDATA}"]
    }
  }

  if "app5" in [tags] {
    grok {
      match => ["message", "\[%{DATA:stack}\]\[%{DATA:program}\]\[%{DATA:log_level}\] %{GREEDYDATA}"]
    }
  }

  if "app6" in [tags] {
    grok {
      match => ["message", "\[%{DATA:stack}\]\[%{DATA:program}\]\[%{DATA:log_level}\] %{GREEDYDATA}"]
    }
  }

  if "app7" in [tags] {
    grok {
      match => ["message", "\[%{DATA:stack}\]\[%{DATA:program}\]\[%{DATA:log_level}\] %{GREEDYDATA}"]
    }
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}

I was just looking at another date format problem, and I would suggest the same approach. What is the format of the timestamp on the documents already in the index? In a test environment, create a new index and add one document with each format. Does changing which one is added first make a difference?

Thanks for the response and the link. I will post that question in the Elasticsearch part of the forum, as you suggested on the other thread. I am a n00b, so I would need more detailed guidance on how to split indexes just because of the date format. This is a bit hairy, and hopefully the Elasticsearch team has a simple solution for me and others with the same issue. Honestly, I don't like that this is happening; it came out of the blue, looks more like a bug than a feature, and it makes the ELK setup a tad difficult.

I managed to resolve the problem together with my boss. Elasticsearch had a mapping collision on the timestamp field the grok patterns created: the index had already mapped it as a date, and not all the formats my patterns produced could be parsed against that mapping.

When we replaced %{TIMESTAMP_ISO8601:timestamp} with %{TIMESTAMP_ISO8601:ts}, everything started to work normally again.
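If you still want that value to drive the event time, a date filter can parse the renamed field into @timestamp (a sketch, assuming the ISO8601 format shown in the sample logs):

```
date {
  match => ["ts", "ISO8601"]
}
```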

I am just not sure why this suddenly happened and not before, because I had %{TIMESTAMP_ISO8601:timestamp} from ELK version 6.2.4 until 6.5.4.

Hopefully this helps someone with a similar issue. I will paste this answer to the other thread too, so maybe a person with the same struggle will be a step forward.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.