Hi,
Yes, we have Filebeat as a container sitting inside a pod. I am pointing Filebeat at the nginx access log.
filebeat:
  prospectors:
    # Each - is a prospector. Below are the prospector specific configurations
    -
      paths:
        - /var/nginx/*.log
      exclude_lines: ['^.*\b(itfm-cloud-dc-transformer)\b.*$']
      input_type: log
      # You usually want tail_files: true. Wavefront shows the ingestion timestamp to be when a log line
      # was received, not when it was written to your logfile. For this reason, ingesting back-in-time
      # data will surprise you unless you are expecting that.
      tail_files: true
      # This is important if you have some kind of log rotator. Filebeat won't let go of an open FD
      # of a rotated file without this option.
      close_inactive: 5m
  registry_file: /var/lib/filebeat/registry
You can either add Logstash to the mix or use an Ingest Node pipeline to parse the path. In either case you will use grok to break the message into fields, and this can include parsing the path down to two levels.
If you use Ingest Node, you will add a pipeline option to your Filebeat config whose value is the name of the pipeline that you PUT into Elasticsearch.
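A minimal sketch of what that could look like (the pipeline name access-log-pipeline and the exact grok pattern are illustrative, not from this thread):

```
PUT _ingest/pipeline/access-log-pipeline
{
  "description": "Parse nginx access log lines with grok",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{COMBINEDAPACHELOG}"]
      }
    }
  ]
}
```

Then reference it from Filebeat with output.elasticsearch.pipeline: access-log-pipeline, so every event Filebeat ships is run through that pipeline before indexing.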
Thanks Andrew,
I send logs from Filebeat to Wavefront. In this case Wavefront is where I split the logs into different fields using something like:
pattern: '%{COMBINEDAPACHELOG} %{NUMBER:response}'
So if I were to only send the URLs two levels deep, I would need to do something at the Filebeat level. Any option for that?
Can you customize the Wavefront grok pattern? Filebeat does not have parsing capability.
If you can add a second grok filter that operates on the request field with a pattern like (?<path>(/[^/]+){2}), this should create a path field containing /X/Y.
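To sanity-check that pattern outside of grok, here is a small Python sketch; the sample request paths are made up:

```python
import re

# Same idea as the grok pattern (?<path>(/[^/]+){2}), without the
# named-capture syntax: match exactly two path segments from the start.
two_levels = re.compile(r"(?:/[^/]+){2}")

def first_two_levels(request):
    """Return the first two path levels of a request path, or None."""
    m = two_levels.match(request)
    return m.group(0) if m else None

print(first_two_levels("/api/v1/users/123"))  # /api/v1
print(first_two_levels("/health"))            # None (only one level deep)
```

A request with fewer than two levels simply won't match, so the path field would be absent for those lines.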