Hi,
Yes, we have Filebeat as a container sitting inside a pod. I am pointing Filebeat at the nginx access log.
filebeat:
  prospectors:
    # Each - is a prospector. Below are the prospector specific configurations
    -
      paths:
        - /var/nginx/*.log
      exclude_lines: ['^.*\b(itfm-cloud-dc-transformer)\b.*$']
      input_type: log
      # You usually want tail_files: true. Wavefront shows the ingestion timestamp to be when a log line
      # was received, not when it was written to your logfile. For this reason, ingesting back-in-time
      # data will surprise you unless you are expecting that.
      tail_files: true
      # This is important if you have some kind of log rotator. Filebeat won't let go of an open FD
      # of a rotated file without this option.
      close_inactive: 5m
  registry_file: /var/lib/filebeat/registry
You can either add Logstash to the mix or use an Ingest Node pipeline to parse the path. In either case you will use grok to break the message into fields, and this can include parsing the path down to two levels.
If you use Ingest Node, you will add a pipeline option to your Filebeat config whose value is the name of the pipeline that you PUT into Elasticsearch.
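A minimal sketch of what that could look like (the pipeline name access-log-pipeline and the exact grok pattern are illustrative, not from this thread):

```
PUT _ingest/pipeline/access-log-pipeline
{
  "description": "Parse nginx access log lines with grok",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{COMBINEDAPACHELOG}"]
      }
    }
  ]
}
```

Then reference it from Filebeat with output.elasticsearch.pipeline: access-log-pipeline, so every event Filebeat ships is run through that pipeline before indexing.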
Thanks Andrew,
I send logs from Filebeat to Wavefront. In this case Wavefront is where I split the logs into different fields using something like:
pattern: '%{COMBINEDAPACHELOG} %{NUMBER:response}'
So if I were to only send the URLs two levels deep, I would need to do something at the Filebeat level. Any option for that?
Can you customize the Wavefront grok pattern? Filebeat does not have parsing capability.
If you can add a second grok filter that operates on the request field with a pattern like (?<path>(/[^/]+){2}), this should create a path field containing /X/Y.
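To sanity-check that pattern outside of grok, here is a small Python sketch; the sample request paths are made up:

```python
import re

# Same idea as the grok pattern (?<path>(/[^/]+){2}), without the
# named-capture syntax: match exactly two path segments from the start.
two_levels = re.compile(r"(?:/[^/]+){2}")

def first_two_levels(request):
    """Return the first two path levels of a request path, or None."""
    m = two_levels.match(request)
    return m.group(0) if m else None

print(first_two_levels("/api/v1/users/123"))  # /api/v1
print(first_two_levels("/health"))            # None (only one level deep)
```

A request with fewer than two levels simply won't match, so the path field would be absent for those lines.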