Hi all,
New to Elastic and Logstash. My scenario is a large IT environment, and I would like to collect logs from various sources. First I want to cover the most basic ones, such as Filebeat collecting syslog, plus the common modules for Apache, IIS, etc. It would be great to keep all data relevant and flexible for filtering and enrichment, so I'm considering this flow: Filebeat modules [syslog, apache] -> Logstash main pipeline -> one multi.conf file with input/filter/output logic for all modules, separated by if conditions (I can't get it working):
input {
  beats {
    port => 5044
  }
}
filter {
  if [fileset][name] == "syslog" {
    if [message] =~ /last message repeated [0-9]+ times/ {
      drop { }
    }
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      # one add_field hash instead of two duplicate options;
      # Beats sends host as an object, so reference [host][name]
      add_field => {
        "received_at"   => "%{@timestamp}"
        "received_from" => "%{[host][name]}"
      }
    }
    syslog_pri { }
    date {
      # two spaces before "d" to handle space-padded single-digit days
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
    mutate {
      add_tag => ["syslog"]
    }
  }
}
filter {
  if [fileset][name] == "apache" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}" }
      overwrite => [ "message" ]
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
    mutate {
      add_tag => ["apache"]
    }
  }
}
filter {
  if [fileset][name] == "nginx" {
    grok {
      # nginx's default "combined" access-log format matches the Apache combined pattern
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
    mutate {
      add_tag => ["nginx"]
    }
  }
}
output {
  if "syslog" in [tags] {
    elasticsearch {
      hosts => "https://XXXXXXXXXXXXX"
      index => "syslog-%{+YYYY.MM.dd}-%{[@metadata][version]}"
    }
  }
  if "apache" in [tags] {
    elasticsearch {
      hosts => "https://XXXXXXXXXXXXX"
      index => "apache-%{+YYYY.MM.dd}-%{[@metadata][version]}"
    }
  }
  if "nginx" in [tags] {
    elasticsearch {
      hosts => "https://XXXXXXXXXXXXX"
      index => "nginx-%{+YYYY.MM.dd}-%{[@metadata][version]}"
    }
  }
}
OR: separate conf files in Logstash — one for the Beats input, several filter files (10-syslog-filter.conf, 11-apache-filter.conf), and an output file with several if statements that separate the flow based on tags (syslog, apache) and write to separate indexes (syslog-{date}, apache-{date}, etc.).
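A sketch of what I mean by that layout, assuming the default path.config of /etc/logstash/conf.d (file names are just examples — as far as I understand, Logstash concatenates all files there in lexical order into one pipeline config):

```
/etc/logstash/conf.d/
├── 02-beats-input.conf            # input { beats { port => 5044 } }
├── 10-syslog-filter.conf          # filter { if [fileset][name] == "syslog" { ... } }
├── 11-apache-filter.conf          # filter { if [fileset][name] == "apache" { ... } }
├── 12-nginx-filter.conf           # filter { if [fileset][name] == "nginx" { ... } }
└── 30-elasticsearch-output.conf   # output { if "syslog" in [tags] { ... } ... }
```

The numeric prefixes only control concatenation order; functionally this should be the same as the single multi.conf above.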
So my first question is: which is the better way to separate the different filters and logic — one big multi.conf file or separate files? Also, should I use multiple pipelines for this scenario, or is a single pipeline with one or more configuration files sufficient for a Beats-input setup to begin with? I've read most of the documentation but couldn't find an example or "best practice" for a similar scenario.
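The closest thing I found for the multiple-pipelines option is the "distributor" pattern: an intake pipeline owns the Beats port and forwards events to per-source pipelines via pipeline-to-pipeline outputs. Roughly like this in pipelines.yml (pipeline ids and the inlined config are my own sketch, not from a working setup):

```
# pipelines.yml — rough sketch of the distributor layout
- pipeline.id: beats-intake
  config.string: |
    input { beats { port => 5044 } }
    output {
      if [fileset][name] == "apache" { pipeline { send_to => ["apache"] } }
      else { pipeline { send_to => ["syslog"] } }
    }
- pipeline.id: syslog-processing
  config.string: |
    input { pipeline { address => "syslog" } }
    # ... syslog filters + elasticsearch output ...
- pipeline.id: apache-processing
  config.string: |
    input { pipeline { address => "apache" } }
    # ... apache filters + elasticsearch output ...
```

That keeps one open port while giving each source its own queue and workers — but whether that's overkill for a starter setup is exactly what I'm asking.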
Thanks in advance,
Hristo.