Hi, I am using Logstash to parse IIS logs and to ingest some system metrics from my servers: Filebeat ships the logs and Metricbeat ships the metrics. To make it easier to add new data sources later, I decided to split the work into multiple pipelines with pipeline-to-pipeline communication, following the distributor pattern, with this configuration (pipelines.yml):
- pipeline.id: main
  path.config: "/etc/logstash/main_pipeline.conf"
  pipeline.workers: 1
- pipeline.id: filebeat
  path.config: "/etc/logstash/filebeat_pipeline.conf"
  pipeline.workers: 6
- pipeline.id: metricbeat
  path.config: "/etc/logstash/metricbeat_pipeline.conf"
  pipeline.workers: 1
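For completeness: I have not set any per-pipeline batch or queue options, so those are all at their defaults. As an illustration of what I could also tune (the setting names are from the pipelines.yml settings reference, but the values below are just guesses, not something I have tested):

```yaml
# Hypothetical variant of the filebeat entry with explicit batch/queue settings
- pipeline.id: filebeat
  path.config: "/etc/logstash/filebeat_pipeline.conf"
  pipeline.workers: 6
  pipeline.batch.size: 250   # events per worker batch (default is 125)
  queue.type: memory         # the default in-memory queue
```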
My problem is that throughput dropped from about 10,000 docs/sec with one big pipeline to roughly 500 docs/sec with multiple pipelines, even when I change the number of workers allocated to each pipeline.
I don't know what I'm doing wrong, and I can't find any explanation or even a lead to investigate.
Can someone please help me?
The configuration of the single big pipeline is:
- pipeline.id: All
  path.config: "/etc/logstash/AllInOne_pipeline.conf"
  pipeline.workers: 8
In both setups, I'm running Logstash with a 2 GB heap.
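Concretely, the heap is set in jvm.options with the standard JVM flags (nothing custom beyond the size):

```
-Xms2g
-Xmx2g
```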
Here is the configuration of each pipeline:
Main:
input {
  beats {
    port => 5044
  }
}
filter {
  # Save some fields, remove the other ones, then restore the saved ones
  # (no mapping change needed)
  mutate {
    add_field => {
      "[aux][id]"       => "%{[agent][id]}"
      "[aux][type]"     => "%{[agent][type]}"
      "[aux][hostname]" => "%{[agent][hostname]}"
    }
  }
  prune {
    blacklist_names => ["^ecs", "^host", "^metricset", "^agent", "^event"]
  }
  mutate {
    add_field => {
      "[agent][id]"       => "%{[aux][id]}"
      "[agent][type]"     => "%{[aux][type]}"
      "[agent][hostname]" => "%{[aux][hostname]}"
    }
  }
  prune {
    blacklist_names => ["^aux"]
  }
}
output {
  if [agent][type] == "metricbeat" {
    pipeline {
      send_to => metricbeat
    }
  } else if [agent][type] == "filebeat" {
    pipeline {
      send_to => filebeat
    }
  }
}
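In case it matters: I left the pipeline output at its defaults. As far as I understand from the pipeline-to-pipeline documentation, it also accepts an ensure_delivery flag (true by default), so a busy downstream pipeline back-pressures the main one. A sketch of making that explicit (this is just the default spelled out, not something I changed):

```
pipeline {
  send_to => metricbeat
  ensure_delivery => true  # default: main blocks if the metricbeat pipeline is unavailable
}
```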
Filebeat:
input {
  pipeline {
    address => filebeat
  }
}
filter {
  if [message] =~ "^#" {
    drop { }
  }
  grok {
    match => { "message" => *grok expression, working fine* }
    remove_field => ["message"]
  }
  date {
    match => [ "Timestamp", "yyyy-MM-dd HH:mm:ss" ]
    timezone => "UTC"
  }
}
output {
  elasticsearch {
    hosts => ["xx.xx.xx.xx:9200"]
    index => "iis-%{+yyyy.MM.dd}"
    template_name => "iis"
  }
}
Metricbeat:
input {
  pipeline {
    address => metricbeat
  }
}
filter {
  mutate {
    add_field => {
      "[aux1]" => "%{[event][dataset]}"
      "[aux2]" => "%{[event][module]}"
    }
  }
  prune {
    blacklist_names => ["^event"]
  }
  mutate {
    add_field => {
      "[event][dataset]" => "%{[aux1]}"
      "[event][module]"  => "%{[aux2]}"
    }
  }
  prune {
    blacklist_names => ["^aux"]
  }
}
output {
  elasticsearch {
    hosts => ["xx.xx.xx.xx:9200"]
    index => "metricbeat-%{+xxxx.ww}"
    template_name => "metricbeat-7.0.0"
  }
}
All:
input {
  beats {
    port => 5044
  }
}
filter {
  # Save some fields, remove the other ones, then restore the saved ones
  # (same mapping)
  mutate {
    add_field => {
      "[aux][id]"       => "%{[agent][id]}"
      "[aux][type]"     => "%{[agent][type]}"
      "[aux][hostname]" => "%{[agent][hostname]}"
    }
  }
  prune {
    blacklist_names => ["^ecs", "^host", "^metricset", "^agent", "^event"]
  }
  mutate {
    add_field => {
      "[agent][id]"       => "%{[aux][id]}"
      "[agent][type]"     => "%{[aux][type]}"
      "[agent][hostname]" => "%{[aux][hostname]}"
    }
  }
  prune {
    blacklist_names => ["^aux"]
  }
  if [agent][type] == "metricbeat" {
    mutate {
      add_field => {
        "[aux1]" => "%{[event][dataset]}"
        "[aux2]" => "%{[event][module]}"
      }
    }
    prune {
      blacklist_names => ["^event"]
    }
    mutate {
      add_field => {
        "[event][dataset]" => "%{[aux1]}"
        "[event][module]"  => "%{[aux2]}"
      }
    }
    prune {
      blacklist_names => ["^aux"]
    }
  } else if [agent][type] == "filebeat" {
    if [message] =~ "^#" {
      drop { }
    }
    grok {
      match => { "message" => *same grok expression, working fine* }
      remove_field => ["message"]
    }
    date {
      match => [ "Timestamp", "yyyy-MM-dd HH:mm:ss" ]
      timezone => "UTC"
    }
  }
}
output {
  if [agent][type] == "metricbeat" {
    elasticsearch {
      hosts => ["xx.xx.xx.xx:9200"]
      index => "metricbeat-%{+xxxx.ww}"
      template_name => "metricbeat-7.0.0"
    }
  } else if [agent][type] == "filebeat" {
    elasticsearch {
      hosts => ["xx.xx.xx.xx:9200"]
      index => "iis-%{+yyyy.MM.dd}"
      template_name => "iis"
    }
  }
}
Finally, the logstash.yml:
path.data: /var/lib/logstash
path.logs: /var/log/logstash
I'm running on Debian 9.8 with 4 GB RAM and 8 cores.
(Sorry for my bad English.)