Hi,
Our workflow is Filebeat --> Redis --> Logstash --> ES
Our problem: when there is a big influx of logs from the Filebeats, Redis fills up, but Logstash doesn't seem to work any harder, nor does it use its persistent queues.
Observations
Prior to our migration to ELK 7.5.2 with multiple pipelines and persistent queues, we ran ELK 6.8.6 with a single persistent queue and no pipelines (old-school massive logstash.conf mode). With ELK 6.8.x the persistent queue buffered data from Redis fine, and when the influx of logs stopped, Logstash caught up by emptying the persistent queue. Since ELK 7.5.2, we see no usage of the persistent queues in the different pipelines, and our Redis instance fills up while Logstash does not appear to be working any harder.
Note that I see the same number of events in as events out in the Kibana Logstash monitoring dashboard; the two graphs are virtually identical.
Questions:
- How do we find the bottleneck? (Redis consumers? Pipeline workers? The Elasticsearch output?) See the monitoring example just after this list.
- Why does the persistent queue grow when I reduce a pipeline to a single worker, while adding more workers to a pipeline neither drains Redis faster nor fills the persistent queue?
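For the first question, the per-pipeline node stats from Logstash's monitoring API (port 9600) look like the right tool: they report events in/out, queue depth, and per-plugin timings, which should show whether time is spent in the Redis input, the filters, or the Elasticsearch output. For example:

curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'

If a pipeline's queue section stays near empty while Redis grows, the input side is presumably the bottleneck; if the queue grows instead, the filters or the ES output are lagging.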
Architecture
- Server: 4 cores
- Pipelines: 8
- ELK: 7.5.2 across the board
- Each pipeline is built from the same Redis input (with a different key), its own filters, and the same ES output.
pipelines.yml
# Purpose: Catch-all pipeline
- pipeline.id: broker
  path.config: "/usr/share/logstash/pipeline/logstash.conf"
  pipeline.workers: 2
  pipeline.batch.size: 1000
  queue.type: persisted
  queue.max_bytes: 20gb
# Purpose: INFRA
- pipeline.id: infra
  path.config: "/usr/share/logstash/pipeline/10*.conf"
  pipeline.workers: 2
# Purpose: PRODUCTION
- pipeline.id: prod
  path.config: "/usr/share/logstash/pipeline/30*.conf"
  pipeline.workers: 5
  pipeline.batch.size: 1000
  queue.type: persisted
  queue.max_bytes: 20gb
# Purpose: PREprod
- pipeline.id: preprod
  path.config: "/usr/share/logstash/pipeline/50*.conf"
  pipeline.workers: 1
  pipeline.batch.size: 250
  queue.type: persisted
  queue.max_bytes: 5gb
# Purpose: Deal with all AWS events (SQS for RDS/ECS/ASG/...) - no redis
- pipeline.id: aws-events
  path.config: "/usr/share/logstash/pipeline/90*.conf"
  pipeline.workers: 1
# Purpose: Dummy inputs to confirm logstash is processing (for monitoring purposes) - no redis
- pipeline.id: heartbeat
  pipeline.workers: 1
  path.config: "/usr/share/logstash/pipeline/999-heartbeat.conf"
# End.
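Note: queue.type defaults to memory, so unless it is set globally in logstash.yml, the pipelines that don't declare it above (infra, aws-events, heartbeat) run on in-memory queues. If we wanted infra to buffer to disk as well, it would look something like this (sizes illustrative, not tuned):

# Purpose: INFRA (with a persistent queue)
- pipeline.id: infra
  path.config: "/usr/share/logstash/pipeline/10*.conf"
  pipeline.workers: 2
  queue.type: persisted
  queue.max_bytes: 10gb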
Input example
input {
  redis {
    data_type => "list"
    host      => "redis.foobar.com"
    id        => "input_prod_redis"
    key       => "logs_prod"
    port      => 6379
  }
}
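One assumption we are now questioning: the redis input runs a single consumer thread per plugin instance by default, so extra pipeline.workers can only process what that lone thread manages to pull from Redis. That would match what we observe (more workers, no faster Redis drain, empty PQ). The plugin exposes threads and batch_count options; something like this (values illustrative, not tuned):

input {
  redis {
    data_type   => "list"
    host        => "redis.foobar.com"
    id          => "input_prod_redis"
    key         => "logs_prod"
    port        => 6379
    threads     => 4   # more consumer threads reading the same list
    batch_count => 250 # events fetched per Redis round-trip
  }
}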