Hello there,
I'm running our Elastic Stack and I have a problem with persistent queues, which from time to time fill up for an unknown reason.
Elasticsearch setup:
3 master nodes
12 data nodes, 32 CPU, 32 GB RAM
avg index rate 80K/s
Data volume: 20TB
My logstash setup:
4 logstash nodes, 8 CPU, 16 GB RAM
logstash.yml
path.data: /var/lib/logstash
path.logs: /var/log/logstash
queue.drain: true
config.reload.automatic: true
xpack.monitoring.elasticsearch.hosts: [ "XXX" ]
xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch.username: "logstash_system"
xpack.monitoring.elasticsearch.password: "XXXX"
pipelines.yml
- pipeline.id: signpost-pipeline
  path.config: "/etc/logstash/conf.d/signpost-pipeline.conf"
  queue.type: persisted
  queue.max_bytes: 1gb
- pipeline.id: php-pipeline
  path.config: "/etc/logstash/conf.d/php-pipeline.conf"
  queue.type: persisted
  queue.max_bytes: 10gb
- pipeline.id: json-pipeline
  path.config: "/etc/logstash/conf.d/json-pipeline.conf"
  queue.type: persisted
  queue.max_bytes: 20gb
signpost-pipeline.conf
input {
	beats {
		port => 5045
		id => 'signpost-pipeline'
		client_inactivity_timeout => 120
	}
}
filter {}
output
{
	if 'json' in [tags] {
		pipeline { send_to => "json-pipeline" }
	} else if 'php' in [tags] {
		pipeline { send_to => "php-pipeline" }
	}
}
json-pipeline.conf
input
{
	pipeline { address => "json-pipeline" }
}
filter
{
	json {
		source => 'message'
		skip_on_invalid_json => true
	}
}
output
{
	elasticsearch {
		hosts => [ "XXX" ]
		user => "logstash_internal"
		password => "XXX"
		index => '%{[env][service]}'
		manage_template => false
		action => "create"
	}
}
The PHP pipeline is just a legacy pipeline with a few documents per day, so it causes no trouble.
And here are my questions:
- Is there any way to inspect the queues to debug which service is overwhelming them, or for better debugging in general? (What I check today is in the sketch after this list.)
- Persistent queues definitely decrease throughput because of the disk writes. Is there a throughput difference between a 1 GB queue and a 20 GB queue? Can I increase performance with Kafka, for example?
- Is the elasticsearch output, as I use it, single-threaded, and can it be parallelized to utilize more threads/cores?
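Regarding the first question, this is what I already look at: the Logstash node stats API on each node (a minimal sketch, assuming the default API host/port; the exact nesting of the queue fields may differ between versions):

curl -s 'http://localhost:9600/_node/stats/pipelines?pretty'

There I compare queue_size_in_bytes against max_queue_size_in_bytes per pipeline, but that only tells me which pipeline is full, not which service is filling it.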
 
In general, my main problem is that the Logstash queues are almost full while both the Logstash and Elasticsearch data nodes are about 50% idle (same for IO), there are 0 pending tasks and the thread pool queues are almost empty, and I'm just trying to find the bottleneck (the Elasticsearch-side checks I run are below). Logstash delivers messages to Elasticsearch, but it is unbearably slow.
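For completeness, this is roughly how I checked the Elasticsearch side (a sketch, run against one of the data nodes on the default port, auth omitted; "XXX" is the redacted host as above):

curl -s 'http://XXX:9200/_cat/thread_pool?v'
curl -s 'http://XXX:9200/_cat/pending_tasks?v'

Both the thread pool queues and the pending tasks stay close to zero while the Logstash queues keep growing.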
I appreciate any help I get here.