Clarification on Logstash batch_size and memory usage

Hello All,

I have two concerns:

  1. I have a pipeline.yml where I am explicitly setting the batch size to 100, but when I start Logstash it reports 125. Is this expected behavior?

Config:

- pipeline.id: filebeat
path.config: "/opt/logstash/config/filebeat.conf"
pipeline.workers: 1
pipeline.batch.size: 100
queue.type: persisted
path.queue: /opt/logstash/data/queue
queue.max_events: 0
queue.max_bytes: 4096mb
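
For reference, per-pipeline settings in pipelines.yml only take effect when they are indented under their `- pipeline.id` list entry (YAML list-item keys must align). A correctly indented version of the entry above, with the same values (indentation is the only thing changed here):

```yaml
- pipeline.id: filebeat
  path.config: "/opt/logstash/config/filebeat.conf"
  pipeline.workers: 1
  pipeline.batch.size: 100
  queue.type: persisted
  path.queue: /opt/logstash/data/queue
  queue.max_events: 0
  queue.max_bytes: 4096mb
```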

Logstash stats:

curl -XGET 'localhost:9600/_node/stats/pipelines/filebeat?pretty'
{
  "host" : "xxxx",
  "version" : "7.5.0",
  "http_address" : "127.0.0.1:9600",
  "id" : "16bb4a4c-1b21-4ffb-ad26-439eeb84446a",
  "name" : "xxxxx",
  "ephemeral_id" : "e51c630a-761d-4f4e-b074-4866568fadc3",
  "status" : "green",
  "snapshot" : false,
  "pipeline" : {
    "workers" : 2,
    "batch_size" : 125,
    "batch_delay" : 50
  },
  "pipelines" : {
    "filebeat" : {
      "events" : {
        "queue_push_duration_in_millis" : 8989,
        "duration_in_millis" : 74097,
        "out" : 37201,
        "filtered" : 37201,
        "in" : 81258
      },
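
As a quick check, the settings block of that response can be read programmatically. A minimal sketch (the helper function is illustrative; the live-query lines assume the default monitoring API on localhost:9600):

```python
import json

def pipeline_settings(stats):
    """Return (workers, batch_size) from the top-level "pipeline" block of the stats response."""
    p = stats["pipeline"]
    return p["workers"], p["batch_size"]

# To query a live instance (default monitoring API address assumed):
# from urllib.request import urlopen
# stats = json.load(urlopen("http://localhost:9600/_node/stats/pipelines/filebeat"))

# Sample cut down from the response pasted above:
sample = json.loads('{"pipeline": {"workers": 2, "batch_size": 125, "batch_delay": 50}}')
print(pipeline_settings(sample))  # (2, 125)
```

Note that these are the values Logstash is actually running with, which is what makes the mismatch with pipelines.yml visible.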

  2. Where does Logstash use the rest of the memory?
  • I am using cAdvisor to monitor Logstash memory
  • Logstash is running with a 4 GB JVM heap

## JVM configuration

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
-Xms4g
-Xmx4g

  • Logstash stats and docker stats show much lower memory consumption, but cAdvisor shows a lot. I am using queue.type: persisted, so events should not be consuming heap memory.
  • I suppose there would be non-heap memory as well, but that only accounts for 1/2 GB over a long period of time.

curl -XGET 'localhost:9600/_node/stats/jvm?pretty'
{
  "host" : "xxxxx",
  "version" : "7.5.0",
  "http_address" : "127.0.0.1:9600",
  "id" : "16bb4a4c-1b21-4ffb-ad26-439eeb84446a",
  "name" : "xxxxx",
  "ephemeral_id" : "e51c630a-761d-4f4e-b074-4866568fadc3",
  "status" : "green",
  "snapshot" : false,
  "pipeline" : {
    "workers" : 2,
    "batch_size" : 125,
    "batch_delay" : 50
  },
  "jvm" : {
    "threads" : {
      "count" : 39,
      "peak_count" : 43
    },
    "mem" : {
      "heap_used_percent" : 6,
      "heap_committed_in_bytes" : 4277534720,
      "heap_max_in_bytes" : 4277534720,
      "heap_used_in_bytes" : 298714216,
      "non_heap_used_in_bytes" : 184141792,
      "non_heap_committed_in_bytes" : 209522688,
      "pools" : {
        "young" : {
          "peak_used_in_bytes" : 139591680,
          "committed_in_bytes" : 139591680,
          "peak_max_in_bytes" : 139591680,
          "max_in_bytes" : 139591680,
          "used_in_bytes" : 68886720
        },
        "survivor" : {
          "peak_used_in_bytes" : 17432576,
          "committed_in_bytes" : 17432576,
          "peak_max_in_bytes" : 17432576,
          "max_in_bytes" : 17432576,
          "used_in_bytes" : 1483960
        },
        "old" : {
          "peak_used_in_bytes" : 228343536,
          "committed_in_bytes" : 4120510464,
          "peak_max_in_bytes" : 4120510464,
          "max_in_bytes" : 4120510464,
          "used_in_bytes" : 228343536
        }
      }
    },
    "gc" : {
      "collectors" : {
        "young" : {
          "collection_time_in_millis" : 12783,
          "collection_count" : 340
        },
        "old" : {
          "collection_time_in_millis" : 811,
          "collection_count" : 3
        }
      }
    },
    "uptime_in_millis" : 557542
  }
}
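
To put those JVM numbers in context, summing the used heap and non-heap figures gives roughly 460 MiB, in the same ballpark as the ~794 MiB that docker stats reports below. This is just arithmetic on the values pasted above:

```python
# Values copied from the _node/stats/jvm response above (in bytes)
heap_used = 298714216
non_heap_used = 184141792
heap_committed = 4277534720

MIB = 1024 * 1024
jvm_tracked_mib = (heap_used + non_heap_used) / MIB
print(f"used heap + non-heap: {jvm_tracked_mib:.1f} MiB")       # ~460.5 MiB
print(f"committed heap:       {heap_committed / MIB:.1f} MiB")  # ~4079 MiB, the full -Xms4g reservation
```

The gap between what the JVM tracks and what an external monitor sees would come from things the JVM stats do not itemize (thread stacks, direct buffers, memory-mapped files, etc.).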

docker stats:

docker stats xxxxx --no-stream
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
8b12fd8ee990 xxxxx 28.90% 793.6MiB / 11.56GiB 6.71% 159MB / 52.3MB 187MB / 2.78GB 51

cAdvisor: the query used is container_memory_working_set_bytes
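
For reference, cAdvisor derives container_memory_working_set_bytes from the cgroup memory counters as total usage minus inactive file-backed pages, so it can include page cache (for example, from the persisted queue's page files) that neither the JVM stats nor docker stats' MEM USAGE attributes to the process. A minimal sketch of that derivation (the cgroup v1 paths in the comments are assumptions about the host setup):

```python
def working_set_bytes(usage: int, inactive_file: int) -> int:
    # cAdvisor's working set is usage minus inactive file cache, floored at zero
    return max(0, usage - inactive_file)

# To reproduce on a cgroup v1 host (paths are assumptions about the setup):
# usage = int(open("/sys/fs/cgroup/memory/memory.usage_in_bytes").read())
# stat = dict(line.split() for line in open("/sys/fs/cgroup/memory/memory.stat"))
# print(working_set_bytes(usage, int(stat["total_inactive_file"])))

# Hypothetical numbers: 4 GiB of usage with 1 GiB of inactive page cache
print(working_set_bytes(4 * 1024**3, 1 * 1024**3))  # 3221225472 (3 GiB)
```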

Regards
Ashish

No. I suggest you check the indentation in your pipelines.yml (it's plural, right?) to make sure those settings apply to the pipeline id you want them to.

@Badger : The file name is pipelines.yml (Sorry for the typo above)
Here is the config
(screenshot of pipelines.yml attached)

Regards
Ashish

@Badger : Did you see my comments?

Yes, and I cannot explain why that would be happening.