Elasticsearch CPU utilization problem with Logstash

I'm having trouble using Elasticsearch with Logstash.
When inserting data into ES through Logstash (no Filebeat; Elasticsearch and Logstash only), the CPU utilization of the ES process goes over 90%.
I also expected that with multiple Logstash instances, the total insert time would stay the same or decrease compared to a single Logstash. However, my test results show otherwise.
Please advise how to use ES with multiple Logstash instances efficiently.

Below is how I tested.

[TEST ENVIRONMENT]

  • O/S : CentOS 7.9
  • cpu 32 vCore
  • mem 64GB
  • ELK version : OSS 6.8.17
  • Run 1 ES and 2 Logstash instances (on different ports) on the same server

[TEST#1 - using 1 Logstash] - Send 100 MB of data to Logstash

  • takes about 2 minutes to insert the data into ES
  • ES process CPU utilization: 90 ~ 100% (per process, i.e. roughly one full core)
    • (only 100 MB of data, yet the process CPU is maxed out; it seems ES does not use all CPU cores)
  • server-wide CPU idle: 90 ~ 99%
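A quick sanity check on the two numbers above (assuming the 90 ~ 100% figure is per process, as `top` reports it): one pegged core on a 32-vCPU box accounts for only about 3% of total capacity, which is exactly why server-wide idle stays in the 90s.

```shell
# Reconcile per-process CPU (% of one core, top-style) with server-wide
# idle on a 32-vCPU machine: ES pinning a single core barely dents totals.
VCORES=32
ES_CPU=100   # ES process at ~100% of a single core
IDLE=$(awk -v c="$VCORES" -v p="$ES_CPU" 'BEGIN { printf "%.0f", 100 - p / c }')
echo "server-wide idle: ~${IDLE}%"   # ~97%, matching the observed 90 ~ 99%
```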

[TEST#2 - using 2 Logstash] - Send 100 MB of data to each Logstash (200 MB total)

  • takes about 4 minutes to insert the data into ES
    • (each Logstash handles the same amount of data as in TEST#1, but the total run takes twice as long)
  • ES process CPU utilization: 80 ~ 100%
  • server-wide CPU idle: 90 ~ 99%
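For reference, the Logstash-side settings that usually govern bulk indexing throughput, since adding a second Logstash process does not by itself increase ES-side parallelism (a sketch with illustrative values, not the configuration used in these tests):

```yaml
# logstash.yml - settings that commonly govern ES bulk indexing throughput
# (illustrative values, not the configuration used in this test)
pipeline.workers: 16        # defaults to the number of CPU cores
pipeline.batch.size: 1000   # events per bulk request (default 125)
pipeline.batch.delay: 50    # ms to wait before flushing a partial batch
```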

[thread_pool status during testing]

  • Most of the counts are 0.

curl http://localhost:9200/_cat/thread_pool?v

node_name name                active queue rejected
node-01   analyze                  0     0        0
node-01   fetch_shard_started      0     0        0
node-01   fetch_shard_store        0     0        0
node-01   flush                    0     0        0
node-01   force_merge              0     0        0
node-01   generic                  0     0        0
node-01   get                      0     0        0
node-01   index                    0     0        0
node-01   listener                 0     0        0
node-01   management               1     0        0
node-01   refresh                  0     0        0
node-01   search                   0     0        0
node-01   search_throttled         0     0        0
node-01   snapshot                 0     0        0
node-01   warmer                   0     0        0
node-01   write                    0     0        0
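Most pools being idle is itself informative: the CPU is not being spent in the write path. A quick filter that keeps only the non-idle rows of the output above (the busy thread itself can then be inspected with the 6.x `GET /_nodes/hot_threads` API):

```shell
# Print the header plus any thread pool with non-zero active/queue/rejected
# counts; on the output above this leaves only the 'management' pool.
curl -s 'http://localhost:9200/_cat/thread_pool?v' |
  awk 'NR == 1 || $3 + $4 + $5 > 0'
```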

[ES SETTING - elasticsearch.yml]

network.host: ["_local_", "_site_"]
http.port: 9200
transport.port: 9300

bootstrap.memory_lock: true
bootstrap.system_call_filter: false

thread_pool.search.max_queue_size: 10000
thread_pool.bulk.size: 16
thread_pool.bulk.queue_size: 10000
thread_pool.write.size: 32
thread_pool.write.queue_size: 10000

http.max_content_length: 100mb
network.tcp.no_delay: true
network.tcp.keep_alive: true
network.tcp.reuse_address: true
network.tcp.send_buffer_size: 1024mb
network.tcp.receive_buffer_size: 1024mb

Why have you overridden these parameters? These are expert settings, and incorrect values can have adverse effects. I would recommend restoring them to the defaults.
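Concretely, restoring the defaults would leave elasticsearch.yml at roughly this (a sketch keeping only the basic settings from the original post; everything else falls back to the 6.8 defaults):

```yaml
# elasticsearch.yml - expert thread_pool.* and network.tcp.* overrides
# removed; only the basics from the original post remain
network.host: ["_local_", "_site_"]
http.port: 9200
transport.port: 9300

bootstrap.memory_lock: true
bootstrap.system_call_filter: false
```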

Elasticsearch indexing is often very I/O intensive so it is important to verify that your storage is not the bottleneck here. What type of storage are you using? Local SSD? What does disk utilisation and iowait look like during indexing?

As you suggested, I reset the parameters and tested again, but got the same result.

I'm using a local SSD.
Below is the I/O status during the ES test and during a disk stress test.
Based on the stress-test results below, I don't believe disk I/O is the bottleneck during my ES test.

[I/O Status - ES TEST using 2 Logstash]

  • Test condition: send 100 MB of data to each Logstash (200 MB total)
  • Average kB_wrtn/s for sdb: 4000 ~ 8000
    • (occasionally spiking to 10096.00, 20076.00, or 33344.00)
--------------------------------------------------------------------------
# iostat 1
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.92    0.00    0.63    3.14    0.00   92.32

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.00         0.00         0.00          0          0
sdb             194.00         0.00      6596.00          0       6596
--------------------------------------------------------------------------
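These write rates line up with the ingest numbers: 200 MB indexed over roughly 4 minutes is under 1 MB/s of raw input, so even with index, translog, and merge write amplification, the observed 4000 ~ 8000 kB/s is far below what the disk sustained in the stress test. A quick check of the raw-input rate:

```shell
# Raw ingest rate for the 2-Logstash test: 200 MB over ~4 minutes, in kB/s.
awk 'BEGIN { printf "%.0f kB/s\n", 200 * 1024 / (4 * 60) }'   # ~853 kB/s
```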

[I/O Status - Stress Test]

dd if=/dev/sdb of=test.file bs=64M count=10000 oflag=dsync

--------------------------------------------------------------------------
# iostat 1
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.03    0.00    0.97    2.22    0.00   96.78

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               0.00         0.00         0.00          0          0
sdb            1705.00    196612.00    165972.00     196612     165972
...
--------------------------------------------------------------------------
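One caveat on the stress test above: a large-block sequential dd measures streaming throughput, while Elasticsearch issues many small synced writes (e.g. translog fsyncs). A rough small-block probe of the same disk (the target path below is a placeholder; point it at a directory on sdb):

```shell
# Small synchronous writes approximate translog-style I/O far better than
# a 64 MB-block sequential copy. /data/sync.test is a placeholder path.
dd if=/dev/zero of=/data/sync.test bs=4k count=2000 oflag=dsync
rm -f /data/sync.test
```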
