I have 5 nodes in my ELK cluster and want to boost the indexing speed. According to iostat, CPU %idle is above 50% on most of the nodes and above 80% on some of them, which means the CPUs are not fully utilized. Is there any parameter I can set to increase utilization?
Indexing is very I/O intensive, so it is often limited by disk and/or network performance rather than CPU. What is the hardware specification of your nodes? What is the average size of your documents? What average indexing rate are you achieving? Have you followed these guidelines?
Thanks, I will try that. The iowait on most of the nodes is less than 5%. Can I conclude that the I/O throughput right now is well below what it could be?
To answer your questions, here are more details:
Target log size : 36GB per day
Number of log lines: 543414248
Document size: 71 bytes
Node 1: CPU: i7-7700 RAM: 32GB SSD: 1TB
Node 2: CPU: i5-8400 RAM: 48GB SSD: 1TB
Node 3: CPU: i7-8700 RAM: 32GB SSD: 1TB
Node 4: CPU: i7-7700 RAM: 32GB SSD: 1TB
Node 5: CPU: i9-9900 RAM: 48GB SSD: 1TB
Node 1: CPU: i5-6600K RAM: 24GB SSD: 1TB
Node 2: CPU: i7-4770 RAM: 16GB SSD: 1TB
Our current indexing rate is around 3400 documents per second, so we can ingest around 300 million log lines per day.
I hope we can reach around 6000 documents per second.
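As a rough sanity check on these numbers, the arithmetic can be sketched in Python. The variable names are only illustrative, and the sketch assumes the 36 GB figure means binary gigabytes (1024^3 bytes) and that the line count is a per-day total.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the post above.
lines_per_day = 543_414_248
current_rate = 3400                  # documents per second

# Assumption: "36GB" means 36 * 1024^3 bytes.
log_bytes_per_day = 36 * 1024**3

# Rate needed to keep up with one day of logs within one day.
required_rate = lines_per_day / SECONDS_PER_DAY

# How many lines the current rate can ingest in one day.
current_capacity = current_rate * SECONDS_PER_DAY

# Average raw size of one log line.
avg_doc_bytes = log_bytes_per_day / lines_per_day

print(f"required rate:    {required_rate:.0f} docs/s")
print(f"current capacity: {current_capacity:,} lines/day")
print(f"average doc size: {avg_doc_bytes:.1f} bytes")
```

Under those assumptions the required rate comes out a little under 6300 documents per second, so a target of around 6000 is close to the minimum needed to keep up with the daily volume.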
We have already implemented some of the suggestions from the guidelines:
- 2 workers on filebeat to send data to ES
- Disable swapping
- Give memory to the filesystem cache (we allocated half of the memory to ES and left the other half for the system)
- Use auto-generated ids
- Indexing buffer size has been set to 30%
But most of the time the CPU utilization of the ES nodes is still not balanced. Two or three of them are around 60% and the others are around 10%.
I would like to fully utilize them but don't know how.
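One thing that might be worth checking, offered as an assumption and not something established in this thread, is whether the shards of the index being written to are spread evenly across the nodes. The sketch below counts started shards per node from rows shaped like the JSON output of `GET _cat/shards?format=json`. The function name, index name, and node names are hypothetical sample data, not values from this cluster.

```python
from collections import Counter


def shards_per_node(shard_rows, index_name):
    """Count started shards (primaries and replicas) per node for one index.

    shard_rows: list of dicts as returned by _cat/shards?format=json,
    using the default columns "index", "prirep", "state" and "node".
    """
    counts = Counter()
    for row in shard_rows:
        if row.get("index") == index_name and row.get("state") == "STARTED":
            counts[row["node"]] += 1
    return dict(counts)


# Hypothetical sample: 3 primaries and 3 replicas that ended up on 3 of 5 nodes.
sample_rows = [
    {"index": "logs-today", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-1"},
    {"index": "logs-today", "shard": "0", "prirep": "r", "state": "STARTED", "node": "node-2"},
    {"index": "logs-today", "shard": "1", "prirep": "p", "state": "STARTED", "node": "node-2"},
    {"index": "logs-today", "shard": "1", "prirep": "r", "state": "STARTED", "node": "node-3"},
    {"index": "logs-today", "shard": "2", "prirep": "p", "state": "STARTED", "node": "node-3"},
    {"index": "logs-today", "shard": "2", "prirep": "r", "state": "STARTED", "node": "node-1"},
    {"index": "logs-old", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-4"},
]

distribution = shards_per_node(sample_rows, "logs-today")
print(distribution)
```

In this made-up sample only three of the five nodes hold shards of the active index, which is the kind of layout that could leave the remaining nodes mostly idle during indexing.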