Bulk API Connection Timeouts and Frequent Long GC Pauses

jwheels · March 2, 2018, 4:40pm

Hi there,

I am having stability issues with a modest single-node ES deployment. The node is a beefy server with 500GB RAM and 72 hyperthreads. Several other data-intensive applications run on the server, so I am aware that resource contention is a potential issue.

According to the _stats API, the "store" size is 947428199 bytes (<1GB). Currently, I have a single producer of data that uses the bulk API to post fewer than 10 documents per second. The producer frequently hits connection errors caused by timeouts on the bulk API. At the moment, no search queries are being executed (so caches should not be at play).
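
For context, each request from the producer looks roughly like the sketch below. The index name "events", the type name "doc", and the sample document are placeholders for illustration, not the real ones. The point is only that every bulk request is a small newline-delimited JSON body.

```python
import json


def build_bulk_body(docs, index, doc_type=None):
    """Serialize docs into the newline-delimited JSON body used by the _bulk endpoint.

    `index` and `doc_type` are caller-supplied; the values used in the usage
    note ("events", "doc") are hypothetical placeholders.
    """
    lines = []
    for doc in docs:
        meta = {"_index": index}
        if doc_type is not None:
            meta["_type"] = doc_type
        # One action line, followed by one source line, per document.
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"
```

Calling build_bulk_body([{"msg": "hello"}], "events", "doc") yields two lines: an "index" action line and the document itself. At under 10 documents per second, these bodies are only a few kilobytes each.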

I see plenty of warnings about GC taking too long in the Elasticsearch logs. I enabled GC logging and indeed there is frequent garbage collection that sometimes takes several seconds to complete. I have the JVM Heap Size set to 5GB (with min = max). Swap is disabled on the system.

I have attached a graph of heap usage over time. GC seems quite effective at bringing heap usage below 50%, but the usage climbs again extremely quickly.
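
For anyone who wants to sample the same numbers without the graph, here is a minimal sketch that reads heap usage and cumulative GC counters from the _nodes/stats/jvm endpoint. The base URL http://localhost:9200 and the function names are assumptions for illustration; adjust them for your own node.

```python
import json
import urllib.request


def summarize_jvm(stats):
    """Pull heap usage and cumulative GC figures per node out of a
    _nodes/stats/jvm response that has already been parsed into a dict."""
    summary = {}
    for node_id, node in stats.get("nodes", {}).items():
        jvm = node["jvm"]
        summary[node_id] = {
            "heap_used_percent": jvm["mem"]["heap_used_percent"],
            # collector name -> (collection count, total collection time in ms)
            "gc": {
                name: (c["collection_count"], c["collection_time_in_millis"])
                for name, c in jvm["gc"]["collectors"].items()
            },
        }
    return summary


def fetch_jvm_stats(base_url="http://localhost:9200", timeout=10):
    """Fetch the raw JVM stats; base_url is an assumed default, not my real host."""
    with urllib.request.urlopen(base_url + "/_nodes/stats/jvm", timeout=timeout) as resp:
        return json.load(resp)
```

Calling summarize_jvm(fetch_jvm_stats()) every few seconds and diffing the counters between samples shows how many collections ran in each interval and how long they took in total.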

Does anyone have any insight as to why the heap size grows so quickly, with such a small dataset?

Thanks,
Josh

system · March 30, 2018, 4:41pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
