Cluster (ES 5.2) performance degrading after indexing

dinnot · May 8, 2017, 10:00am

Hi everybody!

We are experiencing some problems with our clusters. Each night a process sends updates to our data through Logstash (so it hits the bulk API). The updates are a different size every night, but they can reach a few million documents (the documents are quite small, <1kb). In the morning we find most of the nodes at 74% memory usage (so just below the threshold for GC) and the cluster is really slow (average search time goes from <0.5s to a few seconds). Our test cluster runs on AWS m4.large machines and our prod cluster on AWS m4.xlarge machines, and we see the same problem on both. We've also seen it both with a 50% heap size and with smaller values. Restarting the ES process always fixes the problem until the next indexing period. As our clusters are not live yet, we don't have regular search queries, but I don't see how that would be a problem. As far as I am aware, we are running with standard settings.

Any advice will be really appreciated.

Thank you,
Sorin

jpountz · May 9, 2017, 9:20am

I suspect that you are sending requests to Elasticsearch that are too large, which causes Netty (the framework we use for networking) to create large memory buffers that are not reclaimed the same way as regular JVM memory. I think you would benefit from using smaller bulk sizes. Maybe try something like 10k docs per bulk request?
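
For anyone building bulk requests by hand rather than through Logstash, here is a minimal sketch of capping the number of documents per request. It only builds the newline-delimited bodies and does not send them. The names `chunked` and `bulk_body`, the index `my-index`, and the type `doc` are hypothetical and chosen for illustration only.

```python
import json
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def chunked(docs: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def bulk_body(batch: List[Dict], index: str, doc_type: str) -> str:
    """Build the newline-delimited JSON body for one _bulk request.

    Each document is preceded by an "index" action line. The index and
    type names passed in are placeholders, not values from this thread.
    """
    lines = []
    for doc in batch:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical stream of small documents.
    docs = ({"id": i, "value": "x"} for i in range(25_000))
    for batch in chunked(docs, 10_000):
        body = bulk_body(batch, "my-index", "doc")
        print(len(batch), "docs,", len(body.encode("utf-8")), "bytes")
```

Each body would then be sent as one bulk request, so the size of any single request is bounded by the chunk size instead of by the size of the nightly update.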

dinnot · May 9, 2017, 9:42am

Thanks for the reply! At the moment we use the following pipeline setting for logstash:
Starting pipeline {"id"=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>1000, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>4000}
We have not set flush_size on the elasticsearch output, so I'd guess the bulk size we call with is 1k. One thing we do use is the http_compression option, which sends the requests gzipped. I guess that could use more memory, depending on the implementation in Netty. Another thing I've noticed is that after about 100k searches following indexing, the search query time returns to the same range as before indexing. Any more suggestions and ideas would really be appreciated.
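
As a rough sanity check on those numbers, the sketch below works out the upper bounds implied by the pipeline settings. It assumes each bulk request carries at most one pipeline batch and that every document is at most 1 KiB (from the "<1kb" figure earlier in the thread); both are assumptions, and the bulk action lines and gzip compression are ignored. The function names are hypothetical.

```python
# Values taken from the "Starting pipeline" log line in this post.
PIPELINE_WORKERS = 4
PIPELINE_BATCH_SIZE = 1000

# Assumption: no document is larger than 1 KiB.
MAX_DOC_BYTES = 1024


def max_inflight_events(workers: int, batch_size: int) -> int:
    """Events that can be in flight at once: one batch per worker."""
    return workers * batch_size


def max_batch_source_bytes(batch_size: int, max_doc_bytes: int) -> int:
    """Upper bound on the document payload of a single batch."""
    return batch_size * max_doc_bytes


if __name__ == "__main__":
    inflight = max_inflight_events(PIPELINE_WORKERS, PIPELINE_BATCH_SIZE)
    per_batch = max_batch_source_bytes(PIPELINE_BATCH_SIZE, MAX_DOC_BYTES)
    print("max in-flight events:", inflight)
    print("max document bytes per batch:", per_batch)
```

Under these assumptions the in-flight figure matches the pipeline.max_inflight value in the log line, and a single batch carries roughly 1 MB of document data before compression.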

