Index queue fills up after we started using parent/child


We started using parent/child to filter users in our app, which means many children are created per user (thousands, in some cases).

Since we started using it, we have been getting the following error:

[es_rejected_execution_exception] rejected execution of org.elasticsearch.transport.TcpTransport$RequestHandler@376b8dd0 on EsThreadPoolExecutor[index, queue capacity = 400, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@5549c39[Running, pool size = 8, active threads = 8, queued tasks = 400, completed tasks = 315780]]
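
To make the numbers in that message easier to read, here is a minimal Python sketch that pulls the pool name and counters out of the text. The function name `parse_rejection` and the regular expressions are illustrative assumptions written for this one message; they are not part of Elasticsearch or of any client library.

```python
import re

# The rejection message quoted above, copied verbatim.
MESSAGE = (
    "[es_rejected_execution_exception] rejected execution of "
    "org.elasticsearch.transport.TcpTransport$RequestHandler@376b8dd0 on "
    "EsThreadPoolExecutor[index, queue capacity = 400, "
    "org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@5549c39"
    "[Running, pool size = 8, active threads = 8, queued tasks = 400, "
    "completed tasks = 315780]]"
)


def parse_rejection(message):
    """Extract the pool name and the numeric counters from a rejection message.

    Hypothetical helper: it relies only on the "name = number" layout
    visible in the message above.
    """
    pool = re.search(r"EsThreadPoolExecutor\[(\w+),", message)
    counters = {
        key.replace(" ", "_"): int(value)
        for key, value in re.findall(r"([a-z]+(?: [a-z]+)*) = (\d+)", message)
    }
    return {"pool": pool.group(1) if pool else None, **counters}


info = parse_rejection(MESSAGE)
# True when every thread is busy and the queue has no free slot left.
saturated = (
    info["active_threads"] == info["pool_size"]
    and info["queued_tasks"] == info["queue_capacity"]
)
print(info["pool"], saturated)
```

Read this way, the message says that all 8 threads of the `index` pool were busy and all 400 queue slots were taken, so the incoming request was rejected.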

At first it happened rarely, but as the days pass (and the number of documents grows) we are getting this error more and more often.

By default the queue size is 200, so we tried raising it to 400, but that didn't help. I think the problem lies elsewhere, so raising the queue size much further is probably not a good idea.

Checking the metrics, everything looks good except the refresh ratio ( refresh.total_time / ), which has been growing at the same pace as the errors.

We also noticed that the errors do not come at a constant rate, but in waves, every X minutes.

Elasticsearch Version: 5.6.0

Any thoughts? Thanks.

Are you using X-Pack Monitoring to see what the cluster is doing? If not, you should :slight_smile:

Basically, your cluster is overloaded and you need to find out why and where. The stats that Monitoring delivers will be useful for that.
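
If Monitoring is not set up yet, a rough first look at where the rejections happen can come from the node stats API (`GET /_nodes/stats/thread_pool`). The sketch below is only an illustration: the address `http://localhost:9200`, the helper names, and the `SAMPLE` response fragment are assumptions made up for the example, not data from this cluster.

```python
import json
from urllib.request import urlopen

# Assumption: a node is reachable at this address; adjust for your cluster.
ES_URL = "http://localhost:9200"


def fetch_thread_pool_stats(base_url=ES_URL):
    """Fetch per-node thread pool statistics from the node stats API."""
    with urlopen(base_url + "/_nodes/stats/thread_pool") as response:
        return json.load(response)


def index_pool_by_node(stats):
    """Return {node name: (active, queue, rejected)} for the index pool."""
    summary = {}
    for node in stats["nodes"].values():
        pool = node["thread_pool"]["index"]
        summary[node["name"]] = (pool["active"], pool["queue"], pool["rejected"])
    return summary


# Hypothetical response fragment, for illustration only (not real cluster data).
SAMPLE = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "thread_pool": {
                "index": {"threads": 8, "active": 8, "queue": 400, "rejected": 12}
            },
        }
    }
}

print(index_pool_by_node(SAMPLE))
```

Against a live cluster you would call `index_pool_by_node(fetch_thread_pool_stats())` every minute or so; a `rejected` count that climbs on only some nodes, or only at certain times, helps narrow down the "where".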

Thanks for the answer.

Yes, that is what I'm trying to find out :confused:

I haven't used X-Pack yet; I'll give it a try.
