Bulk rejections in my Elasticsearch cluster

We are frequently seeing bulk rejections in our Elasticsearch cluster on the AWS ES service, version 5.5. One way to resolve this is to increase the number of data nodes. However, could we instead introduce Redis as a cache in between to resolve the issue?

Read this blog post and check how many shards you are actively indexing into and how many concurrent indexing threads/processes are in use.

Hi Christian,

Thanks for the reply. I have read the post. One thought came to mind: instead of increasing the number of data nodes, what if we keep the data in a cache so that we don't overload the bulk queue? Would introducing Redis help in that case?

The way you are indexing now seems to overload the cluster so I am not sure how enabling Redis would help. I would suggest reducing the number of shards indexed into or the number of concurrent indexers.
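To illustrate what backing off on the client side could look like, here is a minimal sketch, not a definitive implementation. It assumes a hypothetical `send_bulk` callable that submits a batch and returns the parsed bulk response, and that rejected items show up in that response with status 429. Only the rejected documents are retried, with an exponentially growing pause between attempts.

```python
import time


def rejected_positions(bulk_response):
    """Return positions of items that were rejected with status 429."""
    positions = []
    for i, item in enumerate(bulk_response.get("items", [])):
        # Each item has a single key naming the action (index, create, ...).
        result = next(iter(item.values()))
        if result.get("status") == 429:
            positions.append(i)
    return positions


def index_with_backoff(send_bulk, docs, max_retries=5, base_delay=1.0,
                       sleep=time.sleep):
    """Send docs, retrying only rejected ones with exponential backoff.

    send_bulk is a hypothetical callable: it takes a list of documents and
    returns the parsed bulk response as a dict. Returns the documents that
    were still rejected after all retries.
    """
    pending = list(docs)
    for attempt in range(max_retries + 1):
        if not pending:
            break
        response = send_bulk(pending)
        pending = [pending[i] for i in rejected_positions(response)]
        if pending and attempt < max_retries:
            # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return pending
```

A single indexer that retries like this puts less pressure on the bulk queue than many concurrent indexers that resend immediately.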

How many shards are you actively indexing into? How are you indexing?

Active primary shards: 1291
Active shards: 2582
Data nodes: 4
We index every day. Is there a way in the AWS ES service to change the number of shards being indexed into?

You should be able to use the shrink index API to reduce the number of primary shards of existing indices. You can also change the number of primary shards for any new indices being created through an index template.
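As a rough sketch of those two options, the helpers below only build the request method, path, and body; nothing is sent. The function names, index names, and node name are hypothetical. The sketch assumes the 5.x behaviour where the target shard count must be a factor of the source count and where templates use the `template` key for the index pattern. Whether the managed AWS service exposes these endpoints is something to verify first.

```python
def valid_shrink_targets(source_primaries):
    """Shard counts a shrink can target: proper divisors of the source count."""
    return [n for n in range(1, source_primaries) if source_primaries % n == 0]


def shrink_requests(source, target, target_primaries, node_name):
    """Build (method, path, body) tuples for a shrink; nothing is sent."""
    # Step 1: move a copy of every shard to one node and block writes.
    prepare = {
        "settings": {
            "index.routing.allocation.require._name": node_name,
            "index.blocks.write": True,
        }
    }
    # Step 2: shrink into a new index with fewer primary shards.
    shrink = {"settings": {"index.number_of_shards": target_primaries}}
    return [
        ("PUT", "/%s/_settings" % source, prepare),
        ("POST", "/%s/_shrink/%s" % (source, target), shrink),
    ]


def template_request(name, pattern, primaries):
    """Build a template request that sets the shard count for new indices."""
    # Assumption: 5.x templates match indices through the "template" key.
    body = {"template": pattern, "settings": {"index.number_of_shards": primaries}}
    return ("PUT", "/_template/%s" % name, body)
```

For example, `valid_shrink_targets(8)` gives the shard counts an 8-shard index could be shrunk to, and `template_request("daily", "logs-*", 1)` describes a template that would give each new `logs-*` index a single primary shard.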

Thanks again for your quick reply. My apologies for not mentioning earlier that we are using ES version 5.5. Will I still be able to use the API?

Best Regards
Vandana

I think it is available in that version but you might want to check the documentation.

How can I determine which component is causing this problem? Also, I am facing this issue only in production; my test environment works fine. How can I reproduce this error in the test environment?
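One way to narrow this down might be to look at the bulk thread pool on each node and see where the rejections accumulate. The sketch below assumes the `_cat/thread_pool` API is reachable on the domain without request signing, and that the pool is named `bulk` as in 5.x. The helper names and the `base_url` parameter are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_bulk_pool_stats(base_url):
    """Fetch per-node bulk thread pool stats as a list of dicts."""
    url = (base_url.rstrip("/")
           + "/_cat/thread_pool/bulk"
           + "?format=json&h=node_name,name,active,queue,rejected")
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def rejections_by_node(rows):
    """Return (node_name, rejected) pairs, highest rejection count first."""
    counts = {}
    for row in rows:
        # The cat API reports numbers as strings, so convert explicitly.
        name = row["node_name"]
        counts[name] = counts.get(name, 0) + int(row["rejected"])
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
```

Running `rejections_by_node(fetch_bulk_pool_stats(...))` against both environments under comparable load would show whether rejections are spread across all nodes or concentrated on a few.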

Does your test environment have the same number of shards and the same load as the production system?

Not related but did you look at https://www.elastic.co/cloud and https://aws.amazon.com/marketplace/pp/B01N6YCISK ?

Cloud by Elastic is one way to get access to all features, all managed by us. Think about what is already there, such as Security, Monitoring, Reporting, SQL, Canvas, APM, Logs UI, and Infra UI, and what is coming next :)

The load might be equivalent.
Number of shards in test env: 1884
Number of shards in production: 2749
Number of data nodes in test env: 2
Number of data nodes in production: 4

So if I increase the number of shards in the test env, will I be able to reproduce the error?
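As a quick sanity check on the figures above, shards per data node can be worked out directly. This is just arithmetic on the numbers quoted in this thread; the function name is made up for illustration.

```python
def shards_per_node(total_shards, data_nodes):
    """Average number of shards held by each data node."""
    return total_shards / data_nodes


# Figures quoted above for the two environments.
test_env = shards_per_node(1884, 2)
production = shards_per_node(2749, 4)

print("test env:   %.2f shards per data node" % test_env)
print("production: %.2f shards per data node" % production)
```

By these numbers the test environment already carries more shards per data node than production, so the total shard count on its own may not explain the difference. The number of shards actively indexed into and the number of concurrent indexers are probably worth comparing as well.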
