Elasticsearch JVM Memory Pressure Issue

Hi,

I am using m4.large.elasticsearch with 2 nodes, each with a 512 GB EBS volume, for a total of 1 TB of disk space. I have set the fielddata cache limit to 40%.
We are continuously experiencing a cluster index blocking issue, which prevents any further writes to new indices.

I can see that JVM memory pressure is continuously above 95%.

Can anyone suggest anything on this?

What is the output of:

GET /_cat/health?v
GET /_cat/indices?v
GET /_cat/shards?v

Hi,
I flushed the cache a few minutes ago. Still no improvement.
It's huge data from GET /_cat/shards?v; however, nothing seems alarming in the outputs. We have set up hourly indices.

It's huge data from GET /_cat/shards?v

That clearly indicates a problem IMO.
Anyway, could you please share the output of the other 2 commands and a few tens of lines from the shards one?

Hi,

Sorry, I had to remove the cluster name.

epoch          timestamp           cluster                    status node.total node.data shards  pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1558109554    09:12:34                                                       2         2  14374 7187    0    0        0             0                  -                100.0%

---------------UUid modified and Index name removed -------

health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open ABC 5 1 61 0 622.9kb 311.4kb
green open CDE 5 1 4429 0

------------index and node name changed --------------

index shard prirep state docs store ip node
temp-1 2 r STARTED 2124 1mb x.x.x.x ABC
temp-2 2 p STARTED 2124 1mb x.x.x.x CDE

Please format your code, logs or configuration files using </> icon as explained in this guide and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.

You probably have too many shards per node.

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

And https://www.elastic.co/webinars/using-rally-to-get-your-elasticsearch-cluster-size-right

Hi David,

Thanks for your suggestions. Is it possible to chat online with you anywhere? I have a few other things to discuss. We have just started using Elasticsearch for centralised log aggregation, so I need a few other points cleared up.

No. It's not. If you are near a conference we support or speak at, you can always try to ask questions there.

We do have a support offering where you can ask to talk with an engineer at Elastic. LMK if you want to know more about this and I'll connect you to sales.

For now, feel free to continue asking here on discuss.elastic.co, as this is the best place to get information and also to share the knowledge with the rest of the community.

OK, thanks David. Let me know if you have any conference scheduled in the near future in the UK. I would like to attend it.

Regarding the queries: my cluster is still not allowing me to write any new indices, as it is blocked by a cluster write block. The only reason I can see is that the JVM memory pressure is too high, at around 95%.

Things I am struggling to figure out:
My cluster has 2 nodes, each with 512 GB of disk space and 8 GB of RAM.
My default settings are 5 primary shards and 1 replica.
When I query a particular index with GET _cat/shards/index_name, it returns 10 records, with allocations such as 2 primary and 2 replica for a certain number of documents, 4p and 4r for another set of documents, 3p and 3r for another set, 1p and 1r for another set, and 0p and 0r for another set.
Why is the allocation like this, as shown below?

Index_name 2 r STARTED 925 612.3kb x.x.x.x replica node name
Index_name 2 p STARTED 925 612.3kb x.x.x.x primary node name 
Index_name 4 r STARTED 910 568.5kb x.x.x.x replica node name
Index_name 4 p STARTED 910 568.5kb x.x.x.x primary node name 
Index_name 3 r STARTED 907 615.1kb x.x.x.x replica node name
Index_name 3 p STARTED 907 615.1kb x.x.x.x primary node name 
Index_name 1 r STARTED 881 481.7kb x.x.x.x replica node name
Index_name 1 p STARTED 881 481.7kb x.x.x.x primary node name 
Index_name 0 r STARTED 920 631.2kb x.x.x.x replica node name
Index_name 0 p STARTED 920 631.2kb x.x.x.x primary node name 

I have read somewhere that fielddata puts a lot of pressure on the JVM heap, so I tried to clear the fielddata cache, but that did not work. Also, the amount of fielddata is not that large. I have updated the fielddata cache size to 40%.

I am actually stuck now, as I cannot decide which configuration I need to update so that writing new indices is enabled for now. I am also looking for a long-term solution for updating the cluster in production.

Do I need to change the number of primary shards?
And how do you determine the heap size given the index sizes above?
Is there any quick check to determine what is contributing to the JVM memory pressure that caused writes to be blocked?

Please let me know which settings would get rid of this problem now.
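
On the "quick check" question: the node stats API (`GET _nodes/stats/jvm`) reports `heap_used_percent` for each node, which shows where the pressure sits, although not what causes it. Below is a minimal Python sketch that scans a response of that shape. The sample payload, the node names, and the 75% threshold are assumptions for illustration only, not values taken from this cluster.

```python
def nodes_over_threshold(stats, threshold_percent=75):
    """Return (node_name, heap_used_percent) pairs above the threshold,
    highest heap usage first."""
    flagged = []
    for node in stats.get("nodes", {}).values():
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used > threshold_percent:
            flagged.append((node["name"], used))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical payload shaped like the response of GET _nodes/stats/jvm
sample_stats = {
    "nodes": {
        "node-id-1": {"name": "node-a", "jvm": {"mem": {"heap_used_percent": 96}}},
        "node-id-2": {"name": "node-b", "jvm": {"mem": {"heap_used_percent": 71}}},
    }
}

print(nodes_over_threshold(sample_stats))
```

With the sample payload above, only `node-a` is reported, since `node-b` is below the assumed threshold.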

If you looked at the resources I pasted, you probably found that 14374 shards for 2 data nodes is way too many.

Keep in mind that we recommend at most:

  • 50 GB per shard
  • 20 shards per GB of heap

Here, with 8 GB of RAM (so maybe 4 GB of heap), you shouldn't have more than 80 shards on a node... You see the problem?
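
To make the arithmetic explicit, here is a small Python sketch of the rule of thumb above. The 4 GB heap is the guess from this post (half of 8 GB of RAM), the shard count comes from the `_cat/health` output, and the function name is purely illustrative.

```python
SHARDS_PER_GB_HEAP = 20  # rule of thumb quoted above


def max_recommended_shards(heap_gb, shards_per_gb=SHARDS_PER_GB_HEAP):
    """Upper bound on shards per node under the rule of thumb."""
    return int(heap_gb * shards_per_gb)


heap_gb_per_node = 4   # assumed: half of the 8 GB of RAM
data_nodes = 2         # node.data from _cat/health
total_shards = 14374   # shards from _cat/health

per_node_limit = max_recommended_shards(heap_gb_per_node)
actual_per_node = total_shards // data_nodes

print(f"recommended max per node: {per_node_limit}")
print(f"actual per node:          {actual_per_node}")
print(f"over the guideline by:    {actual_per_node / per_node_limit:.1f}x")
```

Under these assumptions each node holds 7187 shards against a guideline of 80.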

So some advice:

  • Use 1 shard instead of 5. Some of the numbers you shared show that you are very far from 50 GB, so you can store more within one shard.
  • Use the Shrink API.
  • Use the Rollover API.
  • Use another time period if you are using time-based indices, e.g. one index per month.
  • Remove old indices.
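
As a rough illustration of the first and third suggestions, the Python sketch below builds two request bodies: a legacy index template that sets one primary shard, and a set of rollover conditions. The template name `logs_template`, the pattern `logs-*`, the alias `logs-write`, and the thresholds are all hypothetical, and the exact syntax should be checked against the documentation for the Elasticsearch version in use.

```python
import json

# Body for: PUT _template/logs_template   (hypothetical template name)
index_template = {
    "index_patterns": ["logs-*"],   # hypothetical index pattern
    "settings": {
        "number_of_shards": 1,      # 1 primary instead of the old default of 5
        "number_of_replicas": 1,
    },
}

# Body for: POST /logs-write/_rollover   (hypothetical write alias)
rollover_request = {
    "conditions": {
        "max_age": "30d",    # roll over after a month...
        "max_size": "50gb",  # ...or once the index reaches the size guideline
    }
}

print(json.dumps(index_template, indent=2))
print(json.dumps(rollover_request, indent=2))
```

The template only affects indices created after it is installed; existing indices keep their shard count unless they are shrunk or reindexed.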

Thanks once again David.

I just wanted to know why the default setting is 5 primary shards and 1 replica if too many shards really is a memory issue. How does it contribute to the memory issue? Is there any theory behind this?
Also, will there be any performance problem if I configure 1 shard per node?

Regarding the blocked index writes: is there any temporary fix I can apply in my test cluster to enable writing to the indices again?

That's one of the reasons we changed the default to 1 shard from version 7.0.

Read https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster (which is one of the resources I previously linked to).

Oh? It's a test cluster? Then I'd simply remove old indices.

Thanks a ton David.
I will go through the links and may come up with more queries.
I really appreciate your help.

Hi @dadoonet ,
I have set up the cluster with hourly indices and, as mentioned above, I have 2 nodes with 512 GB of disk space each.

When you say 1 shard per index, that means that if I create 1 primary and 1 replica shard for each index, it will create 48 shards a day with my index pattern.
So, for example, if I want to retain the documents for 30 days, then in 30 days I would have 48 * 30 = 1440 shards before I clean up the indices after 30 days.

Now, 1 GB of JVM memory allows only 20 shards, so my memory can accommodate up to 80 shards on one node, as I have 4 GB of heap available per node. The total of 8 GB of heap can accommodate up to 160 shards, which is far fewer than the 1440 shards in 30 days. Have I understood correctly? If yes, do I need to upgrade my RAM to accommodate this many shards for 30 days, or maybe more if the retention period is 6 months?

Do I need to increase the number of nodes or the RAM? Which one would be better?
I need to suggest something for production, as the size of each index is going to ramp up a lot because the number of users will ramp up in a few days. Can you please suggest something?

Why would you set up and use hourly indices? Given the retention period you mention, it does not seem to make sense. How much data are you expecting to index per day?

Hi @Christian_Dahlqvist

Basically, the hourly index has already been set up in production. The retention period is long because we will have a lot of dashboards in Kibana for various trend analyses.

Currently the indices are small; however, in the next few weeks each index will be around 2 GB, so 24 hours of indexing will come to 48 GB per day.

Therefore, in this case I will end up with 48 shards per day (i.e. 1440 shards in 30 days); however, according to the rule of thumb we should have 20 shards per GB of memory, which can accommodate only 160 shards (4 GB heap size on each node).
This is far fewer than the estimated number of shards in 30 days.

Please suggest a possible solution.

2 GB is quite a small shard, so I would recommend you switch to daily indices with 2 primary shards. If you find it hard to predict data volumes, you can use rollover to create indices of a certain target size rather than have them cover a specific period. This can be managed through ILM.
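
To compare the two layouts, here is a back-of-the-envelope Python sketch. It assumes 1 replica, the 48 GB/day volume mentioned earlier in the thread, and a cluster-wide guideline of 160 shards (2 nodes × 4 GB heap × 20 shards per GB). The function and variable names are illustrative.

```python
def total_shards(indices_per_day, primaries, replicas, retention_days):
    """Shards held in the cluster once the retention window is full."""
    return indices_per_day * primaries * (1 + replicas) * retention_days


CLUSTER_SHARD_GUIDELINE = 2 * 4 * 20  # nodes * heap GB * shards per GB = 160
DAILY_VOLUME_GB = 48                  # figure quoted earlier in the thread

hourly_30d = total_shards(24, primaries=1, replicas=1, retention_days=30)
daily_30d = total_shards(1, primaries=2, replicas=1, retention_days=30)
daily_180d = total_shards(1, primaries=2, replicas=1, retention_days=180)
primary_shard_size_gb = DAILY_VOLUME_GB / 2  # daily index split over 2 primaries

print(f"hourly, 1 primary, 30 days:  {hourly_30d} shards")
print(f"daily, 2 primaries, 30 days: {daily_30d} shards")
print(f"daily, 2 primaries, 180 days: {daily_180d} shards")
print(f"guideline for the cluster:   {CLUSTER_SHARD_GUIDELINE} shards")
print(f"approx. primary shard size:  {primary_shard_size_gb} GB")
```

Under these assumptions, daily indices with 2 primaries stay within the guideline for 30 days of retention, with shards of roughly 24 GB each. A 180-day retention would still exceed it, which suggests more heap, more nodes, or larger rollover-based indices for longer retention.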

Thanks a lot @Christian_Dahlqvist. Do these shards incur any costs, given that I am using Elasticsearch as a service from AWS?
Or is it just the cost of the nodes that I need to pay?

Note that ILM is under the Elastic license so, like many goodies, it's not available in the AWS managed service.

Did you look at https://www.elastic.co/cloud and https://aws.amazon.com/marketplace/pp/B01N6YCISK ?

Cloud by Elastic is one way to have access to all features, all managed by us. Think about what is already there, like Security, Monitoring, Reporting, SQL, Canvas, APM, Logs UI, Infra UI, and what is coming next :)

Thanks @dadoonet and @Christian_Dahlqvist
I will have a look at the links suggested by you.

One thing I want to ask is this: my production cluster is still running fine with 5 primary shards and 1 replica. The memory pressure is around 70-75%.
What I mean is that I haven't experienced any memory problem there yet, unlike my test cluster with the same configuration.
The total number of primary shards has gone up to 5331, which is huge compared to the available heap memory of 4 GB on each node.

Do you think I am going to face the same problem in production very soon, like the one I faced in test, which blocked index writes?

If the recommendation is 20 shards per GB, then it would have blocked index writes by now with so many shards, right?

I just want to understand this before I update the cluster settings in production.