[5.3] Heap details & collection


#1

Hello,

I tried to measure the impact of our data model on the heap (in terms of non-collectable objects). Here is what I did:

  • Wait a long time after the last insertion (so that segments.index_writer_memory is back to zero);
  • for each type of index (some are rolled daily, others monthly):
    1. get the memory footprint of all open indices using the Elasticsearch API (see the curl command below);
    2. force a full GC;
    3. measure the size of the CMS Old gen and the memory footprint of the open indices;
    4. close some indices;
    5. repeat from step 2.
Curl command

HEADER="index,docs.count,memory.total,segments.memory,segments.index_writer_memory,segments.version_map_memory,segments.fixed_bitset_memory,fielddata.memory_size,query_cache.memory_size";
curl -XGET "http://${HOST}:${PORT}/_cat/indices?h=$HEADER&s=index&bytes=b"
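
To compare the reported footprint with the Old gen size, the memory.total column can be summed per index type. This is a minimal sketch, assuming the output has no header row (no `v` flag) and that memory.total is the third column, as in HEADER above. The function name sum_memory_total and the sample rows are made up for illustration.

```shell
# Sum the memory.total column (3rd field of HEADER) for all indices whose
# name starts with the given prefix. Reads _cat/indices output on stdin.
sum_memory_total() {
  awk -v p="$1" 'index($1, p) == 1 { total += $3 } END { printf "%.0f\n", total }'
}

# Hypothetical sample rows in HEADER order, values in bytes (not real data).
sample='type1_daily_rolled-a 100 1000 900 0 0 100 0 0
type1_daily_rolled-b 50 500 450 0 0 50 0 0
type2_daily_rolled-a 10 200 200 0 0 0 0 0'

printf '%s\n' "$sample" | sum_memory_total type1_daily_rolled

# Against a live cluster, the same function could be fed by the curl command:
#   curl -s "http://${HOST}:${PORT}/_cat/indices?h=$HEADER&s=index&bytes=b" \
#     | sum_memory_total type1_daily_rolled
```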

My cluster is quite simple: 2 data/master nodes, each with 64GB of RAM and 31GB of heap.
There are currently 10,000,000,000 documents in my database, with the following distribution:

  • type1_daily_rolled : 6.4G docs, 2.3TB in store (including replica), 6.5GB in memory (including replica, using the given command);
  • type2_daily_rolled : 2G docs, 600GB in store, 1.9GB in memory;
  • type1_monthly_rolled : 33M docs, 3.7GB in store, 18MB in memory;
  • type2_monthly_rolled : 19M docs, 3GB in store, 40MB in memory;
  • type3_monthly_rolled : 1.3M docs, 170MB in store, 1MB in memory;
  • type4_monthly_rolled : 1.1M docs, 1.6GB in store, 6MB in memory;
  • type5_monthly_rolled : 316k docs, 700MB in store, 4MB in memory;
  • type6_monthly_rolled : 12k docs, 3MB in store, 36kB in memory.

Most fields are indexed. Since the first two types account for most of the space, I'll focus on them.

With these values and a factor of 1.5, I expected the CMS Old gen to be under 10GB on both nodes when there is no activity. However, after a full GC, I am stuck at 17GB of Old gen on both nodes.
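
The estimate can be written out as a small calculation. This is only a sketch: it assumes the "in memory" figures listed above are cluster-wide totals split evenly across the two nodes, and it uses decimal units (1GB = 1000MB). The function name expected_old_gen_gb is made up for illustration.

```shell
# Expected per-node footprint in GB: (sum of memory figures) * factor / nodes.
# $1 = factor, $2 = number of data nodes; one figure in GB per line on stdin.
expected_old_gen_gb() {
  awk -v f="$1" -v n="$2" '{ sum += $1 } END { printf "%.2f\n", sum * f / n }'
}

# Figures from the list above, converted to GB:
# 6.5GB, 1.9GB, 18MB, 40MB, 1MB, 6MB, 4MB, 36kB
printf '%s\n' 6.5 1.9 0.018 0.040 0.001 0.006 0.004 0.000036 \
  | expected_old_gen_gb 1.5 2   # prints 6.35
```

Under these assumptions the expected value is about 6.35GB per node, or 12.70GB if all of it ended up on a single node.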

I have 8 days of data for each daily-rolled index type. I started closing the type1_daily_rolled indices one day at a time (the data is spread fairly evenly across days). The CMS Old gen values were read from JConsole and are approximate. Here is what happened when closing the indices:

  1. Start : CMS Old gen = 17GB
  2. Closing 1st day --> 16.5GB
  3. 2nd day --> 15.9GB
  4. 3rd day --> 15.4GB
  5. 4th day --> 14.5GB
  6. 5th day --> 13.7GB
  7. 6th day --> 9.3GB
  8. 7th day --> 7GB
  9. 8th day --> 5.7GB
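
The space freed by each step is easier to see as differences between consecutive readings. A minimal sketch using the approximate JConsole values listed above (the function name old_gen_deltas is made up for illustration):

```shell
# Print the Old gen space freed by each closing step.
# Input: one reading in GB per line, starting with the initial value.
old_gen_deltas() {
  awk 'NR > 1 { printf "day %d: %.1f GB freed\n", NR - 1, prev - $1 } { prev = $1 }'
}

printf '%s\n' 17 16.5 15.9 15.4 14.5 13.7 9.3 7 5.7 | old_gen_deltas
```

The first five days free between 0.5GB and 0.9GB each, while closing the 6th day frees 4.4GB, the largest single drop.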

Since I could not explain the huge gap between the 4th and the 6th day, I re-opened the indices. In the end, the CMS Old gen settled at 10GB.

By closing and re-opening other indices, I reclaimed some more space from the Old gen.

I tried to make the indices read-only (index.blocks.write=true, with version 5.3 of Elasticsearch), and it had no impact on the Old gen.
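
For reference, a write block of this kind can be applied through the index settings API. The sketch below assumes HOST and PORT are set as for the curl command earlier in this post; the helper names write_block_body and set_write_block are made up for illustration, and nothing is sent until set_write_block is called.

```shell
# Build the JSON settings body for the write block ("true" or "false").
write_block_body() {
  printf '{"index.blocks.write": %s}\n' "$1"
}

# Apply the block to one index: $1 = index name, $2 = true|false.
set_write_block() {
  curl -s -XPUT "http://${HOST}:${PORT}/$1/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(write_block_body "$2")"
}

# Example call (the index name is a placeholder):
#   set_write_block my_index_name true
```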

Here are my questions :

  1. Is my way of measuring heap space inappropriate?
  2. What could be kept from being collected as long as an index is not closed?
  3. Is there a way to reclaim this space other than doing a close/open sequence on the index?

If you have any ideas or recommendations, I would be happy to read them :)


(Mark Walkom) #2

Just so it's clear to us, what are you aiming to achieve with all this analysis?


#3

I have limited resources. I want to keep the daily-rolled indices for two weeks (the 8 days of data represent 2 weeks). Before going live, I would like to be sure that performance is still good when dealing with 2 weeks of data (currently not the case).


(Mark Walkom) #4

What are the resources you have?
How much data are you indexing?


#5

In production: 1 dedicated master node, 2 data nodes (64GB of RAM and 31GB of heap). I don't think CPU specifications would give you any hint as to why part of the CMS Old gen is not collected, would they?

For the data :

  • type1_daily_rolled:
    • 20 fields, 18 of them indexed (only date, keyword, and numeric types are used; the 2 fields that are not indexed are arrays);
    • 3G documents to index in a week (5 days), with an average of 17k docs/s during business hours (around 8 hours a day);
    • 40 to 60 indices created daily;
  • type2_daily_rolled:
    • 16 fields, 14 of them indexed (only date, keyword, and numeric types are used; the 2 fields that are not indexed are arrays);
    • 2.4G documents to index in a week (5 days), with an average of 11k docs/s during business hours (around 8 hours a day);
    • 40 to 60 indices created daily.

For those two indices, refresh_interval is set to 30 seconds.

Activity on the other indices is low. Do you still want information on them?


(Mark Walkom) #6

I'm not sure what you are worrying about is worth all this time and effort; I don't think you are optimising for anything worth doing. If Elasticsearch is not having constant GCs that are bringing the cluster down, then I would just let it manage the heap as it sees fit.

I am not intending to be flippant, I just don't see the value in all this if there's nothing causing problems.

That node setup is bad, see https://www.elastic.co/guide/en/elasticsearch/guide/2.x/important-configuration-changes.html#_minimum_master_nodes


#7

I don't understand your answer.

Here is an extract from the documentation:

  • If you have ten regular nodes (can hold data, can become master) [...]
    [...]
  • If you have two regular nodes, you are in a conundrum. A quorum would be 2, but this means a loss of one node will make your cluster inoperable. A setting of 1 will allow your cluster to function, but doesn’t protect against split brain. It is best to have a minimum of three nodes in situations like this.
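
The quorum in that excerpt follows the rule given on the same documentation page: (number of master-eligible nodes / 2) + 1, using integer division. A minimal sketch (the function name quorum is made up for illustration):

```shell
# Quorum of master-eligible nodes: (n / 2) + 1 with integer division.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

quorum 10   # prints 6
quorum 3    # prints 2
quorum 2    # prints 2
quorum 1    # prints 1
```

Only master-eligible nodes count towards n, so data-only nodes do not change the result.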

What's the problem with 1 dedicated master node and 2 data-only nodes?


When letting the JVM manage the heap (I took the default Elasticsearch configuration), it sets MaxNewSize to 2GB, which is not enough in our case (lots of CMS collections).

From a quick heap dump analysis, I can see that 5GB of org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState instances were freed on each node by the close/open sequence. I wonder whether a forcemerge would lead to the same result...
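
A lighter alternative to a full heap dump, swapped in here as a rough sketch: `jmap -histo:live <pid>` prints a per-class histogram of live objects (it triggers a full GC first), and the bytes held by one class can be summed from that output before and after a close/open sequence. The helper name class_bytes and the sample rows are made up for illustration; the columns are assumed to be rank, instance count, bytes, class name.

```shell
# Sum the #bytes column (3rd field) of a jmap -histo listing for every class
# whose name contains the given string (plain substring match, not a regex).
class_bytes() {
  awk -v c="$1" 'index($4, c) > 0 { bytes += $3 } END { printf "%.0f\n", bytes }'
}

# Hypothetical histogram rows (not real measurements).
sample=' num     #instances         #bytes  class name
----------------------------------------------
   1:          1000        5000000  org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState
   2:           200          64000  [B'

printf '%s\n' "$sample" | class_bytes 'CompressingStoredFieldsReader$BlockState'

# Against a running node (replace <pid> with the Elasticsearch process id):
#   jmap -histo:live <pid> | class_bytes 'CompressingStoredFieldsReader$BlockState'
```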


(Mark Walkom) #8

Because the two data nodes could lose contact with each other, yet still stay in contact with the master, and consider themselves a valid cluster. This is a split brain situation.

