High OS memory usage and low heap usage during indexing

Hi,

I am trying to index 500 million records. The heap size is 16 GB and total memory is 32 GB.

I am sending bulk requests of 50,000 records in a loop, over a single transport client connection in Java, until all 500 million records are indexed.

After some time, I noticed that overall memory usage on the Elasticsearch server climbed to nearly 100% and indexing failed, even though heap usage stayed low.

Here are the node_stats:

"os": {
            "timestamp": 1440515157491,
            "uptime_in_millis": 18871,
            "cpu": {
               "sys": 4,
               "user": 50,
               "idle": 45,
               "usage": 54,
               "stolen": 0
            },
            "mem": {
               "free_in_bytes": 133742592,
               "used_in_bytes": 34225528832,
               "free_percent": 0,
               "used_percent": 99,
               "actual_free_in_bytes": 221503488,
               "actual_used_in_bytes": 34137767936
            },
            "swap": {
               "used_in_bytes": 18765332480,
               "free_in_bytes": 49951268864
            }
         },
         "process": {
            "timestamp": 1440515158006,
            "open_file_descriptors": 1358,
            "cpu": {
               "percent": 10,
               "sys_in_millis": 333343,
               "user_in_millis": 279399,
               "total_in_millis": 612742
            },
            "mem": {
               "resident_in_bytes": 3216711680,
               "share_in_bytes": -1,
               "total_virtual_in_bytes": 3339325440
            }
         },
         "jvm": {
            "timestamp": 1440515158272,
            "uptime_in_millis": 17181206,
            "mem": {
               "heap_used_in_bytes": 2777733000,
               "heap_used_percent": 16,
               "heap_committed_in_bytes": 17145004032,
               "heap_max_in_bytes": 17145004032,
               "non_heap_used_in_bytes": 69697944,
               "non_heap_committed_in_bytes": 71225344,
               "pools": {
                  "young": {
                     "used_in_bytes": 130136048,
                     "max_in_bytes": 279183360,
                     "peak_used_in_bytes": 279183360,
                     "peak_max_in_bytes": 279183360
                  },
                  "survivor": {
                     "used_in_bytes": 34865152,
                     "max_in_bytes": 34865152,
                     "peak_used_in_bytes": 34865152,
                     "peak_max_in_bytes": 34865152
                  },
                  "old": {
                     "used_in_bytes": 2612731800,
                     "max_in_bytes": 16830955520,
                     "peak_used_in_bytes": 12738557344,
                     "peak_max_in_bytes": 16830955520
                  }
               }
            },
            "threads": {
               "count": 75,
               "peak_count": 82
            },
            "gc": {
               "collectors": {
                  "young": {
                     "collection_count": 21391,
                     "collection_time_in_millis": 1190698
                  },
                  "old": {
                     "collection_count": 11,
                     "collection_time_in_millis": 2329
                  }
               }
            },
            "buffer_pools": {
               "direct": {
                  "count": 111,
                  "used_in_bytes": 11379522,
                  "total_capacity_in_bytes": 11379522
               },
               "mapped": {
                  "count": 287,
                  "used_in_bytes": 35753712656,
                  "total_capacity_in_bytes": 35753712656
               }
            }
         },

Please let me know what is going wrong here. Thanks

Try reducing the number of records you are sending at one time. You're probably getting rejections because you're overflowing the bulk queue: on your setup, Elasticsearch can't keep up. Start with something like 500 records per bulk request and increase it until you start seeing rejections. Once you've found an acceptable number for your environment, you then need to retry any rejections you get:

https://www.elastic.co/guide/en/elasticsearch/guide/current/_monitoring_individual_nodes.html#_threadpool_section
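
For reference, here is a minimal sketch of that back-off-and-retry idea using the transport client's BulkProcessor. This assumes a 2.x-style client (the backoff-policy setter arrived in 2.0; on 1.x you would catch EsRejectedExecutionException and re-submit the failed items yourself), and the numbers are starting points to tune, not recommendations:

    import org.elasticsearch.action.bulk.BackoffPolicy;
    import org.elasticsearch.action.bulk.BulkProcessor;
    import org.elasticsearch.action.bulk.BulkRequest;
    import org.elasticsearch.action.bulk.BulkResponse;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.common.unit.ByteSizeUnit;
    import org.elasticsearch.common.unit.ByteSizeValue;
    import org.elasticsearch.common.unit.TimeValue;

    public class BulkIndexing {
        public static BulkProcessor buildProcessor(Client client) {
            return BulkProcessor.builder(client, new BulkProcessor.Listener() {
                @Override
                public void beforeBulk(long executionId, BulkRequest request) {
                    // nothing to do before each bulk request
                }

                @Override
                public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
                    // individual items can still fail; log them and re-queue if needed
                    if (response.hasFailures()) {
                        System.err.println(response.buildFailureMessage());
                    }
                }

                @Override
                public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
                    // the whole request failed (node gone, etc.); log and handle it
                    failure.printStackTrace();
                }
            })
            .setBulkActions(500)      // start small, tune upward until rejections appear
            .setBulkSize(new ByteSizeValue(5, ByteSizeUnit.MB))
            .setConcurrentRequests(1) // one in-flight bulk keeps queue pressure low
            // retry rejected bulks with exponential backoff: 100 ms, 200 ms, 400 ms
            .setBackoffPolicy(BackoffPolicy.exponentialBackoff(
                    TimeValue.timeValueMillis(100), 3))
            .build();
        }
    }

You then feed documents through the processor instead of assembling 50,000-item bulks by hand, e.g. bulkProcessor.add(client.prepareIndex("myindex", "mytype").setSource(json).request()), and it flushes, throttles, and retries for you.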

If you look at the mapped section in buffer_pools, you have about 33 GB of data mapped into RAM.
That has used up essentially all of your RAM, and the OS could be trying to swap it to disk. How many shards do you have?
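
For reference, you can read the shard counts off the cluster health API with the same Java client; a quick sketch:

    import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;

    // report shard counts using the existing transport client
    ClusterHealthResponse health = client.admin().cluster().prepareHealth().get();
    System.out.println("active primary shards: " + health.getActivePrimaryShards()
            + ", active shards total: " + health.getActiveShards());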

Your system is swapping. That's not good. Use a smaller bulk size plus a back-off-and-retry strategy in your code, as in the BulkProcessor sketch above.

Turn off swapping. Turn on memlock.
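
On the Elasticsearch side, memlock is one setting in elasticsearch.yml; a minimal sketch (the setting is bootstrap.mlockall on 1.x/2.x and was renamed bootstrap.memory_lock in 5.0; on Windows this relies on the service account having the "Lock pages in memory" right):

    # elasticsearch.yml: lock the JVM heap so the OS cannot swap it out
    bootstrap.mlockall: true

After a restart you can confirm it took effect by checking for "mlockall": true in the nodes info API output.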

Thanks everyone.
I have turned off swapping and turned on memlock. I am now indexing 750 records per bulk request, with a 4-second wait between bulk calls. With this, overall memory usage on the Elasticsearch server seems stable, but the indexing rate looks very low to me.
More details:
I am indexing onto an HDD, and the OS is Windows Server 2008 R2.

My questions are:

  1. If I change from HDD to SSD, will that have an impact on indexing, and how much? Also, are there any other ways I can improve the indexing rate?
  2. Which OS is most common/recommended for Elasticsearch?

Puneet, there are some basic posts on the website. Also, check https://www.elastic.co/guide/en/elasticsearch/guide/current/indexing-performance.html?q=indexing%20performance
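
One concrete tip from that guide: for a one-off bulk load you can disable refresh and replicas up front and restore them afterwards. A sketch with the same transport client (2.x-style Settings builder; on 1.x it is ImmutableSettings.settingsBuilder(), and the index name "records" is a placeholder):

    import org.elasticsearch.client.Client;
    import org.elasticsearch.common.settings.Settings;

    // before the load: no periodic refresh, no replicas to keep in sync
    client.admin().indices().prepareUpdateSettings("records")
            .setSettings(Settings.settingsBuilder()
                    .put("index.refresh_interval", "-1")
                    .put("index.number_of_replicas", 0)
                    .build())
            .get();

    // ... run the bulk load ...

    // afterwards: restore near-real-time search and replication
    client.admin().indices().prepareUpdateSettings("records")
            .setSettings(Settings.settingsBuilder()
                    .put("index.refresh_interval", "1s")
                    .put("index.number_of_replicas", 1)
                    .build())
            .get();

Disabling refresh trades away near-real-time search during the load, which is usually acceptable for a backfill. The same guide also answers the SSD question: disks are usually the bottleneck for heavy indexing, and it recommends SSDs over spinning media.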