High OS memory usage and low heap memory usage during indexing


(Puneet) #1

Hi,

I am trying to index 500 million records. I have a heap size of 16 GB and total memory of 32 GB.

I am indexing 50,000 records per bulk request in a loop, up to 500 million records, using a single transport client connection in Java.

However, after some time I noticed that overall CPU usage on the Elasticsearch server increased to nearly 100% and indexing failed, even though heap usage is not high.

Here are the node stats:

"os": {
            "timestamp": 1440515157491,
            "uptime_in_millis": 18871,
            "cpu": {
               "sys": 4,
               "user": 50,
               "idle": 45,
               "usage": 54,
               "stolen": 0
            },
            "mem": {
               "free_in_bytes": 133742592,
               "used_in_bytes": 34225528832,
               "free_percent": 0,
               "used_percent": 99,
               "actual_free_in_bytes": 221503488,
               "actual_used_in_bytes": 34137767936
            },
            "swap": {
               "used_in_bytes": 18765332480,
               "free_in_bytes": 49951268864
            }
         },
         "process": {
            "timestamp": 1440515158006,
            "open_file_descriptors": 1358,
            "cpu": {
               "percent": 10,
               "sys_in_millis": 333343,
               "user_in_millis": 279399,
               "total_in_millis": 612742
            },
            "mem": {
               "resident_in_bytes": 3216711680,
               "share_in_bytes": -1,
               "total_virtual_in_bytes": 3339325440
            }
         },
         "jvm": {
            "timestamp": 1440515158272,
            "uptime_in_millis": 17181206,
            "mem": {
               "heap_used_in_bytes": 2777733000,
               "heap_used_percent": 16,
               "heap_committed_in_bytes": 17145004032,
               "heap_max_in_bytes": 17145004032,
               "non_heap_used_in_bytes": 69697944,
               "non_heap_committed_in_bytes": 71225344,
               "pools": {
                  "young": {
                     "used_in_bytes": 130136048,
                     "max_in_bytes": 279183360,
                     "peak_used_in_bytes": 279183360,
                     "peak_max_in_bytes": 279183360
                  },
                  "survivor": {
                     "used_in_bytes": 34865152,
                     "max_in_bytes": 34865152,
                     "peak_used_in_bytes": 34865152,
                     "peak_max_in_bytes": 34865152
                  },
                  "old": {
                     "used_in_bytes": 2612731800,
                     "max_in_bytes": 16830955520,
                     "peak_used_in_bytes": 12738557344,
                     "peak_max_in_bytes": 16830955520
                  }
               }
            },
            "threads": {
               "count": 75,
               "peak_count": 82
            },
            "gc": {
               "collectors": {
                  "young": {
                     "collection_count": 21391,
                     "collection_time_in_millis": 1190698
                  },
                  "old": {
                     "collection_count": 11,
                     "collection_time_in_millis": 2329
                  }
               }
            },
            "buffer_pools": {
               "direct": {
                  "count": 111,
                  "used_in_bytes": 11379522,
                  "total_capacity_in_bytes": 11379522
               },
               "mapped": {
                  "count": 287,
                  "used_in_bytes": 35753712656,
                  "total_capacity_in_bytes": 35753712656
               }
            }
         },

Please let me know what is going wrong here. Thanks.


(Mike Simos) #2

Try reducing the number of records you send at one time. You're probably getting rejections because you're overflowing the bulk queue: on your setup, Elasticsearch can't keep up. Start with something like 500 records and increase it until you start seeing rejections. Once you've found an acceptable number for your environment, you then need to retry any rejections you get:

https://www.elastic.co/guide/en/elasticsearch/guide/current/_monitoring_individual_nodes.html#_threadpool_section
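
For example, here's a minimal sketch of that retry logic with the Java transport client (the index/type names and JSON sources are placeholders, and exact classes vary a bit by ES version):

    import java.util.ArrayList;
    import java.util.List;
    import org.elasticsearch.action.bulk.BulkItemResponse;
    import org.elasticsearch.action.bulk.BulkRequestBuilder;
    import org.elasticsearch.action.bulk.BulkResponse;
    import org.elasticsearch.client.Client;

    public class BulkRetryExample {
        // Send one batch, then retry only the failed items with an
        // exponential back-off. "myindex"/"mytype" are placeholders.
        static void indexBatch(Client client, List<String> jsonDocs)
                throws InterruptedException {
            List<String> pending = jsonDocs;
            long waitMs = 1000;
            for (int attempt = 0; attempt < 5 && !pending.isEmpty(); attempt++) {
                BulkRequestBuilder bulk = client.prepareBulk();
                for (String json : pending) {
                    bulk.add(client.prepareIndex("myindex", "mytype").setSource(json));
                }
                BulkResponse response = bulk.execute().actionGet();
                if (!response.hasFailures()) {
                    return; // whole batch accepted
                }
                // Keep only the items that failed (e.g. bulk queue rejections);
                // the response items are in the same order as the requests.
                List<String> failed = new ArrayList<>();
                BulkItemResponse[] items = response.getItems();
                for (int i = 0; i < items.length; i++) {
                    if (items[i].isFailed()) {
                        failed.add(pending.get(i));
                    }
                }
                pending = failed;
                Thread.sleep(waitMs); // back off before retrying
                waitMs *= 2;
            }
        }
    }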


(Ksenji) #3

If you look at the mapped section in buffer_pools, you have 33 GB of data mapped into RAM.
That has completely used up your RAM, and the OS could be trying to swap it to disk. How many shards do you have?
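
You can check with the cat API (assuming the default host and port):

    curl 'localhost:9200/_cat/shards?v'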


(Tin Le) #4

Your system is swapping. That's not good. Use a smaller bulk size, and add a back-off and retry strategy to your code.

Turn off swapping. Turn on memlock.
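
For reference, the elasticsearch.yml setting of that era is bootstrap.mlockall (later versions renamed it to bootstrap.memory_lock):

    # elasticsearch.yml
    bootstrap.mlockall: true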


(Puneet) #5

Thanks, everyone.
I have turned off swapping and turned on memlock. I am now indexing 750 records per request, with a 4-second wait between bulk insert calls. With this, overall memory usage on the Elasticsearch server seems stable, but the indexing rate seems very low to me.
More points:
I am using an HDD for indexing, and the OS is Windows Server 2008 R2.

My question is:

  1. If I change from HDD to SSD, will that have an impact on indexing, and how much? Also, are there any other ways I can improve the indexing rate?
  2. Which is the most common/recommended OS for Elasticsearch?

(Sarwar Bhuiyan) #6

Puneet, there are some basic posts on the website. Also, check https://www.elastic.co/guide/en/elasticsearch/guide/current/indexing-performance.html?q=indexing%20performance
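
One of the bigger wins from that guide is disabling refresh while you bulk load and restoring it afterwards. A minimal sketch with the transport client (1.x-era API; "myindex" is a placeholder, and later versions use Settings.builder() instead of ImmutableSettings):

    import org.elasticsearch.client.Client;
    import org.elasticsearch.common.settings.ImmutableSettings;

    // Disable refresh while bulk loading, then restore the default.
    client.admin().indices().prepareUpdateSettings("myindex")
          .setSettings(ImmutableSettings.settingsBuilder()
                  .put("index.refresh_interval", "-1").build())
          .execute().actionGet();
    // ... run the bulk load ...
    client.admin().indices().prepareUpdateSettings("myindex")
          .setSettings(ImmutableSettings.settingsBuilder()
                  .put("index.refresh_interval", "1s").build())
          .execute().actionGet();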

