Indexing is getting slower and slower as it progresses

Yes, the java process is an Elasticsearch process, as this is a dedicated ES server and nothing else is running on it. My understanding is that the RES and SHR values do not directly correlate with the allocated heap size; instead, RES could be the sum of the heap and other physical memory being used. Am I right about this?

I have confirmed the heap size allocation with the following:

psoft 21991 16286 99 Dec17 pts/1 7-18:02:27 /psoft/ELS/pt/es_jre1.8.0_281/bin/java -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -server -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -XX:+HeapDumpOnOutOfMemoryError -Xmx32766m -Xms32766m -XX:ParallelGCThreads=23 -XX:NewRatio=2 -Des.path.home=/rajaels/ELS/pt/elasticsearch6.1.2 -Des.path.conf=/rajaels/ELS/pt/elasticsearch6.1.2/config -cp /rajaels/ELS/pt/elasticsearch6.1.2/lib/* org.elasticsearch.bootstrap.Elasticsearch

The above also shows the other JVM parameters that were added.

I have also confirmed that compressed object pointers are being used, as I could see this reported during the Elasticsearch boot-up.

I have purposely set -Xmx to 31666m, as this is the maximum I could use while keeping compressed object pointers; with 31667m or above, compressed object pointers are no longer used.
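
For anyone who wants to double-check the same thing without digging through the boot-up log: I believe the nodes info API also reports it in the jvm section (a minimal sketch; the field name is from memory, so verify against the 6.1 docs).

# The jvm section of the nodes info response should include
# "using_compressed_ordinary_object_pointers" : "true"
GET /_nodes/jvm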

Yes, VIRT, RES and SHR are definitely slowly increasing. That is, when I started the indices with no data in the shards these values were very low, and as more and more documents are indexed, these sizes keep growing.

As they were increasing and I could see a high number of GC overhead messages in the ES cluster log file, I am indexing the documents in batches, with each batch consisting of 5 to 10 million documents. Once a batch is indexed, I shut down ES, restart it, wait 120 mins (as I can still see high CPU usage even though no indexing is occurring) and then start the next batch.

But as soon as I restart ES (without indexing), the VIRT size is the same as before shutting down, which led me to believe that ES may be keeping the metadata of previously indexed data and shard info in memory, and that this size is steadily increasing as it indexes more documents.

Attachments are basically PDF, DOC, TXT and other readable files attached by end users to case note information, which is what I am trying to index along with the case notes.

No, I have not done deeper analysis yet with vmstat, top, iostat or jvisualvm, as I am not an OS expert and cannot meaningfully interpret the output to identify issues at that level, which is why I provided the hot threads info in this thread.

Ignore the VIRT size being the same before and after a complete Elasticsearch restart; it's "normal" and not important to your issue, as long as you really did shut down Elasticsearch completely (the java process exited and a new one started).

One of my points was to remind you that the heap is not the only memory consumer within a Java application; you probably need to use JVM tools to dig deeper into that.

But the most interesting thing from your post is:

waiting for 120 mins as I can still see high CPU usage, though there is no indexing occurring

You now have high CPU usage while the cluster is effectively doing nothing? Is there also significant IO going on at the same time? That is a mystery you need to solve.

Please run "iostat -x 60" for 5 minutes before and 20 minutes after you restart Elasticsearch (i.e. run it for approximately 25 minutes in total).

Same for "vmstat 60"

And same for "top -b -d 60 | awk 'NR<10 || /psoft/ || /java/'"

I appreciate that you personally don't have deep experience with tools like iostat/vmstat/jvisualvm/etc., but without such knowledge and/or access to the system, troubleshooting things like this is very hard.

Please also refer to:

Also, let's say 30 minutes after restart but before restarting the indexing process, can you do a

GET /_nodes/stats

again and share output.

From the output above, where I presume the system had been running for a while, way more time was spent merging than indexing:

      "indexing": {
        "index_total": 4120398,
        "index_time_in_millis": 50952877, <-- 849 minutes
        "index_current": 4,
        "index_failed": 0,
        "delete_total": 0,
        "delete_time_in_millis": 0,
        "delete_current": 0,
        "noop_update_total": 0,
        "is_throttled": false,
        "throttle_time_in_millis": 0
      }

and

      "merges": {
        "current": 22,
        "current_docs": 1629270,
        "current_size_in_bytes": 92451402153,
        "total": 3515,
        "total_time_in_millis": 789186847, <-- 13153 minutes
        "total_docs": 18493872,
        "total_size_in_bytes": 1133936081916,
        "total_stopped_time_in_millis": 0,
        "total_throttled_time_in_millis": 141446873, <-- 2357 minutes
        "total_auto_throttle_in_bytes": 1505047452
      }

So I am now guessing that when Elasticsearch starts, it is just continuing the merge processes that had not completed?

If you re-run the _nodes/stats call 20-30 minutes after a cold start, you can confirm this. It also sort of fits with the problem report. Merging can be very IO intensive.
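
If it helps, the node stats call can be narrowed so the comparison is easier to eyeball; a minimal sketch, assuming the usual metric filtering on that API (check the exact paths against the 6.1 docs):

# Just the merge and indexing sections of the indices stats, per node
GET /_nodes/stats/indices/merge,indexing

# Alternatively, trim the full response down with filter_path
GET /_nodes/stats?filter_path=nodes.*.indices.merges,nodes.*.indices.indexing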

Perhaps @Christian_Dahlqvist or others, many of whom will know the internals way better than me, can suggest how to manage/reduce merging without too many downsides.

But I think you are going to need either faster storage, during this phase at least, or to review how you are ingesting and sharding the data - e.g. across how many servers, and with what specification in terms of RAM/CPU.
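
For completeness, these are the kinds of index settings people commonly look at for heavy bulk loads; treat this as a sketch of things to research rather than a recommendation (the index name is a placeholder and the values are only illustrative), and check each setting against the 6.1 docs before touching it:

# Applied only for the duration of the bulk load, then reverted
PUT /your_index_name/_settings
{
  "index" : {
    "refresh_interval" : "-1",
    "translog.durability" : "async",
    "translog.flush_threshold_size" : "1gb",
    "merge.scheduler.max_thread_count" : 1
  }
}

The refresh_interval part has already come up in this thread; async translog durability trades crash safety for fewer fsyncs, so it only makes sense for a load you can re-run from scratch.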

btw I also noticed

      "translog": {
        "operations": 2133296,
        "size_in_bytes": 114717685968,
        "uncommitted_operations": 1159571,
        "uncommitted_size_in_bytes": 62252691813
      },

That seems to me like a big number of uncommitted operations - almost half?

I am sorry, there is a typo: it should be 20 mins instead of 120 mins in "waiting for 120 mins as I can still see high CPU usage".

Thank you for taking the time and providing these insights into where and what I can look at further.

"The uncommitted_operations": 1159571" may be due to the reason where i had disabled the refresh completely ("refresh_interval" : "-1") for one batch, when i took the node stats and hence it may be showing high number of "uncommitted_operations": 1159571".

But what makes me wonder now is that, even though I set "refresh_interval" : "120s" before starting the current batch and bounced ES, I now have "uncommitted_operations" : 1,379,595 in /_stats?pretty.
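
For reference, the change was applied with a settings call along these lines before the batch started (a sketch; the index name is the one from the stats below):

PUT /icc_case_notes_crpte/_settings
{
  "index" : {
    "refresh_interval" : "120s"
  }
}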

I've attached the stats for the current batch (~6.8 million rows, running for the last 65 hours at an average index rate of 1500 docs/min).

This confirms that merging is taking more than double the time of indexing.

Merging - "total_time_in_millis" : 1562814775, -- 26046 minutes
Indexing -- "index_time_in_millis" : 691062410, --11517 minutes

Current index stats (/icc_case_notes_crpte/_stats?pretty):

{
"_shards" : {
"total" : 200,
"successful" : 200,
"failed" : 0
},
"_all" : {
"primaries" : {
"docs" : {
"count" : 77419450,
"deleted" : 328041
},
"store" : {
"size_in_bytes" : 3766096105440
},
"indexing" : {
"index_total" : 6351191,
"index_time_in_millis" : 691062410, 11517 minutes
"index_current" : 2,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 3600,
"query_time_in_millis" : 2244,
"query_current" : 0,
"fetch_total" : 0,
"fetch_time_in_millis" : 0,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 7,
"current_docs" : 84887,
"current_size_in_bytes" : 5208559790,
"total" : 4200,
"total_time_in_millis" : 1562814775, 26046 minutes
"total_docs" : 26446125,
"total_size_in_bytes" : 1630459159005,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 176622814,
"total_auto_throttle_in_bytes" : 1355747444
},
"refresh" : {
"total" : 34251,
"total_time_in_millis" : 291675999,
"listeners" : 0
},
"flush" : {
"total" : 814,
"total_time_in_millis" : 8085766
},
"warmer" : {
"current" : 0,
"total" : 30761,
"total_time_in_millis" : 17421
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 5781,
"memory_in_bytes" : 20149719869,
"terms_memory_in_bytes" : 20041048869,
"stored_fields_memory_in_bytes" : 70015256,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 22342528,
"points_memory_in_bytes" : 4278516,
"doc_values_memory_in_bytes" : 12034700,
"index_writer_memory_in_bytes" : 1666056872,
"version_map_memory_in_bytes" : 2354056,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 1902669,
"size_in_bytes" : 107916750917,
"uncommitted_operations" : 1379595,
"uncommitted_size_in_bytes" : 78191490895
},
"request_cache" : {
"memory_size_in_bytes" : 0,
"evictions" : 0,
"hit_count" : 479,
"miss_count" : 3121
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
},
"total" : {
"docs" : {
"count" : 77419450,
"deleted" : 328041
},
"store" : {
"size_in_bytes" : 3766096105440
},
"indexing" : {
"index_total" : 6351191,
"index_time_in_millis" : 691062410,
"index_current" : 2,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 3600,
"query_time_in_millis" : 2244,
"query_current" : 0,
"fetch_total" : 0,
"fetch_time_in_millis" : 0,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 7,
"current_docs" : 84887,
"current_size_in_bytes" : 5208559790,
"total" : 4200,
"total_time_in_millis" : 1562814775,
"total_docs" : 26446125,
"total_size_in_bytes" : 1630459159005,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 176622814,
"total_auto_throttle_in_bytes" : 1355747444
},
"refresh" : {
"total" : 34251,
"total_time_in_millis" : 291675999,
"listeners" : 0
},
"flush" : {
"total" : 814,
"total_time_in_millis" : 8085766
},
"warmer" : {
"current" : 0,
"total" : 30761,
"total_time_in_millis" : 17421
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 5781,
"memory_in_bytes" : 20149719869,
"terms_memory_in_bytes" : 20041048869,
"stored_fields_memory_in_bytes" : 70015256,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 22342528,
"points_memory_in_bytes" : 4278516,
"doc_values_memory_in_bytes" : 12034700,
"index_writer_memory_in_bytes" : 1666056872,
"version_map_memory_in_bytes" : 2354056,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 1902669,
"size_in_bytes" : 107916750917,
"uncommitted_operations" : 1379595,
"uncommitted_size_in_bytes" : 78191490895
},
"request_cache" : {
"memory_size_in_bytes" : 0,
"evictions" : 0,
"hit_count" : 479,
"miss_count" : 3121
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
}
},
"indices" : {
"icc_case_notes_crpte" : {
"primaries" : {
"docs" : {
"count" : 77419450,
"deleted" : 328041
},
"store" : {
"size_in_bytes" : 3766096105440
},
"indexing" : {
"index_total" : 6351191,
"index_time_in_millis" : 691062410,
"index_current" : 2,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 3600,
"query_time_in_millis" : 2244,
"query_current" : 0,
"fetch_total" : 0,
"fetch_time_in_millis" : 0,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 7,
"current_docs" : 84887,
"current_size_in_bytes" : 5208559790,
"total" : 4200,
"total_time_in_millis" : 1562814775,
"total_docs" : 26446125,
"total_size_in_bytes" : 1630459159005,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 176622814,
"total_auto_throttle_in_bytes" : 1355747444
},
"refresh" : {
"total" : 34251,
"total_time_in_millis" : 291675999,
"listeners" : 0
},
"flush" : {
"total" : 814,
"total_time_in_millis" : 8085766
},
"warmer" : {
"current" : 0,
"total" : 30761,
"total_time_in_millis" : 17421
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 5781,
"memory_in_bytes" : 20149719869,
"terms_memory_in_bytes" : 20041048869,
"stored_fields_memory_in_bytes" : 70015256,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 22342528,
"points_memory_in_bytes" : 4278516,
"doc_values_memory_in_bytes" : 12034700,
"index_writer_memory_in_bytes" : 1666056872,
"version_map_memory_in_bytes" : 2354056,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 1902669,
"size_in_bytes" : 107916750917,
"uncommitted_operations" : 1379595,
"uncommitted_size_in_bytes" : 78191490895
},
"request_cache" : {
"memory_size_in_bytes" : 0,
"evictions" : 0,
"hit_count" : 479,
"miss_count" : 3121
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
},
"total" : {
"docs" : {
"count" : 77419450,
"deleted" : 328041
},
"store" : {
"size_in_bytes" : 3766096105440
},
"indexing" : {
"index_total" : 6351191,
"index_time_in_millis" : 691062410,
"index_current" : 2,
"index_failed" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0,
"noop_update_total" : 0,
"is_throttled" : false,
"throttle_time_in_millis" : 0
},
"get" : {
"total" : 0,
"time_in_millis" : 0,
"exists_total" : 0,
"exists_time_in_millis" : 0,
"missing_total" : 0,
"missing_time_in_millis" : 0,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 3600,
"query_time_in_millis" : 2244,
"query_current" : 0,
"fetch_total" : 0,
"fetch_time_in_millis" : 0,
"fetch_current" : 0,
"scroll_total" : 0,
"scroll_time_in_millis" : 0,
"scroll_current" : 0,
"suggest_total" : 0,
"suggest_time_in_millis" : 0,
"suggest_current" : 0
},
"merges" : {
"current" : 7,
"current_docs" : 84887,
"current_size_in_bytes" : 5208559790,
"total" : 4200,
"total_time_in_millis" : 1562814775,
"total_docs" : 26446125,
"total_size_in_bytes" : 1630459159005,
"total_stopped_time_in_millis" : 0,
"total_throttled_time_in_millis" : 176622814,
"total_auto_throttle_in_bytes" : 1355747444
},
"refresh" : {
"total" : 34251,
"total_time_in_millis" : 291675999,
"listeners" : 0
},
"flush" : {
"total" : 814,
"total_time_in_millis" : 8085766
},
"warmer" : {
"current" : 0,
"total" : 30761,
"total_time_in_millis" : 17421
},
"query_cache" : {
"memory_size_in_bytes" : 0,
"total_count" : 0,
"hit_count" : 0,
"miss_count" : 0,
"cache_size" : 0,
"cache_count" : 0,
"evictions" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 0,
"evictions" : 0
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 5781,
"memory_in_bytes" : 20149719869,
"terms_memory_in_bytes" : 20041048869,
"stored_fields_memory_in_bytes" : 70015256,
"term_vectors_memory_in_bytes" : 0,
"norms_memory_in_bytes" : 22342528,
"points_memory_in_bytes" : 4278516,
"doc_values_memory_in_bytes" : 12034700,
"index_writer_memory_in_bytes" : 1666056872,
"version_map_memory_in_bytes" : 2354056,
"fixed_bit_set_memory_in_bytes" : 0,
"max_unsafe_auto_id_timestamp" : -1,
"file_sizes" : { }
},
"translog" : {
"operations" : 1902669,
"size_in_bytes" : 107916750917,
"uncommitted_operations" : 1379595,
"uncommitted_size_in_bytes" : 78191490895
},
"request_cache" : {
"memory_size_in_bytes" : 0,
"evictions" : 0,
"hit_count" : 479,
"miss_count" : 3121
},
"recovery" : {
"current_as_source" : 0,
"current_as_target" : 0,
"throttle_time_in_millis" : 0
}
}
}
}
}

"deleted" : 328041 are probably due to the fact that I had to cancel the previous run as it was too slow, so that i can throttled up by changing attachment handlers from 3 to 7 and segment size to 7mb, as it had to delete and reindex almost 1mil+ rows out 6.8mil+ rows it indexed.

Even the refresh time is quite high: "total_time_in_millis" : 291675999 <-- 4861 minutes, or 81 hours?

Is this the total time spent on these operations since the index was created, or since the last restart of ES?

Hi @Raja95 and CC @Christian_Dahlqvist

I did not read every post but... So perhaps you already discussed this...

Indexing into a large index where the IDs are provided (i.e. not generated by Elasticsearch) will always get slower as the size of the index increases.

This is because, for each document indexed, Elasticsearch must check whether its ID is unique, which takes longer as the index grows.

This is a fact.

Add on top that you are using ES 6.1, and I do not find it surprising at all that you are seeing this behavior.

I just went through this with another customer... They struggled with the concept, then I convinced them to try without using the provided ID, and they were astonished by the results.

So what I have recommended to others is to store the provided ID in another field and use the ES-generated ID.
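
As a rough illustration of what that looks like in a bulk request (a sketch only - the type and field names here are made up, and on 6.x you still need a document type; the point is simply that the action line carries no "_id", so Elasticsearch generates one and the original ID survives as a normal field):

# No "_id" in the action lines -> auto-generated IDs
POST /_bulk
{ "index" : { "_index" : "icc_case_notes_crpte", "_type" : "doc" } }
{ "case_note_id" : "CN-0001234", "case_note" : "text of the note ..." }
{ "index" : { "_index" : "icc_case_notes_crpte", "_type" : "doc" } }
{ "case_note_id" : "CN-0001235", "case_note" : "text of the note ..." }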

Yup, I am sure that will require some re-architecting... but unless you break the index up into smaller chunks, you will continue to run into this.

Can you make it better with optimizations, good HW and good hygiene? Yes. But this will not change the underlying cause.

There is one other aspect that is worth considering, and that is the number of primary shards. When you send a bulk request to Elasticsearch you want many documents to go to the same shard, as that reduces the amount of IO (each write is synced), and with 200 primary shards you are likely to get a lot of small writes. The fact that merging, according to your stats, is being throttled could indicate that your storage is slow, which would be made worse by this as well as by specifying your own external IDs.
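
If you want to see how those 200 shards are actually filling up, the cat shards API gives a quick per-shard view (a sketch; column names from memory, check against the docs):

# One line per shard: doc count and on-disk size
GET /_cat/shards/icc_case_notes_crpte?v&h=index,shard,prirep,docs,store,node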

@Raja95 - I've gotten a bit confused by the restarts/deletes/cancelled bulk ingests, and when the dumps were taken and what had happened before and in between.

But the contributions from Stephen and Christian give you lots of possibilities to explore:

  • Change to letting Elasticsearch generate the IDs
  • Get and use faster storage
  • Reduce the number of primary shards
  • Split one large index into many smaller ones
  • Change the HW model to a few smaller servers, so that you can effectively aggregate the IO bandwidth horizontally
  • ...

Some things might require you to start over!

Seems also you've gone from 55m to 77m docs ingested in the last few days; the target was 114m? Though not ideal, some of these issues primarily hurt you at ingest time only - eventually all the ingest/merge/... IO will be done, and you'll have that big-RAM system with lots of cache for queries. But as the slow IO and self-made IDs are not going to improve as the index grows, you need to decide whether changing things is quicker/better.

btw, though it was implied, I'll make it explicit: you can stick with your own IDs IF you split into N smaller indices. The ID only needs to be unique within each index. So e.g. change icc_case_notes_crpte into a data stream with an ILM policy that splits it into, say, 5m-docs-per-index indices: icc_case_notes_crpte_0001, icc_case_notes_crpte_0002, ...
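
Whatever mechanism you end up with, the underlying pattern is index-per-chunk behind a write alias; on 6.1 the building block I would look at is the rollover API (a minimal sketch with illustrative names - the alias must point at a single index, and the -000001 suffix lets Elasticsearch pick the next name in the series):

# First index in the series, with the write alias attached
PUT /icc_case_notes_crpte-000001
{
  "aliases" : { "icc_case_notes_write" : { } }
}

# Call periodically (or from the loader) to cut over once ~5m docs are in
POST /icc_case_notes_write/_rollover
{
  "conditions" : { "max_docs" : 5000000 }
}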

Lastly, you can also validate this by trying to ingest into a different (dummy) index, say 1m rows. If the "unique ID check" issue really is the main one, you should see much better indexing speed into that new/clean index.
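
Something like this would do for the throwaway index (a sketch; the name and shard count are just illustrative, picked to contrast with the 200 shards you have now):

# Dummy index for the 1m-row test; delete it once done
PUT /icc_case_notes_dummy
{
  "settings" : {
    "number_of_shards" : 5,
    "number_of_replicas" : 0
  }
}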