Full Index Rebuild with 5M documents on Elasticsearch 7.6 takes 1 week to complete

Hey there,

Hope you are all doing well.

Please have a look into the issue below and share your suggestions.

Currently Elasticsearch 7.6 is running on a cluster with 3 nodes, and it holds 5M records.

Performing a full rebuild of the index takes a huge amount of time, even though other server metrics like CPU and GC are within limits.

Similarly, I tried the ES reindex process, which is also time-consuming. I applied a few workarounds, such as setting the refresh interval to -1 and the replica count to 0, but it still takes a very long time.
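
For reference, this is roughly how I applied those two settings, using one of the indices shown later in this thread:

PUT /es1_development-authoring/_settings
{
  "index": {
    "refresh_interval": "-1",
    "number_of_replicas": 0
  }
}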

I have 168 shards in total, but the segment count is a little high, showing 800+.
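
Both counts are from the cat APIs, e.g.:

GET _cat/shards?v
GET _cat/segments?v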

Kindly help with suggestions to resolve this issue. Any leads would be much appreciated.

Thanks

How large, in GB, is the index?

Thanks Mark for your response.

The max index size observed during the 5M testing was up to 30GB. Please find the shard details below:

index shard prirep state docs store ip node
es1_development-authoring 2 p STARTED 249395 17.9gb 100.105.8.84 node-3
es1_development-authoring 2 r STARTED 248955 18.5gb 100.105.8.74 node-1
es1_development-authoring 1 p STARTED 247935 18.2gb 100.105.8.75 node-2
es1_development-authoring 1 r STARTED 245468 19.1gb 100.105.8.74 node-1
es1_development-authoring 3 p STARTED 247987 18.5gb 100.105.8.75 node-2
es1_development-authoring 3 r STARTED 246748 18.4gb 100.105.8.74 node-1
es1_development-authoring 4 p STARTED 248449 18.7gb 100.105.8.84 node-3
es1_development-authoring 4 r STARTED 246067 17.9gb 100.105.8.75 node-2
es1_development-authoring 0 r STARTED 246480 20.6gb 100.105.8.84 node-3
es1_development-authoring 0 p STARTED 247545 19gb 100.105.8.74 node-1
es1_development-review 2 r STARTED 247798 18gb 100.105.8.84 node-3
es1_development-review 2 p STARTED 245559 23gb 100.105.8.74 node-1
es1_development-review 1 p STARTED 246717 18gb 100.105.8.75 node-2
es1_development-review 1 r STARTED 244555 17.9gb 100.105.8.74 node-1
es1_development-review 3 r STARTED 246507 17.8gb 100.105.8.75 node-2
es1_development-review 3 p STARTED 244215 17.9gb 100.105.8.74 node-1
es1_development-review 4 r STARTED 246592 20.4gb 100.105.8.84 node-3
es1_development-review 4 p STARTED 246592 18gb 100.105.8.75 node-2
es1_development-review 0 r STARTED 246243 17.8gb 100.105.8.84 node-3
es1_development-review 0 p STARTED 244675 22.9gb 100.105.8.74 node-1
es1_development-published 2 r STARTED 227358 25gb 100.105.8.75 node-2
es1_development-published 2 p STARTED 227358 30.4gb 100.105.8.74 node-1
es1_development-published 1 p STARTED 227314 31.5gb 100.105.8.75 node-2
es1_development-published 1 r STARTED 227314 30.9gb 100.105.8.74 node-1
es1_development-published 3 p STARTED 227912 29gb 100.105.8.84 node-3
es1_development-published 3 r STARTED 227912 31.7gb 100.105.8.74 node-1
es1_development-published 4 r STARTED 227064 24.9gb 100.105.8.84 node-3
es1_development-published 4 p STARTED 227064 28.1gb 100.105.8.75 node-2
es1_development-published 0 p STARTED 227586 27.4gb 100.105.8.84 node-3
es1_development-published 0 r STARTED 227586 28.4gb 100.105.8.74 node-1
es1_development-approval 2 p STARTED 241330 10.7gb 100.105.8.84 node-3
es1_development-approval 2 r STARTED 241330 10.8gb 100.105.8.74 node-1
es1_development-approval 1 p STARTED 242573 10.6gb 100.105.8.84 node-3
es1_development-approval 1 r STARTED 242573 10.7gb 100.105.8.74 node-1
es1_development-approval 3 r STARTED 242309 10.6gb 100.105.8.84 node-3
es1_development-approval 3 p STARTED 242309 10.7gb 100.105.8.75 node-2
es1_development-approval 4 p STARTED 242142 10.8gb 100.105.8.84 node-3
es1_development-approval 4 r STARTED 242142 10.8gb 100.105.8.75 node-2
es1_development-approval 0 p STARTED 242079 10.7gb 100.105.8.75 node-2
es1_development-approval 0 r STARTED 242079 10.6gb 100.105.8.74 node-1

OK, do you have 168 shards for this one 30GB index?

In total, Kibana shows 168 shards across all 3 nodes.

Yeah, but how many shards does the single index you are reindexing have?

What you have listed are the individual shards, so it looks like the indices you have shown all have 5 primary shards and most of them seem to be over 100GB in size. It would help if you could provide some details about the hardware and configuration of your cluster as well as details about exactly how you are reindexing.

Yes, it uses the default config, which is 5 shards, and in total Kibana shows 31 indices.

Server configuration: 8 core OCPU with 54 GB RAM, where 26 GB is assigned to the heap on each node.

So far it has used almost 1000 GB of total storage across the 3 nodes.

Please find the index settings below:

{
"settings": {
"index": {
"number_of_shards": "5",
"provided_name": "es1_development-authoring",
"creation_date": "1605507707367",
"analysis": {
"filter": {
"stemmer_filter": {
"type": "stemmer",
"language": "english"
},
"word_delimiter_filter": {
"split_on_numerics": "false",
"split_on_case_change": "false",
"generate_word_parts": "true",
"type": "word_delimiter",
"generate_number_parts": "true"
}
},
"char_filter": {
"html_strip_char_filter": {
"type": "html_strip"
}
},
"normalizer": {
"lowercase_normalizer": {
"filter": [
"lowercase"
],
"type": "custom"
}
},
"analyzer": {
"stem_analyzer": {
"filter": [
"lowercase",
"word_delimiter_filter",
"stemmer_filter"
],
"char_filter": [
"html_strip_char_filter"
],
"tokenizer": "whitespace"
},
"word_delimiter_analyzer": {
"filter": [
"lowercase",
"word_delimiter_filter"
],
"char_filter": [
"html_strip_char_filter"
],
"type": "custom",
"tokenizer": "whitespace"
}
}
},
"number_of_replicas": "1",
"uuid": "cB3GcK07RAmlmkdieUFv7A",
"version": {
"created": "7060099"
}
}
},
"defaults": {
"index": {
"flush_after_merge": "512mb",
"final_pipeline": "_none",
"max_inner_result_window": "100",
"unassigned": {
"node_left": {
"delayed_timeout": "1m"
}
},
"max_terms_count": "65536",
"lifecycle": {
"name": "",
"parse_origination_date": "false",
"indexing_complete": "false",
"rollover_alias": "",
"origination_date": "-1"
},
"routing_partition_size": "1",
"force_memory_term_dictionary": "false",
"max_docvalue_fields_search": "100",
"merge": {
"scheduler": {
"max_thread_count": "4",
"auto_throttle": "true",
"max_merge_count": "9"
},
"policy": {
"reclaim_deletes_weight": "2.0",
"floor_segment": "2mb",
"max_merge_at_once_explicit": "30",
"max_merge_at_once": "10",
"max_merged_segment": "5gb",
"expunge_deletes_allowed": "10.0",
"segments_per_tier": "10.0",
"deletes_pct_allowed": "33.0"
}
},
"max_refresh_listeners": "1000",
"max_regex_length": "1000",
"load_fixed_bitset_filters_eagerly": "true",
"number_of_routing_shards": "1",
"write": {
"wait_for_active_shards": "1"
},
"verified_before_close": "false",
"mapping": {
"coerce": "false",
"nested_fields": {
"limit": "50"
},
"depth": {
"limit": "20"
},
"field_name_length": {
"limit": "9223372036854775807"
},
"total_fields": {
"limit": "1000"
},
"nested_objects": {
"limit": "10000"
},
"ignore_malformed": "false"
},
"source_only": "false",
"soft_deletes": {
"enabled": "false",
"retention": {
"operations": "0"
},
"retention_lease": {
"period": "12h"
}
},
"max_script_fields": "32",
"query": {
"default_field": [
"*"
],
"parse": {
"allow_unmapped_fields": "true"
}
},
"format": "0",
"frozen": "false",
"sort": {
"missing": ,
"mode": ,
"field": ,
"order":
},
"priority": "1",
"codec": "default",
"max_rescore_window": "10000",
"max_adjacency_matrix_filters": "100",
"analyze": {
"max_token_count": "10000"
},
"gc_deletes": "60s",
"optimize_auto_generated_id": "true",
"max_ngram_diff": "1",
"translog": {
"generation_threshold_size": "64mb",
"flush_threshold_size": "512mb",
"sync_interval": "5s",
"retention": {
"size": "512MB",
"age": "12h"
},
"durability": "REQUEST"
},
"auto_expand_replicas": "false",
"mapper": {
"dynamic": "true"
},
"requests": {
"cache": {
"enable": "true"
}
},
"data_path": "",
"highlight": {
"max_analyzed_offset": "1000000"
},
"routing": {
"rebalance": {
"enable": "all"
},
"allocation": {
"enable": "all",
"total_shards_per_node": "-1"
}
},
"search": {
"slowlog": {
"level": "TRACE",
"threshold": {
"fetch": {
"warn": "-1",
"trace": "-1",
"debug": "-1",
"info": "-1"
},
"query": {
"warn": "-1",
"trace": "-1",
"debug": "-1",
"info": "-1"
}
}
},
"idle": {
"after": "30s"
},
"throttled": "false"
},
"fielddata": {
"cache": "node"
},
"default_pipeline": "_none",
"max_slices_per_scroll": "1024",
"shard": {
"check_on_startup": "false"
},
"xpack": {
"watcher": {
"template": {
"version": ""
}
},
"version": "",
"ccr": {
"following_index": "false"
}
},
"percolator": {
"map_unmapped_fields_as_text": "false"
},
"allocation": {
"max_retries": "5"
},
"refresh_interval": "1s",
"indexing": {
"slowlog": {
"reformat": "true",
"threshold": {
"index": {
"warn": "-1",
"trace": "-1",
"debug": "-1",
"info": "-1"
}
},
"source": "1000",
"level": "TRACE"
}
},
"compound_format": "0.1",
"blocks": {
"metadata": "false",
"read": "false",
"read_only_allow_delete": "false",
"read_only": "false",
"write": "false"
},
"max_result_window": "10000",
"store": {
"stats_refresh_interval": "10s",
"type": "",
"fs": {
"fs_lock": "native"
},
"preload":
},
"queries": {
"cache": {
"enabled": "true"
}
},
"warmer": {
"enabled": "true"
},
"max_shingle_diff": "3",
"query_string": {
"lenient": "false"
}
}
}
}

What type of storage are you using? Exactly how are you reindexing the data? If you are using the reindex API, are you slicing to improve concurrency?
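
For example, a sliced reindex would look something like this (index names here are just placeholders):

POST _reindex?slices=5&wait_for_completion=false
{
  "source": { "index": "source-index" },
  "dest": { "index": "dest-index" }
}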

It may also help if you tell us a bit about your data as the documents look quite large.

Hey Christian

Please find the information below:

I am using block volume storage, with a 1 TB block volume attached to each node.
The local SSD is almost 250 GB on each node.

Yes, it uses the Elasticsearch Reindex API, and it is configured for auto slicing, so ES decides the number of slices depending on the data volume and shards.
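
The call itself is roughly like this (the destination index name here is just illustrative):

POST _reindex?slices=auto&wait_for_completion=false
{
  "source": { "index": "es1_development-authoring" },
  "dest": { "index": "es1_development-authoring-new" }
}
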
As part of the data, I have seeded .pdf (56 KB), .png (10 KB), .docx (10 KB to 100 KB) and .txt (10 KB) files.

So, for 5M documents the total size is almost 300-400 GB.

Thanks

Currently in my system, reindexing takes 18-19 hrs to complete. Can we still improve this?
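
(For reference, the progress of a running reindex can be checked with the tasks API:)

GET _tasks?detailed=true&actions=*reindex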

But the full index rebuild is far more expensive, taking 1 week to complete.

I assume this is where the data is stored. What does disk I/O and iowait look like for these volumes during reindexing?
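
You could watch this on each node with, for example:

iostat -x 5

and check the wait and utilisation columns for the block volume devices.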

Are you storing binary content in base64-encoded form in the index?
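
For example, a mapping with a binary field (the field name here is hypothetical) means each document carries a base64-encoded payload, which makes both indexing and reindexing much heavier:

PUT /my-index
{
  "mappings": {
    "properties": {
      "attachment_data": { "type": "binary" }
    }
  }
}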

Yes, you are right, this is where the data is stored. I didn't find any disk I/O wait issue during the process; in fact, the CPU consumption was only around 30-40%.

And I am not exactly sure how it is stored internally in the indices.

In the current environment setup, I had already validated a smaller data set of 270k records, and at that volume both the ES reindex and the full index rebuild were fast.

But when I increased it up to 5M, the slowness appeared.

Hi,

I think yes, based on the statement below:

Records/documents are stored in a table segment, and if any indexes are created on the table, the rowid and column values are stored in an index segment. A logical rowid is a base64-encoded representation of the table's primary key.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.