Hi everyone,
I need to reindex a 4.1TB index that was originally created with too few shards (no capacity planning or growth forecasting was done).
Nothing is currently being written to the source index, and it has been force-merged into a single segment for better read/search performance.
The target index is created from scratch (empty), with the refresh interval disabled and the number of replicas set to 0.
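For reference, the target settings described above correspond roughly to the request below. This is only a sketch: it assumes the settings are applied through the update index settings API on the already created index, and the same two values could equally be passed at index creation time.

```console
PUT puma.compilation.pipeline.96f19f5b-bc84-4d4b-8694-b80a293e78e4-optimized/_settings
{
  "index": {
    "refresh_interval": "-1",
    "number_of_replicas": 0
  }
}
```

Both values would need to be restored (for example a refresh interval of "1s" and the desired replica count) once the reindex has finished.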
I'm using slicing in the reindex task, with one slice per shard in the source index, counting both primaries and replicas. Is this OK, or should the slice count be based on primary shards only?
The batch size is 1000 documents. When I set it to 2000, the cluster falls over and requests start returning HTTP 500 errors.
Is there anything else that can be done to make it run faster? Also, is there a way to control scrolling, i.e. to specify a longer scroll TTL for the search context in the reindex request?
This is the request I'm running:
POST _reindex?wait_for_completion=false&slices=20
{
  "source": {
    "index": "puma.compilation.pipeline.96f19f5b-bc84-4d4b-8694-b80a293e78e4-latest",
    "size": 1000,
    "query": {
      "range": {
        "ibi_logtime": {
          "gte": "now-9M/M"
        }
      }
    }
  },
  "dest": {
    "index": "puma.compilation.pipeline.96f19f5b-bc84-4d4b-8694-b80a293e78e4-optimized"
  }
}
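Because the request is submitted with wait_for_completion=false, it returns a task ID instead of waiting for the result. A minimal sketch of how the parent task and its sliced sub-tasks could then be watched through the task management API (the action filter below matches any running reindex, not only this one):

```console
GET _tasks?detailed=true&actions=*reindex
```

The status of each listed task includes counters such as total, created and batches, which gives a rough view of the throughput per slice.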
I'd appreciate any feedback on this.