I need to migrate existing ES data from version 5.2.2 to 7.6.0.
I do that by reindexing index by index to v6, then bulk inserting each index into a newly created v7 index, and finally deleting the source index.
I cannot use the reindex API for the second step because I want to modify the data on the fly and I don't want to use an ES-specific update script.
This is my code for reindexing to v6:
    var result = HttpClient.PostAsync(
        ReIndexUrl,
        new StringContent(JsonConvert.SerializeObject(reindexModel.Value), Encoding.UTF8, "application/json")).Result;
    var response = result.Content.ReadAsStringAsync().Result;
    var responseObj = JsonConvert.DeserializeObject<ReindexResponse>(response);
    return responseObj.Total;
Note that reindexModel.Value is like this:
    new { source = new { index = "sourceIndex", type = "myDocType", size = 5000 }, dest = new { index = targetIndex, type = "_doc" } }
and ReindexResponse.Total is a long.
This is my code for bulk inserting into v7:

    // Wrap every document in an index operation that targets the new index.
    var dataPerIndex = data.Select(
        item => new BulkIndexOperation<T>(item)
        {
            Index = indexName
        });
    var allBulksRequest = new BulkRequest
    {
        Operations = new BulkOperationsCollection<IBulkOperation>(dataPerIndex),
        Refresh = Refresh.False
    };
    if (allBulksRequest.Operations.Any())
    {
        var bulkResponse = elasticClient.Bulk(allBulksRequest);
        if (bulkResponse.Errors || bulkResponse.ItemsWithErrors.Any())
        {
            throw new Exception($"BulkInsert for index: {indexName} failed with errors: {bulkResponse.DebugInformation}");
        }
    }
Note that the data is streamed from the source index and chunked into batches of 5000 documents, matching the size = 5000 I set when reindexing to v6.
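To make the chunking concrete, here is a minimal sketch of how a streamed sequence can be split into batches of 5000 before each bulk call. The class name `ChunkingSketch` and the method `InChunks` are hypothetical names made up for this illustration, not part of NEST, and this is not necessarily how my actual streaming code is written.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkingSketch
{
    // Lazily splits a streamed sequence into lists of at most chunkSize items.
    // Hypothetical helper: only one chunk is held in memory at a time.
    public static IEnumerable<List<T>> InChunks<T>(IEnumerable<T> source, int chunkSize)
    {
        // Because this is an iterator method, the check runs on first enumeration.
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var chunk = new List<T>(chunkSize);
        foreach (var item in source)
        {
            chunk.Add(item);
            if (chunk.Count == chunkSize)
            {
                yield return chunk;
                chunk = new List<T>(chunkSize);
            }
        }

        // Emit the last, possibly smaller, chunk.
        if (chunk.Count > 0) yield return chunk;
    }
}
```

Each list returned by `ChunkingSketch.InChunks(documents, 5000)` would then be passed as `data` to the bulk insert code shown above, so one bulk request is sent per chunk.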
This works fine but takes quite a while!
In my example, with 160 indices and roughly 2GB of data in total, the process took about 45 minutes!
I don't want to imagine what happens with something like 200GB of data.
Interestingly, increasing the chunk size from 5000 to 10000 (the maximum value without tweaking settings) did not improve performance at all.
Any ideas on that?
How do you solve this issue?