Hi,
I have a corpus of 100k (1 lakh) products in JSON, which I am bulk indexing into one of my clusters with a bulk size of 10k. These are the index stats after the run:
GET product__v99/_stats
{
  "indexing": {
    "index_total": 100000,
    "index_time_in_millis": 46208,
    "index_current": 0,
    "index_failed": 0,
    "delete_total": 0,
    "delete_time_in_millis": 0,
    "delete_current": 0,
    "noop_update_total": 0,
    "is_throttled": false,
    "throttle_time_in_millis": 0
  },
  "bulk": {
    "total_operations": 10,
    "total_time_in_millis": 46489,
    "total_size_in_bytes": 224500000,
    "avg_time_in_millis": 3029,
    "avg_size_in_bytes": 14622169
  }
}
But when I check the CSV report from esrally, it says the cumulative indexing time of primary shards is 38.31 min:
Metric,Task,Value,Unit
Cumulative indexing time of primary shards,,38.31916666666667,min
Min cumulative indexing time across primary shards,,0,min
Median cumulative indexing time across primary shards,,0.00018333333333333334,min
Max cumulative indexing time across primary shards,,8.733716666666668,min
Cumulative indexing throttle time of primary shards,,0,min
Min cumulative indexing throttle time across primary shards,,0,min
Median cumulative indexing throttle time across primary shards,,0,min
Max cumulative indexing throttle time across primary shards,,0,min
I am really confused because there is a huge difference between the indexing time in the esrally report and the indexing time the index stats actually show.
There are only 2 threads in the write pool, and the index has 2 shards.
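To make the gap concrete, here is a rough Python sketch that puts both numbers in the same unit. The values are copied from the output above; the variable names are just mine for illustration.

```python
# Values copied from the post above; variable names are illustrative only.
index_time_ms = 46208                      # indexing.index_time_in_millis from GET product__v99/_stats
rally_cumulative_min = 38.31916666666667   # "Cumulative indexing time of primary shards"
rally_max_shard_min = 8.733716666666668    # "Max cumulative indexing time across primary shards"

MS_PER_MIN = 60_000

# Convert the _stats value from milliseconds to minutes so it is comparable.
index_time_min = index_time_ms / MS_PER_MIN

# How many times larger the esrally figure is than the _stats figure.
ratio = rally_cumulative_min / index_time_min

print(f"_stats indexing time: {index_time_min:.2f} min")
print(f"esrally cumulative:   {rally_cumulative_min:.2f} min")
print(f"esrally / _stats:     {ratio:.1f}x")
```

So the index stats show well under a minute, while the report shows roughly 50 times that. Even the maximum for a single primary shard in the report (8.73 min) is larger than the total for my whole index, which makes me suspect the report may be counting more than this one index or this one run, but I am not sure.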