Hello everyone,
I have benchmarked the performance of our Elasticsearch server and I would like to show you one interesting finding.
As you can see, all the different settings, i.e. changes in bulk size (5 vs. 11 MB) and thread count (1, 2, 5 and 10 threads), share one result: the indexing rate decreases between 1 million and 2 million documents. 1 million documents amount to about 850-950 MB. Furthermore, the blue and red lines are the single-thread results, and surprisingly they perform best. I should add that I did not change any Elasticsearch thread pool settings; I just send my HTTP requests in parallel. My settings are:
curl -XPOST localhost:9200/bench -d '{
  "transient" : { "indices.store.throttle.type" : "none" },
  "mappings" : {
    "bench" : {
      "_source" : { "enabled" : false },
      "_all" : { "enabled" : false },
      "properties" : {
        "level" : { "type" : "string", "index" : "not_analyzed" },
        "time" : { "type" : "string", "index" : "not_analyzed" },
        "timel" : { "type" : "string", "index" : "not_analyzed" },
        "id" : { "type" : "string", "index" : "not_analyzed" },
        "cat" : { "type" : "string", "index" : "not_analyzed" },
        "comp" : { "type" : "string", "index" : "not_analyzed" },
        "host_org" : { "type" : "string", "index" : "not_analyzed" },
        "req" : { "type" : "string", "index" : "not_analyzed" },
        "app" : { "type" : "string", "index" : "not_analyzed" },
        "usr" : { "type" : "string", "index" : "not_analyzed" },
        "vin" : { "type" : "string", "index" : "not_analyzed" },
        "thread" : { "type" : "string", "index" : "not_analyzed" },
        "origin" : { "type" : "string", "index" : "not_analyzed" },
        "msg_text" : { "type" : "string", "index" : "not_analyzed" },
        "clat" : { "type" : "string", "index" : "not_analyzed" },
        "clon" : { "type" : "string", "index" : "not_analyzed" },
        "location" : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}'

curl -XPUT 'localhost:9200/bench/_settings' -d '{ "index" : { "number_of_replicas" : 0 } }'

curl -XPUT localhost:9200/bench/_settings -d '{ "index" : { "refresh_interval" : "-1" } }'

curl -XPOST 'http://localhost:9200/bench/_forcemerge?max_num_segments=5'
In addition, the heap size was changed to 31 GB.
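To make the setup easier to reproduce, here is a minimal sketch of what a client for this kind of benchmark could look like. It is not the actual client used for the measurements. It assumes Python, the bulk endpoint `/bench/bench/_bulk` on localhost, and documents that arrive as dictionaries; the names `BULK_URL`, `bulk_bodies`, `post_bulk` and `run` are made up for illustration. It groups documents into bulk bodies up to a byte limit and sends them from a fixed number of threads.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor
from urllib import request

# Assumed endpoint: index "bench", type "bench", as in the mapping above.
BULK_URL = "http://localhost:9200/bench/bench/_bulk"


def bulk_bodies(docs, max_bytes):
    """Yield (doc_count, body) pairs; each body stays within max_bytes
    unless a single document is larger than the limit on its own."""
    lines, size, count = [], 0, 0
    for doc in docs:
        # One action line plus one source line per document.
        entry = '{"index":{}}\n' + json.dumps(doc) + "\n"
        entry_size = len(entry.encode("utf-8"))
        if lines and size + entry_size > max_bytes:
            yield count, "".join(lines).encode("utf-8")
            lines, size, count = [], 0, 0
        lines.append(entry)
        size += entry_size
        count += 1
    if lines:
        yield count, "".join(lines).encode("utf-8")


def post_bulk(body):
    """Send one bulk body over HTTP and wait for the response."""
    req = request.Request(
        BULK_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-ndjson"},
    )
    with request.urlopen(req) as resp:
        resp.read()


def run(docs, max_bytes, threads, send=post_bulk):
    """Send all bulk bodies from `threads` workers and return docs/second."""
    # Sketch only: all bodies are built in memory before timing starts.
    batches = list(bulk_bodies(docs, max_bytes))
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        list(pool.map(lambda batch: send(batch[1]), batches))
    elapsed = time.perf_counter() - start
    total = sum(n for n, _ in batches)
    return total / elapsed if elapsed > 0 else float("inf")
```

A call such as `run(docs, 5 * 1024 * 1024, 1)` would correspond to the 5 MB, single-thread case. Passing a different `send` function allows a dry run without a running server.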
Do you have any explanation for this consistent decrease between 1 and 2 million documents?
Thank you very much in advance.
Best regards