@JoarSvensson - I just finished a couple of large-scale tests a few minutes ago:
- Increased the batch size for the bulk index request to 5k instead of the earlier 1k. This was a total failure: ES started throwing a bunch of exceptions, all of which contained 'NotSerializableExceptionWrapper[Failed to acknowledge mapping update within [30s]'
- Decreased the batch size to 4k. Similar story: ES did not throw exceptions, but the bulk write rate was about 20k, which is way lower than the 40k we get with a batch size of 1k.
Also, I tried disabling auto throttling on merges (index. and setting max_thread_count to 1, which did not help either.
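For reference, the batch-size experiments above boil down to changing one number in the code that chunks documents before each bulk request. This is only a minimal sketch of that chunking step, not our actual indexer: the `batched` helper and the sample documents are hypothetical, and the call that sends each batch to ES is left out.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def batched(docs: Iterable[Dict], batch_size: int) -> Iterator[List[Dict]]:
    """Yield successive lists of at most batch_size documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical documents standing in for one client's data.
sample_docs = ({"id": i} for i in range(10500))

# Each yielded batch would become the payload of one bulk index request.
batch_sizes = [len(batch) for batch in batched(sample_docs, 4000)]
```

With 10,500 documents and a batch size of 4k, this produces two full batches and one partial batch, so the batch size directly controls how many bulk requests are sent.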
It's worth mentioning that our process runs in two phases: the first phase is indexing heavy, and the second phase is query heavy. The way it works is:
- It first indexes all documents in ES for a client (phase 1).
- As soon as all docs are written for a client, it moves to phase 2, where the docs are updated according to our business rules and indexed again.
What I have seen with 2.2.0 is that while phase 1 is still running for some clients, phase 2 processes much faster than it did in 1.7.3. My hunch is that in 1.7.3, because we set indices.throttle.type = none, indexing had the highest priority, so phase 1 finished a lot quicker and phase 2 processing was automatically slowed down while phase 1 was running. In 2.2.0, phase 2 is much faster for some reason (most probably because indexing is no longer the top priority, and our queries are faster because of the ES 2.2.0 optimizations).
I would like to do something similar to 1.7.3, where we gave indexing the highest priority, but I am not aware of any settings that would do that in ES 2.2.0. I have already tried (index.merge.scheduler.auto_throttle = false) and (index.merge.scheduler.max_thread_count=1) with no luck. Any clues on how to proceed further?
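To make it clear exactly what I changed, here is a rough sketch of the settings body for the two merge scheduler settings mentioned above. The helper name is hypothetical, and I am assuming the body is sent to the index update-settings endpoint (`PUT /<index>/_settings`); whether both settings can be changed on a live index in 2.2.0 is something I have not confirmed.

```python
import json


def merge_scheduler_settings(auto_throttle: bool, max_thread_count: int) -> str:
    """Build a JSON settings body for the two merge scheduler settings."""
    body = {
        "index.merge.scheduler.auto_throttle": auto_throttle,
        "index.merge.scheduler.max_thread_count": max_thread_count,
    }
    return json.dumps(body, sort_keys=True)


# The values tried in the tests described above.
settings_json = merge_scheduler_settings(auto_throttle=False, max_thread_count=1)
print(settings_json)
```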
Thanks,
Madhav.