I'm trying to upload about 7 million documents to ES 6.3, and I've been running into an issue where the bulk upload slows to a crawl at about 1 million docs (the index was empty before this load).
I have a 3-node ES setup, each node with 16GB of RAM and an 8GB JVM heap, 1 index, 5 shards.
I have turned off refresh (refresh_interval set to "-1"), set replicas to 0, and increased the index buffer size to 30%.
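In case it helps, this is roughly how I'm applying those settings with the Python client (the index name and node addresses are placeholders; the index buffer size is set in elasticsearch.yml on each node, since it's not a per-index setting):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["node1:9200", "node2:9200", "node3:9200"])

# Applied before starting the bulk load; replicas and refresh get
# re-enabled once the load is done.
es.indices.put_settings(
    index="docs_v1",  # placeholder index name
    body={
        "index": {
            "refresh_interval": "-1",   # disable refresh during the load
            "number_of_replicas": 0
        }
    },
)

# In elasticsearch.yml on each node:
#   indices.memory.index_buffer_size: 30%
```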
On the upload side I have 22 threads, each sending bulk requests of 150 docs.
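The upload code looks roughly like this (simplified sketch; doc_source() stands in for however the documents are actually read):

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["node1:9200", "node2:9200", "node3:9200"])

def actions():
    # doc_source() is a placeholder generator yielding dicts of my nested JSON
    for doc in doc_source():
        yield {"_index": "docs_v1", "_type": "_doc", "_source": doc}

# 22 client threads, 150 docs per bulk request, matching what I described above
for ok, item in helpers.parallel_bulk(
    es, actions(), thread_count=22, chunk_size=150, raise_on_error=False
):
    if not ok:
        print(item)  # log failed items rather than dropping them silently
```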
On all of my nodes and upload machines, CPU, memory, and SSD disk I/O are low.
I've been able to get about 30k-40k inserts per minute, but that seems really slow to me since others have reported 2k-3k per second. My documents do have nested JSON, but they don't seem very large to me (is there a way to check the size of a single doc, or the average?).
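The closest thing I've found for estimating doc size is dividing the store size from the index stats by the doc count, though I'm not sure that's the right measure since it's the on-disk size after compression:

```python
# Rough average doc size from the index stats API (index name is a placeholder)
stats = es.indices.stats(index="docs_v1", metric="docs,store")
primaries = stats["_all"]["primaries"]
avg_bytes = primaries["store"]["size_in_bytes"] / max(primaries["docs"]["count"], 1)
print("avg doc size on disk: %.1f KB" % (avg_bytes / 1024))
```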
I would like to be able to bulk upload these documents in less than 12-24 hours, and it seems like ES should handle that, but once I get to 1 million docs it slows to a crawl.
I'm pretty new to ES, so any help would be appreciated. I know this seems like a question that has already been asked, but I've tried just about everything I could find and I'm still wondering why my upload speed is several times slower.
I've also checked the logs and only saw some errors about a mapping field that couldn't be changed, but nothing about running out of memory or anything like that.
ES 6.3 is great, but I'm also finding that the API has changed quite a bit in 6.x, and settings that people were using in older versions are no longer supported.