I am trying to index 80M tweets with a bulk loader that I wrote using
the Java API on 0.16.4.
When I get to about 2M tweets, indexing performance drops from 4K
indexed tweets per second to under 300 tweets per second. CPU load
goes from 30% to 3%. Heap memory bounces between 300MB and 600MB and
then spikes to 900MB.
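A minimal sketch of how the per-window indexing rate could be tracked, to pin down when the drop from about 4K to under 300 tweets per second begins. The class and method names (`ThroughputMeter`, `record`, `rollWindow`) are hypothetical and not part of the bulk loader or of any library; the caller is assumed to pass in timestamps from `System.nanoTime()`.

```java
// Hypothetical helper: counts indexed documents and reports docs/second
// for each measurement window. Not part of the Elasticsearch API.
final class ThroughputMeter {
    private long windowStartNanos;
    private long docsInWindow;

    ThroughputMeter(long startNanos) {
        this.windowStartNanos = startNanos;
    }

    // Call after each bulk request completes, with the number of docs it indexed.
    void record(int docs) {
        docsInWindow += docs;
    }

    // Returns docs/second since the previous call, then starts a new window.
    double rollWindow(long nowNanos) {
        long elapsed = nowNanos - windowStartNanos;
        double rate = elapsed <= 0 ? 0.0 : docsInWindow * 1e9 / elapsed;
        windowStartNanos = nowNanos;
        docsInWindow = 0;
        return rate;
    }
}
```

Logging the value returned by `rollWindow(System.nanoTime())` every few seconds, next to the current heap usage, would show whether the slowdown lines up with the heap spike.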
Here are my index settings:
    index.engine.robin.refresh_interval, 10
    indices.memory.index_buffer_size, 0.50
    index.number_of_shards, 4
    index.number_of_replicas, 0
    index.merge.policy.merge_factor, 30
    index.merge.policy.use_compound_file, false
    index.refresh_interval, "-1"
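For reference, the same settings can be written as a plain Java map, which makes it easier to see each key and value separately. This is only a sketch: `BulkIndexSettings` is a hypothetical class name, all values are kept as strings exactly as listed above, and no Elasticsearch classes are used, so how the map is handed to the client is left open.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the settings listed above; not an Elasticsearch class.
final class BulkIndexSettings {
    // Builds the settings in the order they were listed, values as strings.
    static Map<String, String> build() {
        Map<String, String> s = new LinkedHashMap<String, String>();
        s.put("index.engine.robin.refresh_interval", "10");
        s.put("indices.memory.index_buffer_size", "0.50");
        s.put("index.number_of_shards", "4");
        s.put("index.number_of_replicas", "0");
        s.put("index.merge.policy.merge_factor", "30");
        s.put("index.merge.policy.use_compound_file", "false");
        s.put("index.refresh_interval", "-1");
        return s;
    }

    public static void main(String[] args) {
        // Print one "key = value" line per setting.
        for (Map.Entry<String, String> e : build().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Written this way, it is visible that the list sets two refresh-related keys, `index.engine.robin.refresh_interval` and `index.refresh_interval`, with different values.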
I am doing this on an index I just created on a local node.
I am running on a new quad-core i7 MacBook Pro.