Regarding Elasticsearch insert operations


I am using Elasticsearch to store a large amount of data, but I have found that inserts become very slow after a certain point. I am using one node with a single-shard configuration, and the index size is 300 GB. Most of the settings are defaults. The machine I am working on has 23 GB of RAM, of which I have given about 17 GB to Elasticsearch, and an 8-core processor.

It was working fine, but after about 198 GB of data it started taking a long time to insert. The bulk API batch size is around 100 documents, and a bulk request now takes 28 seconds to persist. Is there anything I need to change in the configuration?
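For reference, the bulk indexing described above can be sketched with the Python `elasticsearch-py` client; the index name, document shape, and chunk size below are illustrative assumptions, not the actual values from this setup:

```python
# Sketch of bulk indexing with a configurable batch size, assuming the
# official elasticsearch-py client. The index name "myindex" and the
# document shape are made up for illustration.

def make_actions(docs, index="myindex"):
    """Build the action list expected by helpers.bulk()."""
    return [{"_index": index, "_source": doc} for doc in docs]

docs = [{"id": i, "value": "doc-%d" % i} for i in range(1000)]
actions = make_actions(docs)

# Against a live cluster, the actions would then be sent in batches, e.g.:
#   from elasticsearch import Elasticsearch, helpers
#   es = Elasticsearch("http://localhost:9200")
#   helpers.bulk(es, actions, chunk_size=500)  # larger batches than 100
```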

I checked, and there are 66 segments in the index right now.

Please let me know if you need any other information from my side. I want to find out why insertion is taking so long.


One node with one shard will create contention with a 200-300 GB index. This
is not how Elasticsearch scales; you have stretched some of the system's
resources, most probably I/O and heap (GC). Note that a 17 GB heap can
trigger GC pauses of dozens of seconds or even minutes, because it is very
large. There are several options you can choose from. Probably my blog can
help.
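One commonly suggested option during heavy bulk loading, sketched here under the assumption of an otherwise default index configuration, is to pause the index refresh while loading and restore it afterwards (endpoint and index name are illustrative):

```python
# Sketch: settings bodies for pausing and restoring index refresh during a
# bulk load. The index name "myindex" is an illustrative assumption.
disable_refresh = {"index": {"refresh_interval": "-1"}}   # pause refresh
restore_refresh = {"index": {"refresh_interval": "1s"}}   # default interval

# Against a live cluster these would be applied around the bulk load, e.g.:
#   es.indices.put_settings(index="myindex", body=disable_refresh)
#   ... run the bulk inserts ...
#   es.indices.put_settings(index="myindex", body=restore_refresh)
```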

My advice is to scale out by adding more shards and more nodes, not to
scale up.
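As a sketch of that scale-out advice, the index would be created with more primary shards up front, since the shard count cannot be changed after creation; the shard and replica counts below are illustrative assumptions, not a recommendation for this exact setup:

```python
# Sketch: settings body for creating an index with several primary shards
# so the data can spread across more nodes. Counts are illustrative.
index_settings = {
    "settings": {
        "number_of_shards": 5,     # split the data across 5 primaries
        "number_of_replicas": 1,   # one replica copy of each shard
    }
}

# Against a live cluster:
#   es.indices.create(index="myindex", body=index_settings)
```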