Timeout during bulk indexing when doc size increased

Hi,

I have the following setup:

Two ES nodes on EC2 (Fedora) with 5 shards and 1 replica each, the
default elasticsearch.conf, and a separate EC2 instance (Ubuntu 10.10)
that indexes documents into ES via pyes 0.16.

I'm indexing documents in batches of fewer than 5 thousand docs.

If each document is less than 2 KB, everything runs smoothly.
If I populate a string field with a paragraph of text, bringing the
size of a doc to ~4 KB, the insertion process times out.

The timeout I set in pyes when I create the connection object is 2
minutes. My input files are getting larger, and I will need to be able
to push more data per document into ES. I'm wondering what is causing
the timeout and what I need to change in the config in order to push
data into ES reliably.

If you can share some rules of thumb, I'd greatly appreciate it.
Thank you

First, depending on your instance type, I would suggest overriding the default memory allocation and explicitly setting the memory available to the ES (Java) process.

I don't know what is timing out or why. Did you see any failures in the Elasticsearch logs? If you increase the timeout value, does it work?
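
One more thing that may be worth trying: because the batches are capped by document count, doubling the document size roughly doubles each bulk request (up to about 5,000 x 2 KB = 10 MB before, versus 5,000 x 4 KB = 20 MB now). Below is a minimal sketch of capping batches by serialized size instead. The function name batches_by_bytes and the 5 MB default are assumptions for illustration, not pyes settings or recommended values, and the size is estimated from the JSON encoding of each document rather than from what the client actually sends.

```python
import json


def batches_by_bytes(docs, max_bytes=5 * 1024 * 1024, max_docs=5000):
    """Yield lists of docs, capped by approximate JSON size and by count.

    max_bytes and max_docs are illustrative defaults (assumptions),
    not values taken from pyes or Elasticsearch.
    """
    batch, batch_bytes = [], 0
    for doc in docs:
        # Approximate the payload size by the document's JSON encoding.
        size = len(json.dumps(doc).encode("utf-8"))
        # Close the current batch if adding this doc would exceed a limit.
        if batch and (batch_bytes + size > max_bytes or len(batch) >= max_docs):
            yield batch
            batch, batch_bytes = [], 0
        # A single doc larger than max_bytes still goes out, alone in its batch.
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded list can then be sent as one bulk request with whatever client call you already use. If smaller requests stop timing out, that would suggest the request size, not the per-document content, is what pushes you past the 2 minute limit.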

On Thursday, February 16, 2012 at 8:33 PM, Dragan wrote:
