I have the following setup:
Two ES nodes on EC2 (Fedora) with 5 shards and 1 replica each, the default
elasticsearch.conf, and a separate EC2 instance (Ubuntu 10.10) that
indexes documents into ES via pyes 0.16.
I'm indexing documents in batches of fewer than 5 thousand docs.
If each document is less than 2 KB, everything runs smoothly.
If I populate a string field that has a paragraph of text in it,
bringing the size of a doc to ~4 KB, the insertion process times out.
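To put rough numbers on the two cases, here is a back-of-the-envelope estimate of the payload per batch. It is only a sketch: it assumes the upper bound of 5,000 docs per batch, takes 1 MB = 1024 KB, ignores any per-document overhead in the request, and the function name is made up for illustration.

```python
def batch_payload_mb(docs_per_batch, doc_size_kb):
    """Approximate raw size of one batch in MB (1 MB = 1024 KB).

    Ignores request overhead such as per-document action metadata.
    """
    return docs_per_batch * doc_size_kb / 1024.0


# Upper bounds taken from the numbers above (assumed worst case).
small = batch_payload_mb(5000, 2)  # docs of about 2 KB
large = batch_payload_mb(5000, 4)  # docs of about 4 KB

print("2 KB docs: ~%.1f MB per batch" % small)
print("4 KB docs: ~%.1f MB per batch" % large)
```

So the batches that time out carry roughly twice the data of the ones that go through, somewhere around 20 MB each at most.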
The timeout I set in pyes when I create the connection object is 2
minutes. My input files are getting larger, and I will need to be able
to push more data per document into ES. I'm wondering what is causing
the timeout and what I need to change in the config in order to push
data into ES reliably.
If you can share some rules of thumb, I'd greatly appreciate that.