Pyes indexing problem


(paulp) #1

Hi,

I am currently evaluating ES as an alternative to Solr.
My test environment is a cluster of 4 m1.xlarge instances on EC2, with 8GB
(out of 16GB) allocated to the ES JVM.
The index is split into 5 shards with 1 replica.
To index JSON docs I'm using "Pyes" with "Bulking" enabled (chunks of
100 docs).

I've run into a problem after indexing about 1.5 million docs (all into the
same index): I get time-outs from ES and am unable to index any new documents.
The system load is normal, no swapping occurred, and I couldn't find
anything relevant in the ES logs.

Has anyone faced a similar problem?

--
Regards,
Paul



(es_learner) #2

What params did you use when you called ES() for a connection?

e.g.
from pyes import ES
connection = ES([index_server], bulk_size=BS, timeout=TS)

By default BS=400 and TS=5 (secs)

Try raising the timeout value. With timeout=None the connection will never time out.
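
If raising the timeout alone doesn't help, another option is to retry a chunk when it times out instead of aborting the whole run. Here is a rough sketch of that idea, not anything pyes ships: index_with_retry, chunked and send_chunk are names I made up for illustration. send_chunk stands for whatever call you use to push one batch through your pyes connection, and I'm assuming the timeout surfaces as socket.timeout, which may not match what your client actually raises.

```python
import socket
import time


def chunked(docs, size):
    """Yield successive lists of at most `size` docs."""
    chunk = []
    for doc in docs:
        chunk.append(doc)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def index_with_retry(docs, send_chunk, chunk_size=100, retries=3,
                     backoff=1.0, retry_on=(socket.timeout,),
                     sleep=time.sleep):
    """Send docs chunk by chunk, retrying a chunk that times out.

    send_chunk is a hypothetical callable that indexes one list of docs.
    Returns the number of docs sent; re-raises after `retries` failed retries.
    """
    sent = 0
    for chunk in chunked(docs, chunk_size):
        for attempt in range(retries + 1):
            try:
                send_chunk(chunk)
                break
            except retry_on:
                if attempt == retries:
                    raise  # give up on this chunk
                # wait longer after each failed attempt
                sleep(backoff * (2 ** attempt))
        sent += len(chunk)
    return sent
```

Note that retrying a chunk can index some docs twice if the first attempt partly went through, so this is only safe when each doc is sent with an explicit id.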

