I'm evaluating ElasticSearch as a replacement for Solr, as my data is
getting too large for Solr to cope with comfortably.
I'm importing ~19m documents, each of which is ~512 bytes. I'm using a
two-node ElasticSearch 0.16.1 cluster on EC2 m1.large instances.
They're running on Ubuntu Maverick servers, with Sun Java 1.6.0_21-b06.
I wrote a bulk import script which uses pyes over HTTP. I'm using
the default pyes chunk size of 400 documents.
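For context, the import boils down to something like the sketch below. It isn't my actual script and it doesn't use pyes; it only shows, with the standard library, how documents get grouped into chunks of 400 and serialized into the newline-delimited body that /_bulk expects. The index name is the one from my logs; the type name "doc", the helper names, and the stand-in documents are made up for illustration.

```python
import json
from itertools import islice


def chunks(docs, size=400):
    """Yield lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_body(chunk, index="canonical", doc_type="doc"):
    """Serialize one chunk as a /_bulk request body.

    Each document becomes two lines: an action line, then the source.
    The body has to end with a newline. `doc_type` is a placeholder.
    """
    lines = []
    for doc in chunk:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = ({"n": i} for i in range(1000))  # stand-in documents
    for chunk in chunks(docs):
        body = bulk_body(chunk)
        print(len(chunk), "docs,", len(body), "bytes")
```

Each body is what gets POSTed to /_bulk in a single request, so the chunk size directly controls how much work ES has to finish inside the client's timeout.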
After indexing around 3m documents, connections to ES time out:
  File "/home/ieure/bulkload/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/es.py", line 634, in index
    self.flush_bulk()
  File "/home/ieure/bulkload/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/es.py", line 660, in flush_bulk
    self.force_bulk()
  File "/home/ieure/bulkload/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/es.py", line 668, in force_bulk
    self._send_request("POST", "/_bulk", self.bulk_data.getvalue())
  File "/home/ieure/bulkload/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/es.py", line 205, in _send_request
    response = self.connection.execute(request)
  File "/home/ieure/bulkload/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/connection_http.py", line 167, in _client_call
    return getattr(conn.client, attr)(*args, **kwargs)
  File "/home/ieure/bulkload/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/connection_http.py", line 59, in execute
    response = self.client.urlopen(Method._VALUES_TO_NAMES[request.method], uri, body=request.body, headers=request.headers)
  File "/home/ieure/bulkload/lib/python2.6/site-packages/pyes-0.15.0-py2.6.egg/pyes/urllib3/connectionpool.py", line 286, in urlopen
    raise TimeoutError("Request timed out after %f seconds" % self.timeout)
pyes.urllib3.connectionpool.TimeoutError: Request timed out after 5.000000 seconds
There was nothing in the logs at all, so I turned on whatever debug
settings I could find and tried again. Got the same result, and I
didn't see much in the logs. The only thing that looked suspicious was
this:
[2011-05-26 23:46:53,695][DEBUG][index.merge.scheduler ] [Marko, Cain] [canonical][0] merge [_5yf] done, took [27.3s]
However, this appears many times (I count 177, though that covers more
than just the import which died) before this last occurrence, and none
of those seemed to have any effect.
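To get that count, and to check whether the merges are getting slower as the index grows, something like the following works. It's a rough sketch that assumes every completed-merge line has the same shape as the one above ("merge [...] done, took [27.3s]"). The function name is mine, and I've assumed durations reported in seconds, so lines that report other units are skipped.

```python
import re

# Matches e.g. "merge [_5yf] done, took [27.3s]".
# Assumption: only durations given in seconds are picked up.
MERGE_RE = re.compile(r"merge \[(\w+)\] done, took \[([\d.]+)s\]")


def merge_times(lines):
    """Return a (segment, seconds) pair for each completed-merge line."""
    found = []
    for line in lines:
        m = MERGE_RE.search(line)
        if m:
            found.append((m.group(1), float(m.group(2))))
    return found


if __name__ == "__main__":
    sample = [
        "[2011-05-26 23:46:53,695][DEBUG][index.merge.scheduler ] "
        "[Marko, Cain] [canonical][0] merge [_5yf] done, took [27.3s]",
    ]
    times = merge_times(sample)
    print(len(times), "merges, slowest", max(t for _, t in times), "s")
```

Fed the real log file instead of the sample list, this gives the number of merges and makes it easy to see whether a 27-second merge is unusual or typical.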
I'm importing on the same box as I'm running ElasticSearch, so this is
just over localhost; I doubt this is a network issue.
I don't know what's going on, and I'd like to get this working. I
asked on IRC, but nobody responded. How do I go about diagnosing and
solving this problem?