I've installed ES 0.90.5 on a RHEL 6 server using the RPM. Using the
default config, I started indexing my data. After about 200 documents or so,
I start getting 503s every 5 documents. The documents aren't large (e.g. https://gist.github.com/markwoon/7206263). The error goes away after a couple of
seconds, but I consistently get another 503 after 5 more documents.
I'm new to ES, so any pointers would be helpful on how I can improve
indexing speed.
I was able to increase the indexing speed significantly by doing the
following (note that this is not a good fit for real-time indexing/searching
requirements):
Before the indexing:
number_of_shards: 3 (something more than 1; it depends on your needs and
resources)
number_of_replicas: 0
refresh_interval: -1
merge.policy.max_merged_segment: 1gb
Do the indexing.
After the indexing:
set number_of_replicas: 1 (or whatever is good for your system)
refresh manually with the API
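For reference, the steps above could be sketched roughly as follows in Python, using only the standard library. The host `http://localhost:9200`, the index name `medline`, and all function names are assumptions made for illustration; none of them come from this thread. The setting names are the ones listed above, nested under `index` as the index settings API expects. Treat this as a sketch rather than a tested recipe for 0.90.5.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust to your own setup.
ES_URL = "http://localhost:9200"
INDEX = "medline"


def create_index_request(shards=3):
    """Describe the request that creates the index with bulk-load settings."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": 0,      # no replicas while loading
                "refresh_interval": "-1",     # disable automatic refresh
                "merge.policy.max_merged_segment": "1gb",
            }
        }
    }
    return ("PUT", "/" + INDEX, body)


def restore_replicas_request(replicas=1):
    """Describe the request that turns replicas back on after loading."""
    body = {"index": {"number_of_replicas": replicas}}
    return ("PUT", "/" + INDEX + "/_settings", body)


def refresh_request():
    """Describe the request that refreshes the index manually."""
    return ("POST", "/" + INDEX + "/_refresh", None)


def send(request, base_url=ES_URL):
    """Send one (method, path, body) tuple and return (status, response text)."""
    method, path, body = request
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(base_url + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read().decode("utf-8")
```

The intended order would be `send(create_index_request())`, then your own indexing loop, then `send(restore_replicas_request())` and `send(refresh_request())`. The request builders are kept separate from `send` so the payloads can be inspected without a running cluster.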
On Monday, October 28, 2013 4:04:51 PM UTC-7, Mark Woon wrote:
I've installed ES 0.90.5 on a RHEL 6 server using the RPM. Using the
default config, I start indexing my data. After about 200 documents or so,
I'm getting 503's every 5 documents. The documents aren't large (e.g. https://gist.github.com/markwoon/7206263). It goes away after a couple
seconds, but I consistently get a 503 after 5 more documents.
I'm new to ES, so any pointers would be helpful on how I can improve
indexing speed.
Solved my problem, and it had nothing to do with ES.
There was a proxy in front of the ES server that was rate limiting it...
-Mark