We're trying to load 400k documents using the Java bulk API (Client
and BulkRequestBuilder) and are running into timeout issues that we
think could be remedied by disabling indexing during the publish, then
re-enabling it as soon as the publish completes. Is there a way to use the
Client to execute the below PUT call, or something similar?
Setting the refresh interval to -1 is the only setting I change before bulk
indexing into a live index. For new indices, I remove all replicas first and
add them back after indexing is done (waiting for a green state before
actually using the index). Indexing should be completely under your control
unless you are using a river. Can you implement a singleton that controls
indexing?
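To make the idea concrete, here is a rough sketch of what such a gate could look like. It is not tied to the Elasticsearch classes: SettingsUpdater is a hypothetical interface made up for this example, and BulkLoadGate is just an illustrative name. In a real setup the updater would presumably delegate to the update-settings request reachable through client.admin().indices(), but check the exact builder for your version. The sketch only shows the ordering: set index.refresh_interval to -1, run the bulk load, and restore the previous value even if the load fails.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Sketch: serializes bulk loads and toggles the refresh interval around them. */
class BulkLoadGate {

    /** Hypothetical abstraction over whatever applies an index setting (not an Elasticsearch type). */
    interface SettingsUpdater {
        void update(String index, String key, String value);
    }

    // One JVM-wide lock so that only one bulk load runs at a time.
    private static final ReentrantLock LOCK = new ReentrantLock();

    private final SettingsUpdater updater;
    private final String index;
    private final String normalRefreshInterval;

    BulkLoadGate(SettingsUpdater updater, String index, String normalRefreshInterval) {
        this.updater = updater;
        this.index = index;
        this.normalRefreshInterval = normalRefreshInterval;
    }

    /** Disables refresh, runs the load, then restores the interval even on failure. */
    void runBulkLoad(Runnable bulkLoad) {
        LOCK.lock();
        try {
            updater.update(index, "index.refresh_interval", "-1");
            try {
                bulkLoad.run();
            } finally {
                // Restore the normal interval whether or not the load succeeded.
                updater.update(index, "index.refresh_interval", normalRefreshInterval);
            }
        } finally {
            LOCK.unlock();
        }
    }
}
```

The Runnable passed to runBulkLoad would be where the BulkRequestBuilder calls go. The same wrapper shape could be used for the replica count on a new index, with a wait for green state before the index is put into use.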
Thanks for the response. That sounds like a pretty viable solution, and
I'll give it a try shortly.
Thanks,
Russell
On Friday, October 5, 2012 11:32:19 AM UTC-4, Russell Snyder wrote:
Hi,
We're trying to load 400k documents using the Java bulk API (Client
and BulkRequestBuilder) and are running into some timeout issues that we
think could be remedied by disabling indexing during the publish, then
re-enabling immediately upon publish completion. Is there a way to use the
Client to execute the below PUT call, or something similar?