We recently ran into an issue in the following scenario:
On a standard 3-node cluster, we create a typical index with 4-5 mappings (each mapping with ~10-15 fields and some custom analyzers). Immediately after the index creation call in the Java client, we use the TransportClient to index at least one record each. However, we sometimes get an IndexMissingException when we insert a document into the newly created index.
I think the thread below is related, but since it is dated, I wanted to know whether there are other approaches.
As per the original question, I believe "index.refresh_interval" dictates how long it takes for indexed data to become available for search. The default value is 1 second.
You are correct, there is a bit of a lag. The Create Index API is asynchronous: it returns as soon as the master has acknowledged the request, which is not necessarily after the shards have been created on the individual nodes.
To prevent this kind of problem, you can wait for the status to return to yellow or green. During the brief moment that the shards are being created, your cluster will be red, so waiting works as an effective block.
You can do this via the cluster health API: GET localhost:9200/_cluster/health?wait_for_status=yellow
This is basically how our integration tests circumvent this very issue.
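For anyone who wants to do the same from Java, here is a minimal sketch of that wait using the plain JDK HTTP client (Java 11+) rather than the TransportClient. The class name, method names, base URL and the string-based check of the response are all my own assumptions for illustration. The only things taken from the API itself are the wait_for_status and timeout query parameters and the timed_out field in the response body.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of any Elasticsearch client library.
class WaitForIndex {

    // Builds the cluster health URL with wait_for_status and a server-side timeout.
    static String healthUrl(String baseUrl, String status, int timeoutSeconds) {
        return baseUrl + "/_cluster/health"
                + "?wait_for_status=" + status
                + "&timeout=" + timeoutSeconds + "s";
    }

    // Crude check (assumption: no JSON library available): the wait succeeded
    // only if the call returned 200 and the body reports "timed_out": false.
    static boolean reachedStatus(int httpStatus, String body) {
        String compact = body.replace(" ", "");
        return httpStatus == 200 && compact.contains("\"timed_out\":false");
    }

    // Blocks until the cluster reports at least yellow, or the timeout expires.
    // Returns true if the requested status was reached in time.
    static boolean waitForYellow(String baseUrl, int timeoutSeconds)
            throws IOException, InterruptedException {
        HttpClient http = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest
                .newBuilder(URI.create(healthUrl(baseUrl, "yellow", timeoutSeconds)))
                // Give the HTTP call slightly longer than the server-side wait.
                .timeout(Duration.ofSeconds(timeoutSeconds + 5))
                .GET()
                .build();
        HttpResponse<String> response =
                http.send(request, HttpResponse.BodyHandlers.ofString());
        return reachedStatus(response.statusCode(), response.body());
    }
}
```

You would call something like WaitForIndex.waitForYellow("http://localhost:9200", 10) right after the create index call and only start inserting documents once it returns true. If it returns false, the wait timed out and the index is probably not ready yet.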