I am running unit tests with a local cluster. They are generally very reliable, but every once in a while one of them fails because a test (A) indexes something, (B) then calls refresh on the client, and (C) then tries to look up the recently indexed item, and it is not there. The same test works 95%+ of the time. My guess is that the refresh is not always fast enough (I read it is "near" real-time). Is there anything I can do to make these tests more reliable? I could add a little sleep time after the refresh, but that seems like an unreliable kludge (not to mention it would slow down my tests a lot).
HOW I REFRESH:
elasticSearchClient.admin().indices().refresh(Requests.refreshRequest()).actionGet();
HOW I SET UP THE LOCAL TEST CLUSTER:
hostNode = NodeBuilder.nodeBuilder()
    .loadConfigSettings(false)
    .clusterName("test.cluster")
    .local(true)
    .settings(
        ImmutableSettings.settingsBuilder()
            .put("index.number_of_shards", 1)
            .put("index.number_of_replicas", 1)
            .build()
The refresh API, once it has returned successfully, guarantees that anything
that happened before it was executed will be searchable. The "near real time"
aspect comes from the fact that refresh is not called on every operation, just
periodically (internally).
Is there a chance of a concurrency glitch here (are A, B, and C executed
serially)? Or maybe the indices have not recovered yet?
A, B, and C are called serially in one single thread.
As for recovery, I don't think that is the issue, as I am not trying to search data that was indexed before program startup; I am only searching docs that were indexed with that same client instance. I'm not relying on any persistent data from before the test ran, if that is what you mean by index recovery.
I was not checking for successful invocation of the refresh, so I tried the below method for refreshing. However, I did still see one intermittent failure with the below refresh.
private void refresh() {
    int maxTries = 3;
    for (int i = 0; i < maxTries; i++) {
        // Refresh all indices and block until the broadcast response arrives.
        BroadcastOperationResponse response =
            elasticSearchClient.admin().indices().refresh(Requests.refreshRequest()).actionGet();
        if (response.failedShards() > 0) {
            // At least one shard reported a failure: wait briefly, then retry.
            System.out.println("did not refresh index search client successfully; retry");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // Restore the interrupt flag instead of swallowing it.
                Thread.currentThread().interrupt();
            }
        } else {
            return;
        }
    }
    Assert.fail("unable to refresh index search client.");
}
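If a wait does turn out to be necessary, one alternative to a fixed sleep is to poll for the expected condition with an upper bound on the total wait. The following is only a sketch and is not part of the Elasticsearch API: the class name PollUntil and the method await are made up for illustration, and it assumes Java 8 or later for BooleanSupplier. It does not explain why the document is occasionally missing; it only makes the wait bounded and usually short.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical test helper (not an Elasticsearch class): polls a condition
// until it holds or a timeout expires, instead of sleeping a fixed time.
public final class PollUntil {
    private PollUntil() {}

    // Returns true as soon as the condition holds, false if the timeout
    // elapses first or the calling thread is interrupted while waiting.
    public static boolean await(BooleanSupplier condition,
                                long timeoutMillis,
                                long intervalMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

In a test, step (C) could then be wrapped along the lines of Assert.assertTrue(PollUntil.await(() -> lookup(id) != null, 2000, 50)), where lookup(id) stands in for whatever search or get call the test already performs. When the document is visible immediately, which is the common case, the helper returns on the first check without sleeping.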