I apologize in advance if this is an already answered question. I
couldn't find a reference to it, but you can never tell.
I am testing Elasticsearch from a Grails application. It's just a
simple way to test my indexing and searching using the Java API.
Occasionally I also use the GXContentBuilder. I am trying to keep a
steady state in my tests, meaning I blow out any existing indices and
reindex before running my tests. Not before each individual test,
mind you, but before running the 'search integration tests'.
My problem is this: I can't find a reliable way to determine when the
indexes are ready. At first I thought a cluster health
'waitForYellow' check might work, but it doesn't: the search that
follows shows no results, and an IndicesStatusRequest doesn't show
any documents in the index either. I also tried a waitForGreen, which
seemed to work until I realized that it was just timing out. Since
it's only one node, it can't really 'go green'. So what I ultimately
determined was that doing a Thread.currentThread().sleep(1000) would
do the trick. But it's the cardinal sin of integration testing. I
have an awesome workstation with an SSD and a new 'Sandy Bridge'
processor, so what seems to work on my machine will inevitably
break down when it's run on a CI server that's also running other
builds, etc. But only every once in a while.
So, what I would really like is some type of blocking call I can make
that will tell me when the index is ready. Or at a minimum, some type
of argument to my IndexRequest that will make it block until the
document is ready to be searched. Of course, that isn't something I
would do in production, as having the call return a future is exactly
what I want on a save/update operation. But it would help a lot in
making some reliable integration tests.
I am, but I'm not using it for indexing, and I may remove it
altogether soon. I need more control than I can get from the plugin
over how my indices are created. The application is multi-tenant and
I am creating multiple indices depending upon some factors that can
only be determined at run-time. I've forked the plugin, and even made
some pull requests that have been accepted, and the plugin could
support pluggable strategies for index names, rather than the simple
package name one that exists now. However, it couldn't support it
without a fairly major rewrite of how mappings are applied. Given my
time constraints, it didn't make sense, especially without knowing if
that kind of change would be accepted. I'm also a bit concerned with
some other areas of the plugin, such as the synchronous blocks in the
queue used to write out changes. All of this could be addressed, and
the plugin makes it clear that it's not ready for production yet, but
my timetables are just too aggressive, and I think the plain
Elasticsearch API is actually quite good. Even if I were using the
plugin, I would still likely be looking for a way to bypass the
integration testing phase of Grails. It takes too long, so it's not
very useful for developers writing tests. I can run just my search
test in 6 or 7 seconds; the Grails integration phase takes that long
just to resolve Ivy dependencies. (Something they're addressing in
1.4, I believe.)
You can issue the refresh API call. When it returns, everything indexed
before the refresh call is visible to search.
It is also possible to pass "refresh=true" as a query parameter
when indexing a document via a REST call. I'm not sure whether this is
possible using a Java API call.
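For reference, here is a rough sketch of what those two REST calls look like when built from plain Java (JDK 11+ `java.net.http`), without the Elasticsearch client on the classpath. The class and method names, the base URL, and the index and type names are all made up for illustration. The path segment between index and id depends on your Elasticsearch version (a mapping type name in older releases, `_doc` in recent ones), so it is left as a parameter.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class RefreshRequests {

    // POST {baseUrl}/{index}/_refresh
    // Once the response comes back, documents indexed before the call are searchable.
    static HttpRequest refreshIndex(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_refresh"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // PUT {baseUrl}/{index}/{type}/{id}?refresh=true
    // Indexes one document and asks for a refresh as part of the same request.
    // Handy in tests; not something to do on every write in production.
    static HttpRequest indexWithRefresh(String baseUrl, String index, String type,
                                        String id, String json) {
        URI uri = URI.create(baseUrl + "/" + index + "/" + type + "/" + id + "?refresh=true");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

Either request can be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, which blocks until the node responds. That gives a test the blocking behaviour without a sleep.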
First, indexing a doc is complete once the index API execution returns. When it becomes visible for search is another question.
By default, an ongoing async refresh makes changes visible for search; the refresh interval defaults to 1 second. You can force a refresh by calling the refresh API. You can also force a refresh by setting the refresh flag to true on the index request (but don't use that in production!).
-shay.banon
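To make the distinction between "indexed" and "visible for search" concrete, here is a deliberately tiny toy model in Java. It is not how Elasticsearch is implemented and uses none of its classes; every name here is invented. It only mimics the behaviour described above: a write is accepted immediately, but a search sees it only after a refresh.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyNearRealTimeIndex {
    // Documents that have been accepted but are not yet searchable.
    private final List<String> pending = new ArrayList<>();
    // Documents that searches can see.
    private final List<String> searchable = new ArrayList<>();

    // Accepts a document. With refresh=false the call returns right away and
    // the document stays invisible to count() until the next refresh().
    void index(String doc, boolean refresh) {
        pending.add(doc);
        if (refresh) {
            refresh();
        }
    }

    // Makes everything accepted so far visible to searches.
    // In the real system a background refresh does this periodically.
    void refresh() {
        searchable.addAll(pending);
        pending.clear();
    }

    // Counts searchable documents containing the given term.
    long count(String term) {
        return searchable.stream().filter(d -> d.contains(term)).count();
    }
}
```

In this model a test that indexes and then searches straight away sees nothing, which is the symptom from the original post. Calling `refresh()` first, or indexing with the refresh flag set, makes the result deterministic instead of dependent on timing.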
On Tuesday, May 24, 2011 at 4:58 PM, James Cook wrote:
So, after a little back and forth, I was able to figure out the issue:
issuing a RefreshRequest on an index does what I needed, and
works beautifully. I can create a local node, recreate all my
indices, load all my test data, and perform a number of searches all
in under 10 seconds. It makes for fast testing.
While I was having some issues with indexing being done but the
document not being available for search, I was also getting some odd
behavior that I thought was related but wasn't. I was creating and
populating my index without doing a 'put mapping'. It worked fine,
but only on every other test, and I don't know why. It would return a
correct number of totalHits, but with no actual hits, and a
ShardFailure. Once I added the explicit put mapping, the issue went
away.
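For anyone who lands here later, the setup order that ended up working can be sketched as below. `SearchFixture` is a hypothetical interface I made up for this post, not part of the Elasticsearch API; each method stands in for whatever client call you use for that step.

```java
import java.util.List;

public class SearchTestSetup {

    // Hypothetical wrapper around the real client calls; implement it with
    // whichever Elasticsearch client and version you are using.
    interface SearchFixture {
        void deleteIndexIfExists(String index);
        void createIndex(String index);
        void putMapping(String index, String mappingJson);
        void indexDocument(String index, String json);
        void refresh(String index);
    }

    // Run once before the search integration tests, not before each test.
    static void prepare(SearchFixture es, String index, String mappingJson, List<String> docs) {
        es.deleteIndexIfExists(index);     // start from a known state
        es.createIndex(index);
        es.putMapping(index, mappingJson); // explicit mapping before any document is indexed
        for (String doc : docs) {
            es.indexDocument(index, doc);
        }
        es.refresh(index);                 // one refresh after the bulk load, then search
    }
}
```

A single refresh after all the test data is loaded is enough; there is no need to refresh after each document.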