Hi All,
A little background first. I have a system with a set of rolling indexes,
for example:
Index 22
Index 23, etc.
An alias points to the most recent complete index:
current -> 22
After the importer finishes importing the new version of the index, it
updates the alias and deletes the older versions.
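
As a rough illustration of the alias update step, here is a minimal Python
sketch that builds the body for a single _aliases request, removing the alias
from the old index and adding it to the new one. Elasticsearch documents the
actions in one such request as being applied atomically. The function name
build_alias_swap and the index names "22" and "23" are assumptions made for
the example; how the importer actually updates the alias is not stated here.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Build the body for one POST /_aliases request (hypothetical helper).

    Both actions travel in the same request, so the alias is moved from
    old_index to new_index in a single step.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Index names are assumed for illustration only.
    print(json.dumps(build_alias_swap("current", "22", "23"), indent=2))
```

If the alias is moved this way, the old index can be deleted in a separate
step afterwards.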
When querying Elasticsearch, I always go through the current alias. If a
request returns a 404, I simply retry it, because the 404 is caused by a
race between the alias being deleted and recreated.
This is all working ok for us so far.
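
The retry-on-404 behaviour described above could look roughly like the
following Python sketch. It is not the actual client code: AliasMissing,
search_with_retry, and the do_search callable are hypothetical names, and the
caller is assumed to raise AliasMissing whenever the request against the
alias comes back with a 404.

```python
import time


class AliasMissing(Exception):
    """Hypothetical error: the request against the alias returned a 404."""


def search_with_retry(do_search, retries=3, delay=0.1):
    """Call do_search(), retrying when the alias is briefly missing.

    do_search is any zero-argument callable that performs the query and
    raises AliasMissing on a 404. The last failure is re-raised.
    """
    for attempt in range(retries):
        try:
            return do_search()
        except AliasMissing:
            if attempt == retries - 1:
                raise
            # Give the importer a moment to recreate the alias.
            time.sleep(delay)
```

The retry count and delay are arbitrary defaults; bounding the retries keeps
a genuinely missing alias from looping forever.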
Now I want to do one additional thing. I want to take the paged search
queries issued by the user and use them to do a complete, atomic dump of
every document matching the query, basically the same thing as running the
query with size=Infinity. I want this so that I don't miss or duplicate any
entries in the result set, and I'm OK with it taking a long time. However,
when I run the search with size=250000000 and search_type=query_and_fetch,
it never returns; it just sits there doing nothing as far as I can tell. Is
there a similar mode I could use that would work?
The only other option I can see is the scroll API. I'm not sure it is safe
to use with the current alias/delete index rolling that I am doing. Is it
safe? If you have an open scroll_id, what happens when you try to delete
the underlying index? Does the scroll_id become invalid? Or does it start
pointing at the new data instead of the old, with bad results?
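
To make the scroll option concrete, here is a minimal Python sketch of the
loop I have in mind. It only shows the paging logic: start_scroll and
next_page are hypothetical callables that would wrap the initial search
request (with a scroll timeout) and the follow-up scroll requests, and each
is assumed to return a (scroll_id, hits) pair. It says nothing about what
happens if the underlying index is deleted mid-scroll, which is exactly the
open question.

```python
def scroll_all(start_scroll, next_page):
    """Yield every hit from a scrolled search, one page at a time.

    start_scroll() -> (scroll_id, hits) for the first page.
    next_page(scroll_id) -> (scroll_id, hits) for each later page.
    Iteration stops at the first empty page.
    """
    scroll_id, hits = start_scroll()
    while hits:
        for hit in hits:
            yield hit
        # Always use the most recently returned scroll_id.
        scroll_id, hits = next_page(scroll_id)
```

Keeping the HTTP calls behind the two callables means the same loop can be
exercised with fake pages before pointing it at a real cluster.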
Thanks,
Kevin