Well, that problem was one of the more interesting ones to track down...
It was introduced in master, so first of all, big thanks to you (and Paul)
for spending the time and helping flush out any potential problems.
The bug was a sneaky one. At first glance you would think that data gets
lost, but it's not; it's really there, and other queries you run (like match
all) do return the relevant data. The problem was actually with the specific
field you were searching on: appAccountIds. When parsing a serialized
mapping (for example, during recovery), the field name was mistakenly
underscore cased, so it was renamed to app_account_ids. Now, because this
field is a numeric type, a specific query needs to be built in order to do
matching on it, but, because it was renamed and there was no mapping
for appAccountIds, the regular text based query was used, which failed
to find any hits.
I have just pushed a fix to this in master.
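For the curious, the accidental conversion amounted to camel case being turned into underscore case. A minimal sketch of that kind of transformation (illustrative only, not the actual ES parsing code):

```shell
# Underscore-case a camelCase field name, the way the mapping parser
# mistakenly did when re-reading a serialized mapping (sketch only).
echo "appAccountIds" | sed -E 's/([a-z0-9])([A-Z])/\1_\2/g' | tr 'A-Z' 'a-z'
# -> app_account_ids
```

Once the field is known under the wrong name, a query against appAccountIds finds no mapping, falls back to the text based query, and returns no hits.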
Regarding the memory aspect, you can enable it by setting
index.store.fs.memory.enabled to true, but I advise against it. I don't
think you will see much better performance out of it (especially thanks to
file system caching). And, if you use it, you won't be able to move to the
local gateway in the future (you should not use it with the local gateway).
It does make sense in very advanced cases, where one would want to load the
term info, or other Lucene file constructs, into memory, but really, things
should be pretty fast without it.
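For completeness, this is where the setting would live in elasticsearch.yml (shown only as a sketch; as noted above, leaving it off is the better choice here):

```yaml
# In-memory fs store; not recommended, and incompatible with the local gateway.
index.store.fs.memory.enabled: true
```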
On Thu, Nov 11, 2010 at 3:09 PM, diptamay firstname.lastname@example.org wrote:
I am running against a trunk build, yesterday's version. I will do an
update today and see. I understand the exceptions; I was more concerned
about the search queries not returning results.
Regarding the configuration:
- the index.memory.enabled setting... I remember it existing ever since I
started looking at ES 0.8. I remember something like shard data getting
cached in memory for faster performance, and then I think you changed
that to per node memory caching. Isn't that so? If not, what setting do
I use to get in memory caching for faster performance, and if so, how
do I also control the memory size for it, like 20 MB or 100 MB etc.?
- I am using the fs gateway since we would be using the same in our
QA and PROD environments for centralized NFS backups.
On Nov 11, 5:02 am, Shay Banon shay.ba...@elasticsearch.com wrote:
Regarding the recovery block exception: if you do a full shutdown and
then start the cluster, clients will get failures while the cluster is down.
Until the cluster metadata is recovered, the cluster remains in a blocked
state; it is just in its initialization phase. Then, each index is recovered
from the gateway; until an index is recovered to the point of being able to
properly answer queries, it is blocked as well, but on the index level
(so once an index is recovered, that index is no longer blocked).
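The two block levels can be observed from the outside with the cluster health API. A sketch (assumes ES is on localhost:9200; the `level=indices` parameter, per the REST API, expands the response down to the per index level):

```shell
# Cluster-level status, plus per-index status showing which indices
# are still recovering/blocked.
curl -XGET 'http://localhost:9200/_cluster/health?level=indices'
```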
I will run the test to see if I can verify the queries not returning
results post restart. Which version are you running on?
Two things regarding the configuration:
- Why do you set index.memory.enabled? It does not control anything;
where did you find it?
- This is a single node with a file system based gateway; any reason you
are not using the local gateway?
On Thu, Nov 11, 2010 at 10:40 AM, Paul ppea...@gmail.com wrote:
Give the latest master a shot. I was having similar issues as discussed.
If you're on the latest, it must be a different issue.
On Nov 10, 10:27 pm, diptamay dipta...@gmail.com wrote:
Thanks, but as I mentioned earlier, point 2 of the scenario should not
happen even when queries were fired while the cluster was coming up,
i.e. the search results should return once the cluster is back up, and
not stop returning results. The exceptions are understandable while
the cluster is coming up, though.
When users are hitting search from a website (say an HTML page making
Ajax calls to ES through the REST API), the blocking concept to check
the state of a cluster is not ideal. What I am simulating with my
test case is an actual user hitting the ES servers through the REST
API, who is not aware of any concepts of a cluster.
On Nov 10, 11:33 pm, Ryan Crumley crum...@gmail.com wrote:
Sorry, I should have explained in a little more detail. The cluster
health API will allow you to block until the index becomes available, so
you don't get the "not recovered from gateway" errors. See the documentation.
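As a sketch of the blocking call (parameter names per the cluster health API; assumes ES is on localhost:9200), a client can wait for the cluster to become available before querying:

```shell
# Block for up to 30s until the cluster reaches at least yellow status,
# i.e. until the recovery blocks described in this thread have lifted.
curl -XGET 'http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=30s'
```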
On Wed, Nov 10, 2010 at 9:39 PM, diptamay dipta...@gmail.com wrote:
Ideally, in the scenario described above, point 2 should not happen
even when queries were fired while the cluster was coming up.
On Nov 10, 10:32 pm, diptamay dipta...@gmail.com wrote:
How would this help?
On Nov 10, 10:12 pm, Ryan Crumley crum...@gmail.com wrote:
Try the cluster health api:
On Wed, Nov 10, 2010 at 8:20 PM, diptamay <
Please check out g...@github.com:diptamay/es-issue.git and look
at the issue listed below, running against the latest trunk.
Most of the time, when an ES server is stopped and restarted, and
search queries are fired while the server is starting up, the
following issues are seen:
1) "error" : "ClusterBlockException[blocked by: [1/not recovered from
gateway];[3/index not recovered];]"
2) Once the above error is encountered and the server has come up
completely, the search queries which were executing
before the stoppage are not returning results anymore.
Steps to set up and reproduce:
1) Ensure ES is running at localhost:9200 (look at
2) run ./automate.sh.
a) This will create an es-test index and load the sample data.
b) Then it fires a query, which returns results correctly,
in an infinite loop.
3) stop running ./automate.sh
4) run ./break-it.sh. This will keep running the query
in an infinite loop. Keep it running
5) stop and start ES at localhost:9200
6) You should see the issues listed above. If not, repeat
Let me know if you need anything else.
Note: Configuration of ES:
snapshot_interval : 30s
number_of_shards : 2
number_of_replicas : 1