Hi Folks,
We are intermittently seeing range queries with pagination either miss
some records or return duplicate ids.
Let me describe our system.
Say we have a fixed number of records, and while the queries are being
made in ES we can assume that no record is being created, updated, or
deleted.
The query we have is something like:
{"query":{"bool":{"must":[{"range":{"lastname":{"from":"Doe","to":null,"include_lower":false,"include_upper":true}}},{"range":{"firstname":{"from":"joe","to":null,"include_lower":false,"include_upper":true}}}]}}}
We also have "from":X,"size":Y, and we issue multiple queries, with from
increasing each time as X = X + Y.
The idea is that every query returns records we have not seen before.
Unfortunately, from time to time that does not happen. (As I said, it's a
closed system, so let's assume nobody is updating/deleting/creating data.)
Across the pages, some records are missing and some appear more than
once.
Has anybody seen a similar issue, or can anyone shed some light on how we
should debug this?
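
For anyone trying to reproduce this, here is a minimal sketch of the paging
scheme described above, plus a small audit of the returned ids. It does not
talk to ES at all: page_bodies and audit_pages are hypothetical helper names,
and how the bodies are sent and how ids are pulled out of each response is
left to whatever client is in use.

```python
import copy

# Query body as given in the post.
BASE_QUERY = {
    "query": {"bool": {"must": [
        {"range": {"lastname": {"from": "Doe", "to": None,
                                "include_lower": False, "include_upper": True}}},
        {"range": {"firstname": {"from": "joe", "to": None,
                                 "include_lower": False, "include_upper": True}}},
    ]}}
}


def page_bodies(base_query, size, total):
    """Yield one request body per page: from = 0, size, 2*size, ... (X = X + Y)."""
    for start in range(0, total, size):
        body = copy.deepcopy(base_query)  # never mutate the shared base query
        body["from"] = start
        body["size"] = size
        yield body


def audit_pages(pages, expected_ids):
    """Compare the ids collected per page against the ids that should exist.

    pages: list of lists of ids, one inner list per page, in request order.
    Returns (duplicates, missing), where duplicates maps an id to every page
    number it appeared on and missing is a sorted list of ids never returned.
    """
    first_seen = {}
    duplicates = {}
    for page_no, ids in enumerate(pages):
        for doc_id in ids:
            if doc_id in first_seen:
                duplicates.setdefault(doc_id, [first_seen[doc_id]]).append(page_no)
            else:
                first_seen[doc_id] = page_no
    missing = sorted(set(expected_ids) - set(first_seen))
    return duplicates, missing
```

Logging the page numbers for each duplicate shows whether a record moved
between neighbouring pages or jumped further. One hypothesis that could be
tested with the same audit (not something confirmed in this thread) is that
the order of hits is not stable from one request to the next; adding an
explicit sort on a unique field to every body and re-running would help
confirm or rule that out.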
Hi David,
We are aware of the scroll API, and are not using it because it will not
scale.
That is the very reason I was stressing the fact that there is no
update/delete/create; with multiple queries all bets are off if any of
those happen.
However, in a steady state (no change in data) I would expect them to work.
To answer your question: it happens in different pagination groups.
Regards and thanks,
amish
On Wednesday, January 28, 2015 at 11:06:24 PM UTC-8, Amish Asthana wrote:
Hi Folks,
We are intermittently seeing range queries with pagination either miss
some records or return duplicate ids.
Let me describe our system.
Say we have a fixed number of records, and while the queries are being
made in ES we can assume that no record is being created, updated, or
deleted.
We also have "from":X,"size":Y, and we issue multiple queries, with from
increasing each time as X = X + Y.
The idea is that every query returns records we have not seen before.
Unfortunately, from time to time that does not happen. (As I said, it's a
closed system, so let's assume nobody is updating/deleting/creating data.)
Across the pages, some records are missing and some appear more than
once.
Has anybody seen a similar issue, or can anyone shed some light on how we
should debug this?