Pagination with range queries giving duplicate results

Hi folks,
We are intermittently seeing range queries with pagination miss some
records or return duplicate IDs.

Let me describe our system.
Assume we have a fixed set of records, and that no record is created,
updated, or deleted while the queries are being made against ES.

The query looks something like this:
{"query":{"bool":{"must":[{"range":{"lastname":{"from":"Doe","to":null,"include_lower":false,"include_upper":true}}},{"range":{"firstname":{"from":"joe","to":null,"include_lower":false,"include_upper":true}}}]}}}

We also pass "from":X,"size":Y, and we issue multiple queries, increasing
from each time as X = X + Y.
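
To make the paging scheme above concrete, here is a minimal Python sketch that builds one request body per page. The helper name page_bodies and the total of 35 records are assumptions for illustration only; the query itself is the one quoted above.

```python
def page_bodies(query, size, total):
    """Yield one search body per page, advancing `from` by `size` each time."""
    for offset in range(0, total, size):
        body = dict(query)      # shallow copy so the base query is not mutated
        body["from"] = offset   # X
        body["size"] = size     # Y
        yield body


# The range query from the post, written as a Python dict.
base_query = {"query": {"bool": {"must": [
    {"range": {"lastname": {"from": "Doe", "to": None,
                            "include_lower": False, "include_upper": True}}},
    {"range": {"firstname": {"from": "joe", "to": None,
                             "include_lower": False, "include_upper": True}}},
]}}}

# Hypothetical numbers: 35 matching records, 10 per page.
bodies = list(page_bodies(base_query, size=10, total=35))
offsets = [b["from"] for b in bodies]
print(offsets)
```

Each body would be sent as a separate search request, so nothing ties one page to the next except the offset.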

The idea is that each query returns records we have not seen before.
Unfortunately, from time to time that does not happen. (As I said, it is a
closed system, so assume nobody is updating, deleting, or creating data.)
Some records never show up, and some records show up more than once.

Has anybody seen a similar issue, or can anyone shed some light on how we
should debug this?
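
As one possible starting point for debugging, the sketch below records the IDs returned on each page and reports which IDs appear more than once and which expected IDs never appear. The function name audit_pages and the sample IDs are made up for illustration; in practice the pages would come from the real search responses.

```python
from collections import Counter


def audit_pages(pages, expected_ids=None):
    """Return (duplicates, missing) for a list of pages of document IDs."""
    counts = Counter(doc_id for page in pages for doc_id in page)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    if expected_ids is None:
        missing = []
    else:
        missing = sorted(set(expected_ids) - set(counts))
    return duplicates, missing


# Hypothetical sample: "c" is returned on two pages and "f" is never returned.
pages = [["a", "b", "c"], ["c", "d", "e"], ["g"]]
duplicates, missing = audit_pages(
    pages, expected_ids=["a", "b", "c", "d", "e", "f", "g"]
)
print(duplicates, missing)
```

Logging the page number alongside each duplicate would also answer whether the repeats fall within one page or across pages.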

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c5bb300a-ee7c-48d5-a408-f9a1a1b267b6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Do you get duplicates within the same page or only in another page?

To ensure consistent pagination, I would use the scroll API. Could you try it?
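
For reference, a scroll is consumed by making one initial search and then repeatedly asking for the next batch with the returned scroll ID until a batch comes back empty. The sketch below shows only that loop. The two callables and the FakeScrollBackend class are hypothetical stand-ins for real client calls, so this is an illustration of the control flow under those assumptions, not of any particular client library.

```python
def scroll_all(start_scroll, next_batch):
    """Drain a scroll: one initial call, then next_batch(scroll_id) until empty."""
    response = start_scroll()
    while response["hits"]["hits"]:
        for hit in response["hits"]["hits"]:
            yield hit["_id"]
        response = next_batch(response["_scroll_id"])


class FakeScrollBackend:
    """In-memory stand-in for a cluster (hypothetical, for illustration only)."""

    def __init__(self, ids, batch_size):
        self.ids = list(ids)
        self.batch_size = batch_size
        self.pos = 0

    def _batch(self):
        chunk = self.ids[self.pos:self.pos + self.batch_size]
        self.pos += len(chunk)
        return {"_scroll_id": "fake-scroll-id",
                "hits": {"hits": [{"_id": i} for i in chunk]}}

    def start(self):
        self.pos = 0
        return self._batch()

    def next(self, scroll_id):
        return self._batch()


backend = FakeScrollBackend(ids=["a", "b", "c", "d", "e"], batch_size=2)
seen = list(scroll_all(backend.start, backend.next))
print(seen)
```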

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Hi David,
We are aware of the scroll API, and we are not using it because it will
not scale.
That is the very reason I was stressing that there is no
update/delete/create: with multiple queries, all bets are off if any of
those happen.
However, in a steady state (no change in data) I would expect the
paginated queries to work.

To answer your question: the duplicates show up on different pages, not
within the same page.
Regards and thanks,
amish
