Inconsistent paging

Hi,

We've noticed a strange behavior in elasticsearch during paging.

In one case we use a page size of 60 and we have 63 documents. So the
first page uses size 60 and offset 0, and the second page uses size 60
and offset 60. What we see is that the results are inconsistent: on
the 2nd page, we sometimes get results that already appeared on the 1st page.

The query we use sorts on a numeric field for which many documents
share the same value (0).
It looks like the relative order of the documents that share that value
(0) isn't consistent between requests.

Did anyone encounter such behavior? Any suggestions on resolving this?

We're using version 1.3.1.

Thanks,
Ron

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKHuyJpcYKepYzh%2BBU2MSD2RQ19zjHYiXgf3anWBL9esq9fkGQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

You need to use scroll if you have that requirement.

See: Elasticsearch Platform — Find real-time answers at scale | Elastic

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Hi Ron,

The cause of this issue is that Elasticsearch uses Lucene's internal doc
IDs as tie-breakers. Internal doc IDs might be completely different across
replicas of the same data, so this explains why documents that have the
same sort values are not consistently ordered.
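
To illustrate the tie-breaking effect, here is a toy model in Python. It is not Elasticsearch code: the two lists stand in for two copies of the same shard, and the field names `rank` and `internal_id` are made up for the example. The only assumption taken from the explanation above is that ties on the sort value are broken by an internal ID that can differ between copies.

```python
def search_page(shard_copy, offset, size):
    """Sort by the numeric field, break ties by internal doc ID, then slice."""
    ranked = sorted(shard_copy, key=lambda d: (d["rank"], d["internal_id"]))
    return [d["name"] for d in ranked[offset:offset + size]]


# Same five documents, all with the same sort value (0), but the two
# copies assigned internal IDs in a different order (hypothetical data).
names = ["a", "b", "c", "d", "e"]
primary = [{"name": n, "rank": 0, "internal_id": i}
           for i, n in enumerate(names)]
replica = [{"name": n, "rank": 0, "internal_id": i}
           for i, n in enumerate(reversed(names))]

# Page 1 happens to be served by the primary, page 2 by the replica.
page_1 = search_page(primary, offset=0, size=3)   # ['a', 'b', 'c']
page_2 = search_page(replica, offset=3, size=3)   # ['b', 'a']

# Documents seen twice, and documents never seen at all.
repeated = sorted(set(page_1) & set(page_2))            # ['a', 'b']
missing = sorted(set(names) - set(page_1) - set(page_2))  # ['d', 'e']
```

If both pages are served by the same copy, the overlap disappears, which is what both suggestions below rely on.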

There are 2 potential ways to fix that problem:

  1. Use scroll as David mentioned. It will create a context around your
    request and will make sure that the same shards will be used for all pages.
    However, it also gives another guarantee, which is that the same
    point-in-time view on the index will be used for each page, and this is
    expensive to maintain.
  2. Use a custom string value as a preference in order to always hit the
    same shards for a given session[1]. Like option 1, this ensures that the
    same shards are always hit, but without the additional cost of a scroll.

[1]
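
As a rough sketch of option 2, the helper below only builds the request path and body for one page; it does not talk to a cluster. The index name `items`, the sort field `rank`, and the session ID are placeholders invented for the example. It assumes the search API accepts `preference` as a query-string parameter and `from`/`size`/`sort` in the request body.

```python
import json
from urllib.parse import urlencode


def build_page_request(index, session_id, page, page_size, sort_field):
    """Return (path, body) for one page, pinned to a per-session preference."""
    # Values starting with "_" are used for built-in preferences such as
    # _primary or _local, so a custom session value should avoid that prefix.
    if session_id.startswith("_"):
        raise ValueError("custom preference must not start with '_'")
    path = "/{}/_search?{}".format(index, urlencode({"preference": session_id}))
    body = {
        "from": page * page_size,   # page is zero-based
        "size": page_size,
        "sort": [{sort_field: {"order": "asc"}}],
        "query": {"match_all": {}},
    }
    return path, json.dumps(body)


# Second page (zero-based page 1) of 60 results for a hypothetical session.
path, body = build_page_request("items", "session-42", 1, 60, "rank")
```

Every page requested for the same session then carries the same preference string, so the caller can send `path` and `body` with whatever HTTP client it already uses.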

--
Adrien Grand

Thanks for the answer, and sorry for the duplicate (posted from a different
source by mistake).
