Duplicated results across pages in top_children queries


(Matthew A. Brown) #1

Howdy Shay (and world of ElasticSearch),

We briefly discussed on IRC the phenomenon where, when paginating a
top_children query (using the "from" and "size" params), the same
result document may appear on multiple pages. You indicated that this
is a known issue with top_children, and that increasing the "factor"
would help alleviate duplication.

My question is whether this is considered a bug (and thus slated for a
fix at some point) or more of an irreducible fact about the way
top_children works. I'm mostly just trying to determine whether our
application needs to be robust around potential duplication in the
long run.

Thanks!
Mat


(Shay Banon) #2

Good question :) That's the current state. I need to spend some time
thinking about it to see whether it can be fixed or whether that's simply
the case with top_children...
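
For anyone who needs to be robust to this in the meantime, here is a
minimal sketch (not an official fix) of the two mitigations mentioned
in this thread: passing a larger "factor" in the top_children query, and
dropping repeated documents on the client while paging. The helper names
top_children_body and dedupe_hits are hypothetical, and the sketch
assumes each hit is a dict carrying the usual "_type" and "_id" fields.

```python
def top_children_body(child_type, child_query, page, size, factor=5):
    """Build the search body for one page of a top_children query.

    `page` is zero-based. A larger `factor` is the mitigation suggested
    in the thread; it reduces duplication but is not claimed to remove it.
    """
    return {
        "from": page * size,
        "size": size,
        "query": {
            "top_children": {
                "type": child_type,
                "query": child_query,
                "score": "max",
                "factor": factor,
            }
        },
    }


def dedupe_hits(pages):
    """Yield hits from successive pages, skipping documents already seen.

    `pages` is an iterable of hit lists, one list per fetched page.
    Documents are identified by their (_type, _id) pair.
    """
    seen = set()
    for hits in pages:
        for hit in hits:
            key = (hit.get("_type"), hit["_id"])
            if key in seen:
                continue  # same parent document showed up on an earlier page
            seen.add(key)
            yield hit
```

Note that de-duplicating on the client means a page can come back with
fewer than "size" results, so the application may need to fetch ahead to
fill a page.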


(Matthew A. Brown) #3

Thanks! Keep me posted : )

