Accessing fields of a nested doc in a custom score script may cause documents to go missing from the query result


(Junjun Zhang) #1

Hi,

I recently ran into a problem where documents go missing from the query
result when a custom score script is used. After some testing, I found that
the problem seems to occur when the script tries to access a field of a
nested doc and a particular root document does not contain any such nested
doc.

To reproduce the problem, test data and queries can be found here:
http://goo.gl/iHOc5. The example may not make much sense in the real world,
but the idea is to sort products by the average rate from users' reviews.
One particular requirement is to always treat an anonymous user's rate as 3
and to assign a rate of 3 to products with no reviews.
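For anyone reading without the linked test data, the intended ranking rule can be sketched in plain Python, outside Elasticsearch. This is only an illustration of the requirement: the field name `rate`, the sample products, and the helper names `effective_rate` and `product_score` are assumptions made for this sketch; only `review.user.member_id` comes from the post itself.

```python
DEFAULT_RATE = 3  # rate used for anonymous reviews and for products with no reviews


def effective_rate(review):
    """Return the rate that should count for a single review."""
    user = review.get("user") or {}
    # A missing or empty member_id is taken to mean an anonymous reviewer.
    if not user.get("member_id"):
        return DEFAULT_RATE
    return review["rate"]  # "rate" is an assumed field name


def product_score(product):
    """Average effective rate over all reviews, or the default if there are none."""
    reviews = product.get("review") or []
    if not reviews:
        return float(DEFAULT_RATE)
    return sum(effective_rate(r) for r in reviews) / len(reviews)


# Hypothetical sample data, not the data behind the link above.
products = [
    {"name": "A", "review": [{"rate": 5, "user": {"member_id": "m1"}},
                             {"rate": 1, "user": {}}]},
    {"name": "B", "review": []},
    {"name": "C", "review": [{"rate": 2, "user": {"member_id": "m2"}}]},
]

# Every product gets a score, including the one without reviews.
ranked = sorted(products, key=product_score, reverse=True)
for p in ranked:
    print(p["name"], product_score(p))
```

The point of the sketch is that product "B", which has no reviews, still receives a score and stays in the ranking, which is the behaviour expected from the score script.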

We can determine whether a user is anonymous by checking whether the
review.user.member_id field is empty: doc['review.user.member_id'].empty.
This seems to work fine, except that products with no reviews are dropped
from the result, as the first query example shows. Is this a bug? As there
is no query/filter that excludes documents, shouldn't all documents be
returned?

Also, there seems to be no way to determine whether a review exists. The
second query example shows that doc['review'].empty does not work. This
makes sense because there is indeed no such field as 'review' under the
'product' index; 'review' is a nested document. However, the question
remains: is there a way to determine the existence of a nested doc?
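One possible direction, offered only as a sketch: at the query level (not inside the script), the query DSL of more recent Elasticsearch releases can express "root documents with no nested doc under a given path" by negating a nested query that matches everything. The snippet below just builds such a request body as a Python dict. The helper name `products_without_reviews_query` is made up for this sketch, the path `review` is taken from the post, and this has not been checked against the Elasticsearch version the thread is about, where the syntax may differ.

```python
import json


def products_without_reviews_query(path="review"):
    """Build a search body matching root docs that have no nested doc under `path`."""
    return {
        "query": {
            "bool": {
                # The nested query matches root docs with at least one nested doc;
                # must_not inverts that, leaving root docs with none.
                "must_not": [
                    {"nested": {"path": path, "query": {"match_all": {}}}}
                ]
            }
        }
    }


body = products_without_reviews_query()
print(json.dumps(body, indent=2))
```

This does not explain why the score script drops documents; it only shows a way to select the products without reviews separately so they can be given the default rate.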

Any help will be greatly appreciated!

Best regards,

Junjun

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(btiernay) #2

I wonder if this is a direct result of how nested docs work
(http://www.slideshare.net/AnneVeling/nested-and-parentchild-docs-in-elasticsearch).
Still, from an API perspective it looks as though your queries should be
honored. At the very least this should be documented, but I would file an
issue with Elasticsearch on GitHub regardless.

Not sure if this is related:

Cheers,

Bob

(In reply to Junjun Zhang's message of Friday, 10 May 2013 12:38:16 UTC-4.)


(Bruno Miranda) #3

Did you ever find a resolution to this issue? I am running into something similar.

