Has_parent + compound query scoring problem

Hi All,

I found some odd scoring behavior in Elasticsearch when a has_parent
query is combined with a normal query.

I have two document types in the index:

  • document: Normal document
  • review: Reviews that are child elements of the documents

I would like to create a compound query that searches both types. The
query should basically:

  • Search in properties of document (title, author, abstract, ...)
  • Search in properties of review (reviewer, review text, ...)
  • Search also in the parent properties of a review, i.e. the properties
    of the parent document joined by a has_parent query (title, abstract, ...)
  • Have multiple query parts connected by bool or dis_max, e.g. a basic
    query_string query plus an additional proximity boost part (phrase
    match + slop) or a fuzzy part
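To make the list above concrete, here is a minimal sketch of how such a compound query body could be assembled. It is not the query from the gist: the field names (review_text in particular), the slop and boost values, and the has_parent parameters "parent_type" and "score_mode" (as used in the 1.x query DSL) are assumptions.

```python
import json

DOC_FIELDS = ["title", "author", "abstract"]
REVIEW_FIELDS = ["reviewer", "review_text"]  # hypothetical field names


def basic_part(terms, fields):
    """Basic query_string clause over the given fields."""
    return {"query_string": {"query": terms, "fields": fields}}


def proximity_part(terms, field, slop=2, boost=2.0):
    """Proximity boost: phrase match with slop (values are arbitrary)."""
    return {"match_phrase": {field: {"query": terms, "slop": slop, "boost": boost}}}


def via_parent(query):
    """Match reviews whose parent document matches `query`, keeping the parent score.

    Parameter names are assumed from the 1.x DSL; newer versions differ.
    """
    return {"has_parent": {"parent_type": "document",
                           "score_mode": "score",
                           "query": query}}


def compound_query(terms):
    # Document part: basic query plus proximity boost, connected by bool.
    doc_part = {"bool": {"should": [basic_part(terms, DOC_FIELDS),
                                    proximity_part(terms, "title")]}}
    # The same document part is reused for reviews through has_parent.
    return {"dis_max": {"queries": [doc_part,
                                    basic_part(terms, REVIEW_FIELDS),
                                    via_parent(doc_part)]}}


body = {"query": compound_query("parent child scoring")}
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request. Reusing the identical document part inside has_parent is what should, in principle, give a document and its reviews the same score.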

I would like to have harmonized scoring, i.e. if I search for a term in a
document title, I would like to receive the matching document and all of
its reviews with the same score.
This seemed to work when I started, but after a while it became weird and
I got many inexplicable scores.
Unfortunately, the nice explain="true" feature is not available for the
has_parent query ("not implemented..."), so my options for debugging the
problem are limited.

I've created a really small curl-based example on gist:

The last two queries represent the main problem:

  • Test query 1: Two query_string queries combined with dis_max; one is
    for documents, the other is for reviews and is therefore wrapped in
    has_parent. It works fine: the document and the review have the same
    score (0.375).
  • Test query 2: The very same query as above, applied twice and combined
    with a bool query. I expected to get the same scores again, but the
    resulting scores are different for the document and for the review (???).
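For readers without access to the gist, the shape of the two test queries can be sketched roughly as follows. This is a reconstruction from the description above, not the original requests: the search term, the "title" field, the use of "should" in the bool query, and the has_parent parameters ("parent_type", "score_mode", as in the 1.x DSL) are all assumptions.

```python
import json

TERM = "elasticsearch"  # hypothetical search term


def build_test_query_1(term=TERM):
    """dis_max of a document query and the same query wrapped in has_parent."""
    doc_query = {"query_string": {"query": term, "fields": ["title"]}}
    return {"dis_max": {"queries": [
        doc_query,
        {"has_parent": {"parent_type": "document",
                        "score_mode": "score",
                        "query": doc_query}},
    ]}}


def build_test_query_2(term=TERM):
    """The very same query applied twice, combined with bool (should is assumed)."""
    part = build_test_query_1(term)
    return {"bool": {"should": [part, part]}}


print(json.dumps({"query": build_test_query_2()}, indent=2))
```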

After checking the scores, it seems to me that the problem is related to
the query_norm value, which may be different for the has_parent parts.
For "Test query 1" the resulting score is 0.375 for both (document, review).
For "Test query 2" the review matched by has_parent got exactly 2 x 0.375
= 0.75, while the document score is lower, which I guess comes mainly from
a smaller query_norm value.
However, I could not confirm this, since I cannot see the explain output
for the has_parent parts...
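The query_norm guess can be illustrated with a small toy calculation. It uses the classic Lucene TF/IDF definition queryNorm = 1 / sqrt(sum of squared clause weights) and assumes, without confirmation from explain output, that the inner query of has_parent is normalized on its own rather than together with the outer query. The clause weight W is an arbitrary made-up number, the coord factor is ignored (it is 1 when all clauses match), and the 0.375 field weight is chosen only so that a single clause reproduces the score seen in "Test query 1".

```python
import math


def query_norm(clause_weights):
    """Classic Lucene TF/IDF: 1 / sqrt(sum of squared clause weights)."""
    return 1.0 / math.sqrt(sum(w * w for w in clause_weights))


def bool_score(clause_weights, field_weight, shared_norm=True):
    """Sum of clause scores, each clause scoring weight * norm * field_weight.

    shared_norm=True:  every clause uses the norm of the whole query
                       (ordinary clauses matching the document).
    shared_norm=False: every clause is normalized on its own
                       (ASSUMED behaviour of the has_parent inner query).
    """
    total = 0.0
    for w in clause_weights:
        norm = query_norm(clause_weights) if shared_norm else query_norm([w])
        total += w * norm * field_weight
    return total


FIELD_WEIGHT = 0.375  # makes a single clause score 0.375
W = 1.7               # arbitrary clause weight (idf * boost), hypothetical

single = bool_score([W], FIELD_WEIGHT)
doubled_shared = bool_score([W, W], FIELD_WEIGHT, shared_norm=True)
doubled_own = bool_score([W, W], FIELD_WEIGHT, shared_norm=False)

print(single, doubled_shared, doubled_own)
```

Under these assumptions a duplicated clause with its own norm simply doubles the score (about 0.75), while duplicated clauses sharing one norm are each scaled down by 1/sqrt(2), giving about 0.53 in total. That pattern would match a review at exactly 0.75 and a document scoring lower, but it remains a model, not a verified explanation.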

Can anybody help me?
Thank you in advance!

Regards,
Csaba D.
