Has parent score type - BUG?

I was glad to see the "score_type" feature added in 0.20.2, which allows
the parent's score to be transferred to the child.

http://www.elasticsearch.org/guide/reference/query-dsl/has-parent-query.html

But I think there is a bug when I use it.

I have not yet reduced it to a simple example, but I have done the
following.

  • checked that all children have a routing value equal to their parent's ID.
  • checked that all parents have a routing value equal to their own ID.
  • run a particular query with and without "score_type": "score" in
    the "has_parent" query.
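A minimal sketch of the two routing checks above, assuming the documents have already been fetched into plain dicts. The keys "id", "parent" and "routing" and the function name are hypothetical, chosen only for illustration; they are not the shape of an Elasticsearch response.

```python
def find_routing_mismatches(docs):
    """Return the IDs of documents whose routing value is not the expected one.

    Each doc is a dict with hypothetical keys:
      "id"      - the document's own ID
      "parent"  - the parent's ID, or None if the document is itself a parent
      "routing" - the routing value stored with the document
    """
    mismatches = []
    for doc in docs:
        # Children should be routed by their parent's ID, parents by their own ID.
        expected = doc["parent"] if doc.get("parent") is not None else doc["id"]
        if doc.get("routing") != expected:
            mismatches.append(doc["id"])
    return mismatches


# Made-up sample data: one parent, one correctly routed child, one misrouted child.
sample = [
    {"id": "p1", "parent": None, "routing": "p1"},
    {"id": "c1", "parent": "p1", "routing": "p1"},
    {"id": "c2", "parent": "p1", "routing": "c2"},
]
bad = find_routing_mismatches(sample)
```

An empty result would mean both checks pass for the documents supplied.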

With "score_type": "none" (or without the new "score_type" field),
I get all the documents I expect.
With "score_type": "score",
I sometimes get 3 documents and sometimes 4 in the result.
I only have to resubmit the same query in the head plugin and the count
changes between 3 and 4, with no consistent pattern.
There are no node failures. I only have two shards.

Here is the most reduced query I could come up with that resembles what
I'd like to do, which is to find children that have parents, where some of
the scoring comes from the parents.
Of course, in my real query the criteria for the parents are more complex
than "match_all".

With "score_type": "none", as shown, I get the 9 results I expect. Change it
to "score_type": "score" and I get fewer: sometimes 3, sometimes 4.

curl 'http://localhost:9200/myindex/MyChildType/_search' -d '
{
  "from": 0,
  "timeout": 4000,
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            {
              "has_parent": {
                "query": {
                  "match_all": {}
                },
                "parent_type": "MjDocument",
                "score_type": "none"
              }
            }
          ],
          "boost": 1000
        }
      },
      "filter": {
        "prefix": {
          "Path.NALocation": "Bugs\\Phrase Boosting\\Subphrase\\"
        }
      }
    }
  }
}'
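To make the intermittent count easier to pin down, the same request can be sent repeatedly and the totals tallied. The following is only a sketch: the function names and the default of 20 runs are made up for illustration, and it assumes the response carries an integer under hits.total and that the node is reachable at the URL used in the curl command.

```python
import json
import urllib.request
from collections import Counter


def build_body(score_type):
    """Return the search body from the post with the given score_type."""
    return {
        "from": 0,
        "timeout": 4000,
        "query": {
            "filtered": {
                "query": {
                    "bool": {
                        "should": [
                            {
                                "has_parent": {
                                    "query": {"match_all": {}},
                                    "parent_type": "MjDocument",
                                    "score_type": score_type,
                                }
                            }
                        ],
                        "boost": 1000,
                    }
                },
                "filter": {
                    "prefix": {
                        # json.dumps escapes the backslashes when serializing.
                        "Path.NALocation": "Bugs\\Phrase Boosting\\Subphrase\\"
                    }
                },
            }
        },
    }


def count_hits(url, score_type):
    """POST the body once and return hits.total (assumed to be an integer)."""
    data = json.dumps(build_body(score_type)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["hits"]["total"]


def tally(url, score_type, runs=20):
    """Send the same search `runs` times and count how often each total appears."""
    return Counter(count_hits(url, score_type) for _ in range(runs))


# Example use against a running node (not executed here):
#   url = "http://localhost:9200/myindex/MyChildType/_search"
#   for score_type in ("none", "score"):
#       print(score_type, dict(tally(url, score_type)))
```

If the totals for "score" vary between runs while those for "none" stay fixed, that would match what I see in the head plugin.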

I know that prefix queries are slow; I only use one here to find exactly
the set of files I was testing with. The filter works.

Am I doing the "has_parent" wrong, or is there something else wrong here?

Point 2:
The doc for this new feature needs some work.

"The supported score types are |score| or |none|. The default is |none|
and yields the same behavior as in previous versions. If the score type
is set to another value than |none|, the scores of all the matching
parent documents are aggregated into the associated child documents. "

http://www.elasticsearch.org/guide/reference/query-dsl/has-parent-query.html

  1. "same .. as in previous": how is a new user (or someone reviewing
    existing behavior) supposed to know what the old behavior is or was?
  2. "the scores of all the matching parent documents are aggregated into
    the associated child documents": it looks like someone copied this from
    the same sentence in the "has_child" docs.
    I assume no parents' scores are aggregated together, only that "... the
    score of each matching parent document is aggregated into the score
    for each of its child documents."
    "Documents to documents" doesn't actually say anything useful about
    what goes where and could be replaced with "some scores are moved
    about" :)

If someone can confirm points 1 and 2 and provide a description of the
expected "old" behavior (for "none"), I'll even submit a pull request for
the changes to this page.

Also, if someone can think of a different or more efficient way to score
filtered children based at least partially on the score of the parents,
please suggest it.

-Paul

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi Paul,

Were you able to create a test case that fails more or less consistently?
Also, have you tried running your has_parent query with the latest 0.20
release or 0.90.x?

Yes, the has_parent docs need to be corrected. The "old behavior" is that
parent scores aren't pushed to the child documents. The score is equal to
the boost (defaults to 1) specified in the has_parent query. I'll update
the documentation for has_parent.
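As a toy model of the two behaviors just described (not Elasticsearch code, and the function names are invented for illustration): with "none" each matching child gets the boost as its score, and with "score" each child gets its own parent's score. How boost combines with "score" is not modeled here.

```python
def child_score(parent_score, score_type="none", boost=1.0):
    """Score given to one child whose parent matched the inner query."""
    if score_type == "none":
        # Old behavior: the parent's score is ignored, the boost is used.
        return boost
    if score_type == "score":
        # New behavior: the parent's score is pushed to the child.
        return parent_score
    raise ValueError("unsupported score_type: %r" % (score_type,))


def score_children(parent_scores, child_to_parent, score_type="none", boost=1.0):
    """Map each child ID to its score; children of non-matching parents are dropped."""
    scores = {}
    for child_id, parent_id in child_to_parent.items():
        if parent_id not in parent_scores:
            continue  # parent did not match, so the child is not a hit
        scores[child_id] = child_score(parent_scores[parent_id], score_type, boost)
    return scores


# Made-up example: two matching parents, three children.
parent_scores = {"p1": 2.5, "p2": 0.5}
child_to_parent = {"c1": "p1", "c2": "p1", "c3": "p2"}
with_none = score_children(parent_scores, child_to_parent, "none")
with_score = score_children(parent_scores, child_to_parent, "score")
```

In this model the set of matching children is the same for both score types and only the scores differ, so a changing hit count would not be expected from the score type alone.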

Martijn

On 13 March 2013 22:30, P. Hill parehill1@gmail.com wrote:


--
Kind regards,

Martijn van Groningen


Hey Phill,

It's been quite a few months; I was wondering if you ever found a solution to this issue.

I have recently run into a very similar issue. Routing is required for my child docs (i.e. every child has a parent), and when I perform a has_parent query on the child doc type with match_all, I get a very different total for score_type 'none' vs 'score'.

I was able to fix this discrepancy by reloading my parent doc type and its documents with exactly the same configuration and data.

I was unable to fix it with refresh/flush/optimize.

Most importantly, I am not able to reproduce it. It seems to happen over time in a very mysterious way.

Version: 0.90.5