Hi, I'm having trouble understanding how the "has_parent" query works.
Situation:
Index with 2 types: "parent_type" and "child_type" with simple parent-child mapping:
"parent_type" => {"properties": {...}}
"child_type" => {"_parent": {"type": "parent_type"},"properties": {...}}
Every parent document has exactly one child document. Elasticsearch version 1.3.
Now I run two count queries on "child_type" (http://localhost:9200/index/child_type/_count):
{
  "query": {
    "has_parent": {
      "type": "parent_type",
      "query": {
        "match_all": {}
      }
    }
  }
}
and
{
  "query": {
    "match_all": {}
  }
}
The first query counts fewer documents than the second query. I do not understand why there is a difference.
1. Could some child documents be missing a parent document?
2. Is _routing missing?
1: This is not the cause. I used that query to move documents to a new index, and some of them were not moved. Later I checked whether there were child docs with no parents, and all of them had valid parents in the old index. As far as I know, it is not possible to index documents into a type with a parent mapping without providing a parent ID in the index request.
2: I do not have '_routing' mapping. Is this mandatory when using parent-child relationship?
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child.html
and it says,
Elasticsearch maintains a map of which parents are associated with which children. It is thanks to this map that query-time joins are fast, but it does place a limitation on the parent-child relationship: the parent document and all of its children must live on the same shard.
I did some additional research and found out that the parent ID is always used as the default routing value for a child document. So you do not need to set routing yourself in parent-child relations.
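To illustrate why that default keeps a parent and its children on the same shard, here is a minimal Python sketch. It is a simplified model, not Elasticsearch's real code: the shard count, the helper names `routing_value` and `shard_for`, and the CRC32 hash are all assumptions made for illustration (Elasticsearch uses its own hash function). The only thing it is meant to show is that a child routed by its parent ID hashes to the same shard as the parent, which is routed by its own ID.

```python
import zlib

NUM_SHARDS = 5  # assumed number of primary shards, for illustration only


def routing_value(doc_id, parent_id=None):
    # A child defaults to its parent's ID as routing value;
    # a document without a parent defaults to its own ID.
    return parent_id if parent_id is not None else doc_id


def shard_for(routing, num_shards=NUM_SHARDS):
    # Stand-in hash (CRC32); Elasticsearch uses a different hash internally.
    return zlib.crc32(routing.encode("utf-8")) % num_shards


parent_shard = shard_for(routing_value("p1"))
child_shard = shard_for(routing_value("c1", parent_id="p1"))

# Both documents are routed by "p1", so they land on the same shard.
print(parent_shard == child_shard)
```

If a child were indexed with a custom routing value that differs from the one used for its parent, this model would place it on a different shard, and a has_parent query would not find the parent for it.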
Turns out your very first guess was true: some child documents don't have a parent document. I mixed up something during my tests...
Thanks for your replies sanzhiyuan!
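For anyone hitting the same mismatch: one way to count the children that has_parent does not match is to negate it. The sketch below builds such a request body in Python. It assumes the Elasticsearch 1.x query DSL (the `filtered` query and the `not` filter, both of which were removed in later versions), and the function name `orphan_count_body` is hypothetical. I have not run it against a 1.3 cluster, so treat it as a starting point.

```python
import json


def orphan_count_body(parent_type):
    """Build a _count body for children that has_parent does not match."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    # Negate has_parent: keep only children for which
                    # no parent of the given type is found.
                    "not": {
                        "has_parent": {
                            "type": parent_type,
                            "query": {"match_all": {}},
                        }
                    }
                },
            }
        }
    }


body = orphan_count_body("parent_type")
print(json.dumps(body, indent=2))
```

Posting the printed body to http://localhost:9200/index/child_type/_count should return the difference between the two counts from the original question, if the assumptions above hold.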