Query with "has_child" filter and date range


#1

Hi,

Elasticsearch 5.1 data structure:

"test123": {
    "mappings": {
        "doctype1": {
            "dynamic": "strict",
            "_parent": {
                "type": "doctype2"
            },
            "_routing": {
                "required": true
            },
            "properties": {
                "title": {
                    "type": "text"
                }
            }
        },
        "doctype2": {...},
        "doctype3": {
            "dynamic": "strict",
            "_parent": {
                "type": "doctype1"
            },
            "_routing": {
                "required": true
            },
            "properties": {
                "start_datetime": {
                    "type": "date"
                }
            }
        }
    }
}

Any idea why the last query below returns only 18 hits? It should return 474, because every document in doctype3 has start_datetime = datetime.datetime(2017, 2, 14, 6, 39, 36, 989408).

Query on http://localhost:9200/test123/doctype1/_search - result: 474

{
   "_source":{
      "includes":[
         "title"
      ]
   },
   "query":{
      "match_all":{

      }
   }
}

Query on http://localhost:9200/test123/doctype1/_search - result: 474

{
   "_source":{
      "includes":[
         "title"
      ]
   },
   "query":{
      "bool":{
         "filter":[
            {
               "has_child":{
                  "type":"doctype3"
               }
            }
         ]
      }
   }
}

Query on http://localhost:9200/test123/doctype3/_search - result: 2495

{
   "_source":{
      "includes":[
         "start_datetime"
      ]
   },
   "query":{
      "bool":{
         "filter":[
            {
               "range":{
                  "start_datetime":{
                     "lte":"2017-02-15",
                     "gte":"2017-02-13"
                  }
               }
            }
         ]
      }
   }
}

Query on http://localhost:9200/test123/doctype1/_search - result: 18, why?

{
   "_source":{
      "includes":[
         "_id",
         "title"
      ]
   },
   "query":{
      "bool":{
         "filter":[
            {
               "has_child":{
                  "type":"doctype3",
                  "query":{
                     "range":{
                        "start_datetime":{
                           "gte":"2017-02-13",
                           "lte":"2017-02-15"
                        }
                     }
                  }
               }
            }
         ]
      }
   }
}

Maybe the query is incorrect?


(Adrien Grand) #2

Note that the query below is invalid. The fact that no child query is provided makes Elasticsearch ignore the query entirely. We had to do this for backward-compatibility reasons; as of Elasticsearch 6.0, such a query will fail with an error complaining that no query object is provided under the has_child query. Can you try running it again with a match_all query, which I think is what you wanted to test?

{
   "_source":{
      "includes":[
         "title"
      ]
   },
   "query":{
      "bool":{
         "filter":[
            {
               "has_child":{
                  "type":"doctype3"
               }
            }
         ]
      }
   }
}

#3

Thanks, good to know, because I thought this query was correct. Now I see that the issue is not with the date range but with the parent-child relationship. This query:

{
   "_source":{
      "includes":[
         "title"
      ]
   },
   "query":{
      "bool":{
         "filter":[
            {
               "has_child":{
                  "type":"performance",
                  "query": {
                  	"match_all": {}
                  }
               }
            }
         ]
      }
   }
}

returns an incorrect number of results. I noticed that the number of results changes, seemingly at random, after every reindex. Could it be an issue with shards, since a parent and its children must live on the same shard? Should I set routing when indexing the data? My mapping looks like this (I removed most of the properties):

"test123":{
   "mappings":{
      "doctype1":{
         "dynamic":"strict",
         "properties":{
            "name":{
               "type":"keyword"
            }
         }
      },
      "doctype2":{
         "dynamic":"strict",
         "_parent":{
            "type":"doctype1"
         },
         "_routing":{
            "required":true
         },
         "properties":{
            "title":{
               "type":"text"
            }
         }
      },
      "doctype3":{
         "dynamic":"strict",
         "_parent":{
            "type":"doctype2"
         },
         "_routing":{
            "required":true
         },
         "properties":{
            "id":{
               "type":"integer"
            },
            "start":{
               "type":"date"
            }
         }
      },
      "doctype4":{
         "dynamic":"strict",
         "_parent":{
            "type":"doctype4"
         },
         "_routing":{
            "required":true
         },
         "properties":{
            "section":{
               "type":"keyword"
            }
         }
      }
   }
}

I want to be able to get all documents from doctype2 that have a doctype3 child whose start property falls within some date range.


(Adrien Grand) #4

Indeed. You have 3 levels of parent/child, so you should make sure to use the id of the grandparent as the routing key for every index operation on a document in doctype3.
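
To make the routing advice concrete, here is a minimal sketch that builds the index request paths for one document at each level, assuming the mapping from post #3 (doctype1 -> doctype2 -> doctype3). The helper name index_path and the ids g1, p1 and c1 are hypothetical, made up for illustration. The parent and routing query parameters are the ones the Elasticsearch 5.x index API accepts. The sketch only builds strings and does not send anything to a cluster.

```python
from urllib.parse import urlencode


def index_path(index, doc_type, doc_id, parent=None, routing=None):
    """Build the path and query string of an index request (hypothetical helper)."""
    params = {}
    if parent is not None:
        params["parent"] = parent
    if routing is not None:
        params["routing"] = routing
    path = "/{}/{}/{}".format(index, doc_type, doc_id)
    return path + ("?" + urlencode(params) if params else "")


# Hypothetical ids: g1 (doctype1) -> p1 (doctype2) -> c1 (doctype3).
# The top-level document is routed by its own id by default.
grandparent = index_path("test123", "doctype1", "g1")

# A direct child is routed by its parent id by default, which here is g1.
parent = index_path("test123", "doctype2", "p1", parent="g1")

# A grandchild would be routed by p1 by default, which may hash to a
# different shard than g1, so routing is set explicitly to the grandparent id.
child = index_path("test123", "doctype3", "c1", parent="p1", routing="g1")

for path in (grandparent, parent, child):
    print("PUT " + path)
```

With routing set this way all three documents share the routing value g1, so they end up on the same shard, and has_child can see the doctype3 children of every doctype2 document.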

