Sorting Parent Documents by nested inner hits


(Phil Yardley) #1

Hi,

We have a set of documents that need sorting when querying. sorting at the nested level seems to be ok, but we need to sort by a field on a nested object within another nested object and can't for the life of us work this out :frowning:

The basic document structure looks like:

{
	"properties" : {
		"doc" : {
			"properties" : {
				"productCode" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } },
				"productSku" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } },
				//
				// Truncated for clarity
				//
				"products" : {
					"type" : "nested",
					"include_in_parent" : true,
					"properties" : {
						"departureDate" : {"type" : "date", "fields" : { "keyword" : { "type" : "keyword" }, "raw2" : { "type" : "string", "index" : "not_analyzed"} } },
						"travelBy" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } },
						//
						// Truncated for clarity
						//
						"calculatedPrices" : {
							"type" : "nested",
							"properties" : {
								"accommodationCode" : {"type" : "string", "fields" : {"keyword" : {"type" : "keyword" } } },
								"was" : {"type" : "double"},
								"now" : {"type" : "double"},
								"isSupplemented" : {"type" : "boolean"},
								"isDefaultPrice" : {"type" : "boolean"},
								//
								// Truncated for clarity
								//
							}
						}
					}
				}
			}
		}
	}
}

we need to sort by the doc.products.calculatedPrices.now field in either asc or desc order - but only on those that are within the inner hits not values that don't match the rest of the select query.

we can sort fairly easily on doc.products.departureDate for example - but not the next level down.

Is this even possible?

any pointers would be appreciated.

Phil


(Adrien Grand) #2

I don't think it is possible indeed.


(Phil Yardley) #3

Thanks @jpountz,

We've flattened the structure by one level, so we now have one level of nesting we're concerned about...

	"properties" : {
		"doc" : {
			"properties" : {
				"productCode" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } },
				"productSku" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } },
				//
				// Truncated for clarity
				//
				"products" : {
					"type" : "nested",
					"include_in_parent" : true,
					"properties" : {
						"departureDate" : {"type" : "date", "fields" : { "keyword" : { "type" : "keyword" }, "raw2" : { "type" : "string", "index" : "not_analyzed"} } },
						"travelBy" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword" } } },
						//
						// Truncated for clarity
						//
						"accommodationCode" : {"type" : "string", "fields" : {"keyword" : {"type" : "keyword" } } },
						"was" : {"type" : "double"},
						"now" : {"type" : "double"},
						"isSupplemented" : {"type" : "boolean"},
						"isDefaultPrice" : {"type" : "boolean"}
					}
				}
			}
		}
	}
}

We want to sort by the NOW field on product. however, when sorting and applying the nested_filter (the same as the initial query) it is sorting on the lowest /highest possible value in the nest, not those in the inner hits (ie the ones that meet the criteria).

This should be possible surely?

thanks

Phil


(Phil Yardley) #5

example (truncated) output:

    "hits": {
        "total": 18,
        "max_score": null,
        "hits": [
            {
                "_index": "testavail",
                "_type": "couchbaseDocument",
                "_id": "CSBOUA7",
                "_score": null,
                "_source": {
                    "doc": {
                        "productSku": "Bournemouth, The Savoy Hotel B&B"
                    }
                },
                "sort": [
                    362.7
                ],
                "inner_hits": {
                    "doc.products": {
                        "hits": {
                            "total": 4,
                            "max_score": null,
                            "hits": [
                                {
                                    "_nested": {
                                        "field": "doc.products",
                                        "offset": 29
                                    },
                                    "_score": null,
                                    "fields": {
                                        "doc.products.now": [
                                            397.1
                                        ],
                                        "doc.products.productCode.keyword": [
                                            "CSBOUA7"
                                        ],
                                        "doc.products.pax": [
                                            2
                                        ],
                                        "doc.products.promoCode.keyword": [
                                            "GBK318"
                                        ],
                                        "doc.products.isDiscounted": [
                                            true
                                        ]
                                    },
                                    "sort": [
                                        397.1
                                    ]
                                },
                                {
                                    "_nested": {
                                        "field": "doc.products",
                                        "offset": 9
                                    },
                                    "_score": null,
                                    "fields": {
                                        "doc.products.now": [
                                            397.1
                                        ],
                                        "doc.products.productCode.keyword": [
                                            "CSBOUA7"
                                        ],
                                        "doc.products.pax": [
                                            2
                                        ],
                                        "doc.products.promoCode.keyword": [
                                            "GBK318"
                                        ],
                                        "doc.products.isDiscounted": [
                                            true
                                        ]
                                    },
                                    "sort": [
                                        397.1
                                    ]
                                },
                                {
                                    "_nested": {
                                        "field": "doc.products",
                                        "offset": 4
                                    },
                                    "_score": null,
                                    "fields": {
                                        "doc.products.now": [
                                            397.1
                                        ],
                                        "doc.products.productCode.keyword": [
                                            "CSBOUA7"
                                        ],
                                        "doc.products.pax": [
                                            2
                                        ],
                                        "doc.products.promoCode.keyword": [
                                            "GBK318"
                                        ],
                                        "doc.products.isDiscounted": [
                                            true
                                        ]
                                    },
                                    "sort": [
                                        397.1
                                    ]
                                },
                                {
                                    "_nested": {
                                        "field": "doc.products",
                                        "offset": 19
                                    },
                                    "_score": null,
                                    "fields": {
                                        "doc.products.now": [
                                            446.1
                                        ],
                                        "doc.products.productCode.keyword": [
                                            "CSBOUA7"
                                        ],
                                        "doc.products.pax": [
                                            2
                                        ],
                                        "doc.products.promoCode.keyword": [
                                            "GBK318"
                                        ],
                                        "doc.products.isDiscounted": [
                                            true
                                        ]
                                    },
                                    "sort": [
                                        446.1
                                    ]
                                }
                            ]
                        }
                    }
                }
            },

as we can see, the lowest price in the inner hits is 397.1, but it is sorting on an "outer-hit" of 362.7


(Phil Yardley) #6

Hi @mvg, @jpountz

Heavily simplified query below.... according to the example here: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-request-sort.html#nested-sorting this should work?

{
	"from" : 0,
	"size" : 10,
	"_source" : {
		"includes" : [
			"doc.productSku"
		]
	},
	"query" : {
		"bool" : {
			"must" : [{
					"match" : {
						"doc.productType" : {
							"query" : "SelfDrive"
						}
					}
				}, {
					"nested" : {
						"query" : {
							"bool" : {
								"must" : [{
										"range" : {
											"doc.products.departureDate" : {
												"gte" : "2017-07-19T00:00:00",
												"lte" : "2017-08-30T00:00:00"
											}
										}
									}, {
										"match" : {
											"doc.products.promoCode" : {
												"query" : "GBK318"
											}
										}
									}, {
										"range" : {
											"doc.products.now" : {
												"gte" : 397.0,
												"lte" : 600.0
											}
										}
									}

								]
							}
						},
						"path" : "doc.products",
						"inner_hits" : {
							"size" : 1000,
							"_source" : false,
							"fielddata_fields" : [
								"doc.products.now"
							]
						}

					}
				}

			]
		}
	},
	"sort" : [{

			"doc.products.now" : {
				"order" : "asc",
				"mode" : "min",
				"nested_path" : "doc.products",
				"nested_filter" : {
					"bool" : {
						"must" : [{
								"match" : {
									"doc.productType" : {
										"query" : "SelfDrive"
									}
								}
							}, {
								"nested" : {
									"query" : {
										"bool" : {
											"must" : [{
													"range" : {
														"doc.products.departureDate" : {
															"gte" : "2017-07-19T00:00:00",
															"lte" : "2017-08-30T00:00:00"
														}
													}
												}, {
													"match" : {
														"doc.products.promoCode" : {
															"query" : "GBK318"
														}
													}
												}, {
													"range" : {
														"doc.products.now" : {
															"gte" : 397.0,
															"lte" : 600.0
														}
													}
												}
											]
										}
									},
									"path" : "doc.products",
									"inner_hits" : {
										"size" : 1000,
										"_source" : false,
										"fielddata_fields" : [
											"doc.products.now"
										]
									}
								}
							}

						]
					}

				}
			}
		}
	]
}

would be grateful for any assistance as this project needs to be ready for UAT next week !

Phil


(Phil Yardley) #7

noticed having this line in the sort sets the parent sort value to "infinity" - but even removing this returns the parent to the min doc level value, not the min of the inner hits... :scream:


(Martijn Van Groningen) #8

It looks like the entire query is repeated in the nested_filter, only the inner query of the nested query should be repeated there. So only this bit:

{
"bool" : {
    "must" : [{
            "range" : {
                "doc.products.departureDate" : {
                    "gte" : "2017-07-19T00:00:00",
                    "lte" : "2017-08-30T00:00:00"
                }
            }
        }, {
            "match" : {
                "doc.products.promoCode" : {
                    "query" : "GBK318"
                }
            }
        }, {
            "range" : {
                "doc.products.now" : {
                    "gte" : 397.0,
                    "lte" : 600.0
                }
            }
        }

    ]
}

Also the inner hits inside the nested sort has no use, so that should be removed.

With these changes the sort value of the top level hits should be equal to the now field of the first inner hits hit.


(Phil Yardley) #9

Hi @mvg,

Once again you saved our bacon :wink:

The documentation may need to be updated for us n00bs as it says to repeat the query rather than the nested part of the query :frowning:

Many thanks

Phil


(system) #10

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.