Get cheapest trip departure based on inner hits

I have an index containing trips. Each trip has multiple departures, and each departure has its own departure date and a collection of prices (each in a different currency).

I want to be able to query and find all trips within a certain date range and then for each of those trips, I want to highlight the cheapest price from the matching departures.

My data structure looks something like this:

{
	"id": 123,
	"name": "Cool trip",
	"highlightedDeparture": {
		"startDate": "2020-02-01",
		"yearMonth": 202002,
		"prices": [
			{
				"currencyCode": "GBP",
				"price": 99
			}
		]
	},
	"departures": [
		{
			"startDate": "2020-01-07",
			"yearMonth": 202001,
			"prices": [
				{
					"currencyCode": "GBP",
					"price": 123
				}
			]
		},
		{
			"startDate": "2020-01-14",
			"yearMonth": 202001,
			"prices": [
				{
					"currencyCode": "GBP",
					"price": 456
				}
			]
		},
		{
			"startDate": "2020-01-21",
			"yearMonth": 202001,
			"prices": [
				{
					"currencyCode": "GBP",
					"price": 789
				}
			]
		},
		{
			"startDate": "2020-02-01",
			"yearMonth": 202002,
			"prices": [
				{
					"currencyCode": "GBP",
					"price": 99
				}
			]
		}
	]
}

Currently, when a trip is indexed, the highlighted departure defaults to the overall cheapest departure, which is pre-calculated before indexing.

I want to be able to run the following query and somehow have the highlightedDeparture field 'overridden' by the cheapest departure for January 2020, which would be the one with a price of 123.

{ 
   "query":{ 
      "bool":{ 
         "must":[ 
            { 
               "nested":{ 
                  "inner_hits":{ 
                     "name":"highlightedAvailability",
                     "sort":[ 
                        { 
                           "departures.prices.price":{ 
                              "nested":{ 
                                 "filter":{ 
                                    "term":{ 
                                       "departures.prices.currencyCode":{ 
                                          "value":"GBP"
                                       }
                                    }
                                 },
                                 "path":"departures.prices"
                              },
                              "order":"asc"
                           }
                        }
                     ]
                  },
                  "path":"departures",
                  "query":{ 
                     "term":{ 
                        "departures.yearMonth":{ 
                           "value":"202001"
                        }
                     }
                  }
               }
            }
         ]
      }
   },
   "_source":{ 
      "excludes":[ 
         "departures"
      ],
      "includes":[ 
         "*"
      ]
   }
}

My desired response from this query would look like this:

{
	"id": 123,
	"name": "Cool trip",
	"highlightedDeparture": {
		"startDate": "2020-01-07",
		"yearMonth": 202001,
		"prices": [
			{
				"currencyCode": "GBP",
				"price": 123
			}
		]
	}
}

Currently, I am achieving this by post-processing the Elasticsearch response and selecting the cheapest departure from the inner hits. This feels inefficient and adds latency. I'd like to know if this is possible natively in Elasticsearch.

FYI - The indexing can be changed if necessary.

I just tried your example locally and I think you are almost there. Adding "size": 1 to your inner_hits query and then checking

hits.hits[0].inner_hits.highlightedAvailability.hits.hits[0]._source

looks like the part that you are after. I might be missing something though...
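For reference, here is a rough sketch of the request body with "size": 1 added, written as a Python dict so it can be passed to whichever client you use. The function name `build_trip_query` and its parameters are hypothetical; the field names and the inner_hits name are taken from the query above.

```python
from typing import Any, Dict


def build_trip_query(year_month: int, currency: str = "GBP") -> Dict[str, Any]:
    """Build the search body from the question, with inner_hits limited to one hit.

    Hypothetical helper: it only constructs the dict and does not call Elasticsearch.
    """
    # Sort inner hits by the price in the requested currency, cheapest first.
    price_sort = {
        "departures.prices.price": {
            "order": "asc",
            "nested": {
                "path": "departures.prices",
                "filter": {
                    "term": {"departures.prices.currencyCode": {"value": currency}}
                },
            },
        }
    }
    nested_query = {
        "nested": {
            "path": "departures",
            "query": {"term": {"departures.yearMonth": {"value": year_month}}},
            "inner_hits": {
                "name": "highlightedAvailability",
                "size": 1,  # only the cheapest matching departure is returned
                "sort": [price_sort],
            },
        }
    }
    return {
        "query": {"bool": {"must": [nested_query]}},
        "_source": {"excludes": ["departures"], "includes": ["*"]},
    }
```

With the ascending sort and a size of 1, the single inner hit per trip should be the cheapest departure in the requested month and currency.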

Yeah, so I'm currently able to get the inner hits as expected from the location you mentioned. I'm just wondering if there's a way to override the highlightedDeparture field with the inner hit as part of the Elasticsearch query itself. At the moment I have to override it manually in my code once the query has finished and I have the response, which adds a bit of latency to my overall process. It's not a drastic hit, but I want to be as efficient as possible, and if there's a way to do it natively then I'd rather use that.

No, there is no way to override it. The _source is extracted directly from the stored JSON, so that part will need to be done on the client side.
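A minimal sketch of that client-side step, assuming the response has already been parsed into a dict and that the inner hits are sorted cheapest first as in the query above. The function name `override_highlighted_departure` is hypothetical and not part of any Elasticsearch client.

```python
from typing import Any, Dict, List


def override_highlighted_departure(
    response: Dict[str, Any],
    inner_hits_name: str = "highlightedAvailability",
) -> List[Dict[str, Any]]:
    """Return each trip's _source with highlightedDeparture replaced by the top inner hit."""
    trips = []
    for hit in response["hits"]["hits"]:
        trip = dict(hit["_source"])  # shallow copy so the response is not mutated
        inner = (
            hit.get("inner_hits", {})
            .get(inner_hits_name, {})
            .get("hits", {})
            .get("hits", [])
        )
        if inner:
            # The first inner hit is the cheapest because of the ascending price sort.
            trip["highlightedDeparture"] = inner[0]["_source"]
        trips.append(trip)
    return trips
```

Trips without any inner hits keep the highlightedDeparture that was pre-calculated at index time.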

Is this just flat out not possible in any way? I can change my indexing and document structure if necessary but no worries if not.
