Elasticsearch Filters Are Not Stacking

Hi all,

I am in the process of switching over to the new .NET client for Elasticsearch. As part of this switchover, we are trying to rewrite our Elasticsearch implementation of faceted search on our website. For a long time it has not functioned as we would expect a faceted search to work, with the counts returned by Elasticsearch never being correct. Part of the reason for this was that my predecessor did a lot of post-filtering in .NET, which meant the counts displayed on the page were all incorrect compared to what Elasticsearch knew about.

Fast forward to now: I am trying to put in place a proper implementation that uses Elasticsearch to do the bulk of the work for us, using a combination of must queries, filters, and aggregations to provide the list of facets by which we can filter products. However, despite following what I believe is the correct approach, Elasticsearch still doesn't appear to be playing ball or behaving as I would expect.

Consider the following document structure:

    public class Venue {
        public string Name { get; set; }
        public string Location { get; set; }
        public string PromotedLocations { get; set; } // Comma separated list of locations
        public List<Product> Products { get; set; } = new List<Product>();
    }

    public class Product {
        public int Nights { get; set; }
        public int Guests { get; set; }
        public List<PriceData> Pricing { get; set; } = new List<PriceData>();
    }

    public class PriceData {
        public double SalePrice { get; set; }
        public double WasPrice { get; set; }
        public DateTime StartDate { get; set; }
        public DateTime EndDate { get; set; }
    }

What we are trying to achieve is a faceted search that allows a user to search for a location and have all of the venues available in that location returned. The location search is performed as a must query, so it is the primary factor the search is conducted on. We then supply filters, based on aggregations generated from our initial query, for the number of nights and the maximum number of guests supported by a venue through the products available there.

However, the filters don't appear to be working as we would expect. If we pass in 2 nights and 2 max guests, we would expect Elasticsearch to return only documents that contain a product with that combination. Instead, it seems to be choosing one or the other.

Obviously we don't expect Elasticsearch to remove the packages from the hotel if they are not relevant, as we understand that its purpose is to return documents relevant to the search, so we then filter down the internal collection of products ourselves to meet the criteria we are searching for. Our assumption is that, because a hotel has been returned for a filter of 2 nights and 2 max guests, there must be a package in it that satisfies both of those criteria. However, Elasticsearch is returning items that satisfy only one of those criteria.

Are we doing something wrong? Is there a misconfiguration in how filters are applied or is there something we are missing to ensure all filters are relevant to the results returned?

The following is the code we have in place for this:

    Dictionary<string, Aggregation> aggregations = new Dictionary<string, Aggregation>();
    var filters = new List<Query>();
    var mustQueries = new List<Query>();
    var shouldQueries = new List<Query>();

    // Facet buckets for the guests and nights values of the products.
    aggregations.Add("guests", Aggregation.Terms(new TermsAggregation()
        {
            Field = "products.guests",
            Size = 500
        })
    );
    aggregations.Add("nights", Aggregation.Terms(new TermsAggregation()
        {
            Field = "products.nights",
            Size = 500
        })
    );

    var searchRequest = new SearchRequest("our-index-name")
    {
        From = 0,
        Size = 10,
        Aggregations = aggregations
    };

    // Filter on the selected nights values, if any.
    if (request.Filters.Nights.Any())
    {
        var query = new TermsQuery()
        {
            Field = "products.nights",
            Terms = new TermsQueryField(request.Filters.Nights.Select(n => FieldValue.Long(n)).ToList())
        };
        filters.Add(query);
    }

    // Filter on the selected guests values, if any.
    if (request.Filters.Guests.Any())
    {
        var query = new TermsQuery()
        {
            Field = "products.guests",
            Terms = new TermsQueryField(request.Filters.Guests.Select(n => FieldValue.Long(n)).ToList())
        };
        filters.Add(query);
    }

    // The location search term is the main (scoring) part of the query.
    if (!string.IsNullOrWhiteSpace(request.SearchTerm))
    {
        var query = new MultiMatchQuery()
        {
            Fields = "location,promotedLocations",
            Query = request.SearchTerm
        };
        mustQueries.Add(query);
    }

    searchRequest.Query = new BoolQuery()
    {
        Must = mustQueries.Any() ? mustQueries : null,
        Should = shouldQueries.Any() ? shouldQueries : null,
        Filter = filters
    };

    SearchResponse<VenueResult> response = await _elasticsearchClient.SearchAsync<VenueResult>(searchRequest);

Following this, we use LINQ to filter down the products inside each returned document to match the nights and guests in the filter. However, there are many occurrences where this collection comes through as empty: Elasticsearch sees that nights is 2, let's say, and if we search for a guests value of 2 as well, it filters on nights being 2 but doesn't seem to care about the guests filter we have applied.

Any ideas how to make this work correctly and actually filter out results that do not match the given criteria?

We tried putting the values as additional must parameters as well but that didn't work either.

Just for some additional context, the reason the document is stored in this way is because when we display the items in our search results we show several packages inside a box for each venue but those packages have to be relevant to the given search. So you might have a venue shown with 3 packages for that venue. However, we never want to show duplicate venue blocks in the search. Therefore, the grouping we have in the objects above is necessary.

I am not a .NET developer, so I will not be able to comment on the usage or functionality of the client. It does, however, seem like you are storing large documents with 2 levels of nesting, and your queries appear (as far as I can see) to filter on multiple fields within sub-documents. For this to work as expected you typically need to map the sub-documents as nested documents in your index template/mapping. Without that, Elasticsearch flattens arrays of objects, so the link between the nights and guests values of an individual product is lost, and a venue matches if any one product has 2 nights and any other product has 2 guests. This will also require you to use nested query clauses in your queries. As it looks like you are modelling a relationship, I would also recommend looking at this part of the docs and possibly re-evaluating how you are modelling your data, as it may not be the best way from an Elasticsearch perspective.
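
As a rough sketch in raw JSON rather than the .NET client, and assuming the fields are indexed as `products.nights` and `products.guests` as in your queries, the mapping would mark `products` as `nested`. The `integer` types are my assumption based on your classes, and `pricing` is left out here; it would only need to be `nested` as well if you filter on several pricing fields together. Note that an existing field cannot be switched to `nested` in place, so this means creating a new index and reindexing.

```json
{
  "mappings": {
    "properties": {
      "products": {
        "type": "nested",
        "properties": {
          "nights": { "type": "integer" },
          "guests": { "type": "integer" }
        }
      }
    }
  }
}
```

The two filters then go inside a single `nested` clause, so that both conditions have to hold on the same product. The search term `"london"` and the values `[2]` are placeholders. The optional `inner_hits` section asks Elasticsearch to return the matching products for each venue, which may let you drop the filtering you do afterwards in code.

```json
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "london",
            "fields": ["location", "promotedLocations"]
          }
        }
      ],
      "filter": [
        {
          "nested": {
            "path": "products",
            "query": {
              "bool": {
                "filter": [
                  { "terms": { "products.nights": [2] } },
                  { "terms": { "products.guests": [2] } }
                ]
              }
            },
            "inner_hits": {}
          }
        }
      ]
    }
  }
}
```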

Thanks for that. It's a step in the right direction, as we have to nest our documents in this way. The only thing I'm now struggling with is the nested aggregations, as the documentation for the .NET client is really poor.
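
For anyone following along, the raw Query DSL shape that a nested aggregation has to serialize to is roughly the following. This is only a sketch, assuming `products` is mapped as `nested`; the aggregation names `products`, `nights`, `guests`, and `venues` are arbitrary labels, and the .NET client equivalent is not shown. The terms buckets sit underneath a `nested` aggregation on the `products` path, so their counts are counts of products. The `reverse_nested` sub-aggregation steps back up to the parent document, which gives the number of venues per bucket if that is what the facet should display.

```json
{
  "aggs": {
    "products": {
      "nested": { "path": "products" },
      "aggs": {
        "nights": {
          "terms": { "field": "products.nights", "size": 500 },
          "aggs": {
            "venues": { "reverse_nested": {} }
          }
        },
        "guests": {
          "terms": { "field": "products.guests", "size": 500 },
          "aggs": {
            "venues": { "reverse_nested": {} }
          }
        }
      }
    }
  }
}
```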