Deduce filters from search string

Hi,

let's say I store a catalogue of household appliances in Elasticsearch. Documents in my index can represent dish washers, washing machines, freezers and similar devices.

Dish washers have a text field "type" which can contain one of the following strings:

  • integrated
  • semi-integrated
  • freestanding
  • table top

If a search string contains any of these terms, I want to automatically apply a filter to my query.
For example, if the search string contains 'table top', I want to select only documents where type is exactly 'table top'.

Dish washer type is only one of many filters I want to apply automatically. If a user was searching for a cooker, the type could be something like 'gas cooker' or 'induction cooker'. Of course there can be more than one filter per device. A cooker could also be 'with grill' or 'without grill'. That information would be in a dedicated field 'grill' and I would like to apply a filter on that field as well if the search string contains 'with grill' or 'without grill'.

Pseudo code for what I want to achieve in Elasticsearch:

  • search string = 'semi-integrated dish washer' should only return documents where type = 'semi-integrated'
  • search string = 'gas cooker with grill' should only return documents where type = 'gas cooker' and grill = 'with grill'

I could try to solve this outside of Elasticsearch, for example in plain Java code, but then I could not take advantage of Elasticsearch features like synonyms. So: what is the best way to achieve this in Elasticsearch?

This is something I'd call "category snapping" - detecting category names in the text of simple query input and rewriting the query into a more structured query of the form:

bool
    must
        term
            categoryKeyword: integrated
        query_string
            freeTextField: Bosch dishwasher

That's a very domain-specific set of pattern-matching rules you need to pull together, and Querqy might be a useful framework for managing them.
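To make the rewrite concrete, here is a minimal client-side sketch in Python. It is not Querqy and not a built-in Elasticsearch feature, just an illustration of the idea. The rule table, the function names and the free-text field name `freeTextField` are assumptions for illustration, and the `term` filters assume that `type` and `grill` are mapped as keyword fields.

```python
import re

# Hypothetical rule table: field name -> phrases that should snap to a filter.
FILTER_RULES = {
    "type": ["semi-integrated", "integrated", "freestanding", "table top",
             "gas cooker", "induction cooker"],
    "grill": ["without grill", "with grill"],
}


def snap_filters(search_string):
    """Return (filters, leftover free text) for a raw search string."""
    remaining = search_string.lower()
    filters = {}
    for field, phrases in FILTER_RULES.items():
        # Try longer phrases first and stop at the first hit per field.
        for phrase in sorted(phrases, key=len, reverse=True):
            # Lookarounds stop 'integrated' from matching inside 'semi-integrated'.
            pattern = r"(?<![\w-])" + re.escape(phrase) + r"(?![\w-])"
            if re.search(pattern, remaining):
                filters[field] = phrase
                remaining = re.sub(pattern, " ", remaining, count=1)
                break
    return filters, " ".join(remaining.split())


def build_query(search_string, text_field="freeTextField"):
    """Build a bool query: detected categories as filters, the rest as a match."""
    filters, free_text = snap_filters(search_string)
    bool_query = {
        "filter": [{"term": {field: value}} for field, value in filters.items()]
    }
    if free_text:
        bool_query["must"] = [{"match": {text_field: free_text}}]
    return {"query": {"bool": bool_query}}
```

With these rules, 'semi-integrated dish washer' yields a term filter on type plus a match on 'dish washer', and 'gas cooker with grill' yields two term filters and no free-text clause. Because the matching happens outside Elasticsearch, analyzer features such as synonyms do not apply to the rule phrases here; that is the part a rules framework would have to cover.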

Another approach, which relies less on manual rule curation, is to suggest relevant filters or facets statistically by analysing the top matches. Check out the comments below this YouTube video for a way of selecting relevant filters to suggest to your searchers.
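As a rough illustration of the statistical idea (one possible reading, not necessarily what the comments under the video describe), the sketch below asks Elasticsearch for a `terms` aggregation nested under a `sampler` aggregation, so that only the top-scoring matches are counted, and then suggests a filter when a single value dominates that sample. The function names, the `freeTextField` field, the sample size and the 60% threshold are all assumptions for illustration.

```python
def suggest_filters_request(search_string, text_field="freeTextField",
                            facet_fields=("type", "grill"), sample_size=100):
    """Build a search body that counts facet values among the top matches only."""
    facet_aggs = {f: {"terms": {"field": f, "size": 5}} for f in facet_fields}
    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "query": {"match": {text_field: search_string}},
        "aggs": {
            "top_matches": {
                # sampler restricts sub-aggregations to the best-scoring docs per shard
                "sampler": {"shard_size": sample_size},
                "aggs": facet_aggs,
            }
        },
    }


def dominant_values(response, min_share=0.6):
    """Pick, per field, the first value covering at least min_share of the sample."""
    sample = response["aggregations"]["top_matches"]
    total = sample["doc_count"]
    suggestions = {}
    for field, agg in sample.items():
        if not isinstance(agg, dict) or "buckets" not in agg:
            continue  # skip doc_count and other non-bucket entries
        for bucket in agg["buckets"]:
            if total and bucket["doc_count"] / total >= min_share:
                suggestions[field] = bucket["key"]
                break
    return suggestions
```

The result of `dominant_values` could be shown as suggested filters rather than applied silently, since a dominant value in the top matches is a hint and not proof of what the searcher meant.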

Thanks a lot, Mark!

Querqy looks very promising and I will try to implement my use case with it. I will also check out the YouTube video tomorrow.

No worries.

"I will also check out the youtube video tomorrow"

The video might be useful background on the business problem, but their proposed Elasticsearch solution is quite inefficient, hence my comments below the video.
