Autocomplete on many fields, sorted and without duplicates


#1

I have an index of products containing fields such as product name, category, product id, product description, etc.
I want an autocomplete feature that lists product names, categories, and product ids in a specific order and without duplicates.
For example, if the user enters 'comp', it should return the following suggestions:

  • Computers and Accessories - category
  • Computer cases - category
  • some other categories
  • HP Elite Professional Desktop PC Computer - product name
  • some other product names containing 'comp'
  • COMP25873 - product id
  • some other product ids containing 'comp'

I tried the following options:

  1. I have a separate index for suggestions where I store only the distinct values of product name, category, and product id, plus an enum indicating which field the value came from so that I can order by it. I use edge ngrams to handle starts-with matching:
    [
      {
        "text": "Computers and Accessories",
        "column": 1
      },
      {
        "text": "HP Elite Professional Desktop PC Computer",
        "column": 2
      }
    ]
    This solution works best, except that I need to maintain a separate index.
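
    A minimal sketch of how such an index could be queried, assuming `text` is analyzed with edge ngrams at index time and with a plain (non-ngram) analyzer at search time. The `size` value is illustrative:

    ```json
    {
      "size": 10,
      "query": { "match": { "text": { "query": "comp" } } },
      "sort": [
        { "column": { "order": "asc" } },
        { "_score": { "order": "desc" } }
      ]
    }
    ```

    Sorting on `column` first groups the suggestions by where they came from; `_score` only breaks ties within a group.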

  2. I added a completion-type field to the products index, where I put an array of suggestions:
    [
      {
        "productName": "HP Elite Professional Desktop PC Computer",
        "category": "Computers and Accessories",
        "suggestions": [
          {
            "input": ["Computers and Accessories"],
            "weight": 4
          },
          {
            "input": ["HP Elite Professional Desktop PC Computer"],
            "weight": 3
          }
        ]
      }
    ]
    I use a suggest query with the skip duplicates option and sort by score.
    It is fast and satisfies my requirements, but the disadvantage is that it works like a prefix query: the matched word must come first in the input.
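
    A sketch of what such a suggest request could look like against the `suggestions` field above. The suggester name `product-suggest` and the `size` value are hypothetical, and `skip_duplicates` assumes a version of Elasticsearch that supports it on the completion suggester:

    ```json
    {
      "_source": false,
      "suggest": {
        "product-suggest": {
          "prefix": "comp",
          "completion": {
            "field": "suggestions",
            "skip_duplicates": true,
            "size": 10
          }
        }
      }
    }
    ```

    With the documents above, 'comp' would match the input "Computers and Accessories" but not "HP Elite Professional Desktop PC Computer", because only the beginning of each input is matched.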

  3. I use the products index. I added edge ngram and keyword subfields to specific fields. Then I use a multi search request to query each field, aggregate on that field to remove duplicates, and return only that field. Finally, I concatenate the results from the multi search in the desired order:

    {"index":"products","type":"doc"}
    {"size":0,"_source":{"includes":["category"]},"aggs":{"categories":{"terms":{"field":"category.keyword"}}},"query":{"match":{"category.starts_with":{"query":"comp"}}}}
    {"index":"products","type":"doc"}
    {"size":0,"_source":{"includes":["productName"]},"aggs":{"productname":{"terms":{"field":"productName.keyword"}}},"query":{"match":{"productName.starts_with":{"query":"comp"}}}}
    ...
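
    For context, a sketch of how the `keyword` and `starts_with` subfields used in these queries could be mapped. The analyzer and filter names (`autocomplete`, `autocomplete_filter`) and the gram sizes are hypothetical, and the `doc` mapping type assumes the same typed index layout as the request headers above; only `category` is shown:

    ```json
    {
      "settings": {
        "analysis": {
          "filter": {
            "autocomplete_filter": { "type": "edge_ngram", "min_gram": 1, "max_gram": 20 }
          },
          "analyzer": {
            "autocomplete": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": ["lowercase", "autocomplete_filter"]
            }
          }
        }
      },
      "mappings": {
        "doc": {
          "properties": {
            "category": {
              "type": "text",
              "fields": {
                "keyword": { "type": "keyword" },
                "starts_with": {
                  "type": "text",
                  "analyzer": "autocomplete",
                  "search_analyzer": "standard"
                }
              }
            }
          }
        }
      }
    }
    ```

    Because the edge ngram filter runs after the standard tokenizer in this sketch, 'comp' would match any word in the field that starts with 'comp', not only the first one.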

Do you know of any other solution that would work in my case, or any improvements to the ones mentioned?

