Autocomplete across multiple fields

Hello,

I read about two search-as-you-type techniques: one from the "Definitive Guide" using edge-ngrams, and another using the prefix-based completion suggester at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html. These gave me an initial overview of the ES autocomplete capabilities and shed light on one possible way to do it.

This is a typical address autocompletion, which involves 5 fields: [Country, City, District, Ward, Street].
For example, a full address might be ['Canada', 'Toronto', 'East York', 'Crescent Town', 'The Market Pl'] (almost made up for illustration purposes).
The Country field is excluded from the input for now, so the query should run only across city, district, ward and street.
The user can start typing anything that comes to mind (true random input generator in place!):
East York
York
Crescent
Market Pl
etc.

What is the suggested approach to creating a robust and effective autocompletion solution in this situation?
Should I combine all those fields into one and use edge-ngrams?

Hey,

if you use the completion suggester, there is no need for edge-ngrams. It is a prefix suggester, however (please make sure that you understand the limitations of this!), so each of those values needs to be its own input. There are two strategies now. Either have a separate suggester for each field and merge the results on the client side, or take those five field contents and create a single suggester field that returns a well-normalized output.

You should try out both; however, the latter one, which does not require client-side merging, might be easier.

--Alex

Hello @spinscale!

I'm trying to put together a completion suggester for all 4 address fields as described at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
As far as I understand, there are two limitations:

  • it takes time to build the FST during document indexing
  • terms will be looked up from the beginning of the field's value, so a user can type "Tor" to see "Toronto", but won't get anything for typing "onto"
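
The second limitation can be shown without Elasticsearch at all. The sketch below is plain Python that only mimics prefix lookup over a list of inputs. It is not how the FST works internally, `prefix_matches` is a hypothetical helper name, and the case-insensitive comparison is an assumption about the analyzer in use.

```python
def prefix_matches(inputs, typed):
    """Return the inputs that start with the typed text, ignoring case."""
    typed = typed.lower()
    return [value for value in inputs if value.lower().startswith(typed)]


inputs = ["Toronto", "East York", "Crescent Town", "The Market Pl"]

print(prefix_matches(inputs, "Tor"))   # ['Toronto']
print(prefix_matches(inputs, "onto"))  # []
print(prefix_matches(inputs, "York"))  # []  ("East York" starts with "East")
```

The last call is the relevant one for addresses: typing "York" finds nothing unless "York" is also indexed as an input of its own.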

From reading https://www.elastic.co/blog/you-complete-me I was able to derive only how to build autocompletion by combining all terms as inputs for the address_suggest field:

curl -X PUT localhost:9200/hotels/hotel/2 -d '
{
  "country":  "Canada",
  "city":     "Toronto",
  "district": "East York",
  "ward":     "Crescent Town",
  "street":   "The Market Pl",
  "address_suggest": {
    "input": [
      "Toronto",
      "East York",
      "Crescent Town",
      "The Market Pl"
    ],
    "output": "Toronto, East York, Crescent Town, The Market Pl"
  }
}'
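
Since lookups are prefix-only, inputs such as "York" or "Market Pl" from the first post would not match the document above. One possible workaround, sketched here in Python under the assumption that the document is built client-side before indexing, is to add one input per word start. The helper names `suffix_inputs` and `build_address_suggest` are made up for this illustration.

```python
def suffix_inputs(value):
    """'East York' -> ['East York', 'York']: one input per word start."""
    words = value.split()
    return [" ".join(words[i:]) for i in range(len(words))]


def build_address_suggest(city, district, ward, street):
    """Build the address_suggest body with an input for every word start."""
    inputs = []
    for part in (city, district, ward, street):
        for candidate in suffix_inputs(part):
            if candidate not in inputs:  # keep order, skip duplicates
                inputs.append(candidate)
    return {
        "input": inputs,
        "output": ", ".join([city, district, ward, street]),
    }


suggest = build_address_suggest(
    "Toronto", "East York", "Crescent Town", "The Market Pl"
)
print(suggest["input"])
```

This grows the number of inputs per document, so it trades index size for being able to match from any word in the address.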

Is this the only possible way? If there is another way, and it's not too much trouble, I would really appreciate an example of a mapping and a query.

Thanks.

Hey,

the other way I mentioned was having dedicated city_suggest, suburb_suggest and street_suggest fields, querying all three and combining the responses manually. As you can see, this would be more work, so it might be easier to use what you already tried. Also, don't forget about the weights for scoring in your example :)

--Alex

Hey @spinscale, I simplified the structure a little bit (there is no ward now), and here is what I came up with:

PUT /autocomplete
{
  "mappings": {
    "address" : {
      "properties" : {
        "city":     { "type" : "string" },
        "district": { "type" : "string" },
        "street":   { "type" : "string" },
        "city_suggest": {
          "type":   "completion"
        },
        "district_suggest": {
          "type":   "completion"
        },
        "street_suggest": {
          "type":  "completion"
        }
      } 
    }
  }
}

POST /autocomplete/address/1
{
  "city":           "Kiev",
  "district":       "Shevchenkovsky",
  "street":         "Shevchenko",
  "city_suggest": {
      "input":  ["Kiev"],
      "weight": 1
  },
  "district_suggest": {
      "input":  ["Shevchenkovsky"],
      "output": "Kiev, Shevchenkovsky",
      "weight": 10
  },
  "street_suggest": {
      "input":  ["Shevchenko"],
      "output": "Kiev, Shevchenkovsky, Shevchenko",
      "weight": 100
  }
}

POST /autocomplete/address/2
{
  "city":               "Kiev",
  "district":           "Shevchenkovsky",
  "street":             "Pobedy ave",
  "city_suggest": {
      "input":  ["Kiev"],
      "weight": 1
  },
  "district_suggest": {
      "input":  ["Shevchenkovsky"],
      "output": "Kiev, Shevchenkovsky",
      "weight": 10
  },
  "street_suggest": {
      "input":  ["Pobedy ave"],
      "output": "Kiev, Shevchenkovsky, Pobedy ave",
      "weight": 100
  }
}

Now, I had hoped to pass all of the *_suggest fields in the field parameter of the query, but that yielded an error.

GET /autocomplete/_suggest
{
    "address": {
        "text" : "sh",
        "completion" : {
          "field" : ["city_suggest", "district_suggest", "street_suggest"]
        }
    }
}
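
The error is most likely because `field` expects a single field name rather than a list. In line with the "one suggester per field" strategy from earlier in the thread, one option is to send one named suggestion per field in the same request. The Python sketch below only builds such a request body. The function name `build_suggest_body` is hypothetical, and reusing the field names as suggestion names is an arbitrary choice.

```python
import json

SUGGEST_FIELDS = ["city_suggest", "district_suggest", "street_suggest"]


def build_suggest_body(text, fields=SUGGEST_FIELDS):
    """One named completion suggestion per field, all with the same text."""
    return {
        field: {"text": text, "completion": {"field": field}}
        for field in fields
    }


body = build_suggest_body("sh")
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of the same `_suggest` request, and the response should contain one result list per suggestion name, which still has to be merged on the client.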

In the end, I wanted to see something like this:

Kiev | Shevchenkovsky | Shevchenko 
Kiev | Shevchenkovsky
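
A list like this could be produced by merging the per-field results on the client side. The following is a minimal Python sketch, assuming the response holds one list of entries per suggestion name, each with an `options` list of `text`/`score` pairs, plus a `_shards` key. `merge_suggestions` is a hypothetical helper, and the sample response is hand-written for illustration, not real server output.

```python
def merge_suggestions(response, limit=10):
    """Flatten options from all suggestions, best score first, no duplicates."""
    options = []
    for name, entries in response.items():
        if name == "_shards":  # shard bookkeeping, not a suggestion
            continue
        for entry in entries:
            options.extend(entry.get("options", []))

    options.sort(key=lambda option: option["score"], reverse=True)

    merged, seen = [], set()
    for option in options:
        if option["text"] not in seen:
            seen.add(option["text"])
            merged.append(option["text"])
    return merged[:limit]


# Hand-written sample in the assumed response shape.
sample = {
    "_shards": {"total": 5, "successful": 5, "failed": 0},
    "city_suggest": [{"text": "sh", "offset": 0, "length": 2, "options": []}],
    "district_suggest": [{"text": "sh", "offset": 0, "length": 2, "options": [
        {"text": "Kiev, Shevchenkovsky", "score": 10.0}]}],
    "street_suggest": [{"text": "sh", "offset": 0, "length": 2, "options": [
        {"text": "Kiev, Shevchenkovsky, Shevchenko", "score": 100.0}]}],
}

for line in merge_suggestions(sample):
    print(line.replace(", ", " | "))
```

Because the street weight (100) is higher than the district weight (10), the full street address sorts ahead of the district-only entry.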