Autocomplete address search

foobar2022 · May 13, 2022, 11:21pm

Hi,
I'm trying to build an index and query for address search autocomplete on City, State & Country.
I have these structured data fields:

City
State
Country
Full (a single field that concatenates City, State & Country, separated by spaces)

I used the ngram approach, and that worked fine when I used it on City + Country. But when the field is longer, the ngram approach stops working. For example, I get no results for the query term 'san fran', but I do get a result for 'san francisco'.

I'm using this for indexing:

{
  "index_patterns": ["address_book*"],
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": [
            "lowercase"
          ]
        },
        "autocomplete_search": {
          "tokenizer": "lowercase"
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 50,
          "token_chars": [
            "letter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "full": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "autocomplete_search"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

And the query is:

{
  "query": {
    "match": {
      "full": {
        "query": "san fran",
        "operator": "and"
      }
    }
  }
}

I'm guessing that the index field (full) needs to be tokenized if that is not happening already?
Any suggestions on how I can get autocomplete working on partial matches?

Thank you
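
One way to check whether the full field is being tokenized as intended is the _analyze API. The body below is a minimal sketch, meant to be sent to GET address_book_test/_analyze, where address_book_test is a hypothetical index name that matches the template pattern above and is assumed to exist already:

```json
{
  "analyzer": "autocomplete",
  "text": "San Francisco California USA US"
}
```

If the analyzer is applied as configured, the response should list lowercase edge n-grams for each word (for instance "sa", "san", "fr", "fra", "fran"). If those tokens are missing, the template was probably not applied to the index being queried.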

stephenb · May 13, 2022, 11:54pm

Did you look at the search_as_you_type field type?

It should work pretty well on the full concatenated address.

As people start to type the address, you have to be a little careful on the front end so that you don't flood your back end with queries. You need a pause of a couple of milliseconds between submissions, but I'm not a front-end expert.

stephenb · May 14, 2022, 12:58am

Oh, and App Search pretty much has this out of the box.

foobar2022 · May 14, 2022, 1:22am

Ah, I didn't realize there already is a search_as_you_type field type! I will give that a try, thanks.

On the frontend side, I do have a debouncer that waits for 500ms after the last character before sending the search request.

foobar2022 · May 14, 2022, 4:45am

@stephenb I'm still having the exact same issue with search_as_you_type.
Here is the new index setting:

{
  "index_patterns": ["address_book*"],
  "mappings": {
    "properties": {
      "full": {
        "type": "search_as_you_type"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

Here is a field value: 'San Francisco California USA US'
I get results for: 'san francisco', 'san usa', 'san us'
But no results for: 'san fran', 'san cali'

stephenb · May 14, 2022, 5:23am

Hmm, I get all the expected results.

PUT discuss-search-as-type
{
  "mappings": {
    "properties": {
      "full": {
        "type": "search_as_you_type"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

POST discuss-search-as-type/_doc
{
  "full" : "San Francisco California USA US"
}

POST discuss-search-as-type/_doc
{
  "full" : "San Diego USA US"
}

I get

GET discuss-search-as-type/_search
{
  "query": {
    "multi_match": {
      "query": "san",
      "type": "bool_prefix",
      "fields": [
        "full",
        "full._2gram",
        "full._3gram"
      ]
    }
  }
}

Result: both documents, as expected
{
  "took" : 67,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "discuss-search-as-type",
        "_id" : "OU77wIABpvjF7kd-eB1V",
        "_score" : 1.0,
        "_source" : {
          "full" : "San Francisco California USA US"
        }
      },
      {
        "_index" : "discuss-search-as-type",
        "_id" : "Ok4BwYABpvjF7kd-Dx1b",
        "_score" : 1.0,
        "_source" : {
          "full" : "San Diego USA US"
        }
      }
    ]
  }
}

I get results for 'san francisco', 'san usa', 'san us',
and also for 'san fran' and 'san cali':

GET discuss-search-as-type/_search
{
  "query": {
    "multi_match": {
      "query": "san fran",
      "type": "bool_prefix",
      "fields": [
        "full",
        "full._2gram",
        "full._3gram"
      ]
    }
  }
}

Results: both documents as expected, and scored as expected, with San Francisco higher

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.1743946,
    "hits" : [
      {
        "_index" : "discuss-search-as-type",
        "_id" : "OU77wIABpvjF7kd-eB1V",
        "_score" : 2.1743946,
        "_source" : {
          "full" : "San Francisco California USA US"
        }
      },
      {
        "_index" : "discuss-search-as-type",
        "_id" : "Ok4BwYABpvjF7kd-Dx1b",
        "_score" : 0.19100355,
        "_source" : {
          "full" : "San Diego USA US"
        }
      }
    ]
  }
}

And for just "fran"

GET discuss-search-as-type/_search
{
  "query": {
    "multi_match": {
      "query": "fran",
      "type": "bool_prefix",
      "fields": [
        "full",
        "full._2gram",
        "full._3gram"
      ]
    }
  }
}

Results

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "discuss-search-as-type",
        "_id" : "OU77wIABpvjF7kd-eB1V",
        "_score" : 1.0,
        "_source" : {
          "full" : "San Francisco California USA US"
        }
      }
    ]
  }
}

foobar2022 · May 14, 2022, 5:42am

It turns out my query was the problem:

{
  "query": {
    "match": {
      "full_address": {
        "query": "san fran",
        "operator": "and"
      }
    }
  }
}

I tried it against your example, and it didn't produce any results. I was expecting it to partially match all search terms, so that I wouldn't have to look at score values to discard unwanted results.
For example, for the query 'san fran', the result 'san diego' is not relevant.

I think the sorted result is good enough for now. Thanks a lot for digging in so deep.
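
A note for readers who do want to drop results such as 'San Diego': a plain match query with "operator": "and" looks up 'fran' as a whole term, and the root search_as_you_type field does not index prefixes as separate terms by default, so nothing matches. One option, sketched here against the discuss-search-as-type index from the example above, is to keep the bool_prefix query and add "operator": "and" to it. The body below is meant for GET discuss-search-as-type/_search and has not been verified against a live cluster:

```json
{
  "query": {
    "multi_match": {
      "query": "san fran",
      "type": "bool_prefix",
      "operator": "and",
      "fields": [
        "full",
        "full._2gram",
        "full._3gram"
      ]
    }
  }
}
```

With this setting every term must match within a field, and the last term is still treated as a prefix, so the query should return the San Francisco document and exclude San Diego.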

system · June 11, 2022, 5:43am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
