EdgeGram Search trouble

I am very new to ES, so I am probably getting something wrong. These are my mapping and settings:

    {
       "locations":{
          "mappings":{
             "locations":{
                "_meta":{
                   "model":"Postcode"
                },
                "dynamic_date_formats":[
                   
                ],
                "properties":{
                   "city":{
                      "type":"text",
                      "analyzer":"autocomplete",
                      "search_analyzer":"standard"
                   },
                   "code":{
                      "type":"text",
                      "analyzer":"autocomplete"
                   },
                   "codeFull":{
                      "type":"text",
                      "fields":{
                         "keyword":{
                            "type":"keyword"
                         }
                      },
                      "analyzer":"autocomplete"
                   },
                   "country":{
                      "type":"text",
                      "analyzer":"standard"
                   },
                   "region":{
                      "type":"text",
                      "analyzer":"autocomplete",
                      "search_analyzer":"standard"
                   },
                   "state":{
                      "type":"text",
                      "analyzer":"autocomplete",
                      "search_analyzer":"standard"
                   },
                   "suburb":{
                      "type":"text",
                      "analyzer":"autocomplete",
                      "search_analyzer":"standard"
                   }
                }
             }
          }
       }
    }

    {
       "locations":{
          "settings":{
             "index":{
                "number_of_shards":"5",
                "provided_name":"locations",
                "max_result_window":"50000",
                "creation_date":"1607485588736",
                "analysis":{
                   "filter":{
                      "autocomplete_filter":{
                         "token_chars":[
                            "letter",
                            "digit"
                         ],
                         "min_gram":"2",
                         "type":"edgeNGram",
                         "max_gram":"10"
                      }
                   },
                   "analyzer":{
                      "autocomplete":{
                         "filter":[
                            "lowercase",
                            "autocomplete_filter"
                         ],
                         "type":"custom",
                         "tokenizer":"standard"
                      }
                   }
                },
                "number_of_replicas":"1",
                "uuid":"Hw2u0WVGQzKywWxotuLy5Q",
                "version":{
                   "created":"6050199"
                }
             }
          }
       }
    }

And here are a few example documents:

    {
       "_index":"locations",
       "_type":"locations",
       "_id":"Australia2000",
       "_version":1,
       "_score":1,
       "_source":{
          "state":"New South Wales",
          "city":"Sydney",
          "suburb":"Sydney Inner City",
          "region":"Sydney - City and Inner South",
          "country":"Australia",
          "code":"2000"
       }
    }
    {
       "_index":"locations",
       "_type":"locations",
       "_id":"United KingdomM15",
       "_version":1,
       "_score":1,
       "_source":{
          "state":"United Kingdom",
          "city":"Manchester",
          "suburb":"Manchester",
          "region":"Manchester",
          "country":"United Kingdom",
          "codeFull":"M15",
          "code":"M15"
       }
    }
    {
       "_index":"locations",
       "_type":"locations",
       "_id":"United KingdomM1",
       "_version":1,
       "_score":1,
       "_source":{
          "state":"United Kingdom",
          "city":"Manchester",
          "suburb":"Manchester",
          "region":"Manchester",
          "country":"United Kingdom",
          "codeFull":"M1",
          "code":"M1"
       }
    }
    {
       "_index":"locations",
       "_type":"locations",
       "_id":"United KingdomM120",
       "_version":1,
       "_score":1,
       "_source":{
          "state":"United Kingdom",
          "city":"London",
          "suburb":"London",
          "region":"London",
          "country":"United Kingdom",
          "codeFull":"M120",
          "code":"M120"
       }
    }

My goal is to display an autocomplete as the user types a postcode. The problem is that I don't index the full postcode (for the UK, for example), whereas the full postcode a user types might be something like "M1 ASD" or "M1ASD". I created a keyword sub-field to boost exact matches, so if a user types "2000", 2000 gets a higher score than other postcodes such as 2007 or 2080. But I am not succeeding in boosting when the user over-types the postcode. For example, if they type "M1ASD", I would like to return all three example results that start with "m1", but boost "M1" because one edge n-gram of "M1ASD" matches "M1" exactly. Is that possible?

I'm currently using the following search, which works when the term is an exact match but not when the query term contains extra characters:

    {
       "query":{
          "bool":{
             "must":[
                {
                   "match":{
                      "country":{
                         "query":"United Kingdom"
                      }
                   }
                },
                {
                   "multi_match":{
                      "type":"cross_fields",
                      "query":"m1asd",
                      "fields":[
                         "region",
                         "code"
                      ]
                   }
                }
             ],
             "should":[
                {
                   "multi_match":{
                      "fields":[
                         "codeFull",
                         "codeFull.keyword^2"
                      ],
                      "analyzer":"autocomplete",
                      "query":"m1asd"
                   }
                }
             ]
          }
       }
    }

Other codes such as "m15" and "m13" come back with a higher score; "m1" is in 8th place, and I am worried that some postcodes might not be returned at all. I don't understand why m15 would score higher than m1 when the query term is "m1asd".

It would help if you could provide a full recreation script as described in About the Elasticsearch category. It will help to better understand what you are doing. Please, try to keep the example as simple as possible.

A full reproduction script is something anyone can copy and paste in Kibana dev console, click on the run button to reproduce your use case. It will help readers to understand, reproduce and if needed fix your problem. It will also most likely help to get a faster answer.
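
As a rough illustration of such a script, here is a minimal sketch cut down from the mapping and documents in the question. It assumes a 6.x cluster (the index above was created on 6.5), keeps only the `code` field, and uses arbitrary document IDs 1 to 3; these simplifications are assumptions made for the example, not something the original poster provided.

    DELETE locations

    PUT locations
    {
      "settings": {
        "analysis": {
          "filter": {
            "autocomplete_filter": {
              "type": "edge_ngram",
              "min_gram": 2,
              "max_gram": 10
            }
          },
          "analyzer": {
            "autocomplete": {
              "type": "custom",
              "tokenizer": "standard",
              "filter": ["lowercase", "autocomplete_filter"]
            }
          }
        }
      },
      "mappings": {
        "locations": {
          "properties": {
            "code": { "type": "text", "analyzer": "autocomplete" }
          }
        }
      }
    }

    PUT locations/locations/1
    { "code": "M1" }

    PUT locations/locations/2
    { "code": "M15" }

    PUT locations/locations/3
    { "code": "M120" }

    POST locations/_refresh

    GET locations/_search
    {
      "query": { "match": { "code": "m1asd" } }
    }

On 7.x and later, the `locations` mapping type level and the type segment in the document URLs would likely need to be adapted.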

But anyway, here I'd recommend that you use the _analyze API to understand how your text is analyzed at index time and at search time.
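
A minimal sketch of that check, assuming the `locations` index and the `autocomplete` analyzer from the question. The first request shows what the query string becomes when the `autocomplete` analyzer is applied to it (which is the case for `code`, since it has no `search_analyzer`); the second shows what an indexed value of the `code` field becomes.

    GET locations/_analyze
    {
      "analyzer": "autocomplete",
      "text": "m1asd"
    }

    GET locations/_analyze
    {
      "field": "code",
      "text": "M15"
    }

Given a standard tokenizer, a lowercase filter and edge n-grams of 2 to 10 characters, I would expect the first call to return `m1`, `m1a`, `m1as`, `m1asd` and the second `m1`, `m15`. The shared `m1` token would explain why M1, M15 and M120 all match. Adding `"explain": true` to the search body shows how each hit's score is computed.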
