Searching with special characters

Alex_Pravdin · October 23, 2020, 4:08am

I'm trying to build an autocomplete search that takes special characters into account and gives more weight to results that match the query including those characters.

Index:

{
   "institution_test":{
      "settings":{
         "index":{
            "analysis":{
               "filter":{
                  "autocomplete_filter":{
                     "type":"ngram",
                     "min_gram":"2",
                     "max_gram":"14"
                  }
               },
               "analyzer":{
                  "autocomplete":{
                     "type":"custom",
                     "filter":[
                        "lowercase",
                        "asciifolding",
                        "autocomplete_filter"
                     ],
                     "tokenizer":"whitespace"
                  }
               }
            }
         }
      }
   }
}

Mapping:

{
   "institution_test":{
      "mappings":{
         "default":{
            "properties":{
               "local_name":{
                  "type":"text",
                  "copy_to":[
                     "search_by_name"
                  ],
                  "analyzer":"autocomplete",
                  "search_analyzer":"standard"
               },
               "name":{
                  "type":"text",
                  "copy_to":[
                     "search_by_name"
                  ],
                  "analyzer":"autocomplete",
                  "search_analyzer":"standard"
               },
               "search_by_name":{
                  "type":"text",
                  "analyzer":"autocomplete",
                  "search_analyzer":"standard"
               }
            }
         }
      }
   }
}

Documents:

[
   {
      "name":"Adam SMITH"
   },
   {
      "name":"Abd MIT Abd"
   },
   {
      "name":"Massachusetts (MIT)"
   },
   {
      "name":"MI TOO"
   }
]

Task: rank rows that contain (MIT) with the parentheses first, then rows that contain MIT as a separate word, then partial matches.

Search query:

http://localhost:9200/institution_test/_search?q=search_by_name:\(mit\)

Results (irrelevant fields removed):

      "hits":[
         {
            "_score":0.5691198,
            "_source":{
               "name":"Adam SMITH"
            }
         },
         {
            "_score":0.5691198,
            "_source":{
               "name":"Massachusetts (MIT)"
            }
         },
         {
            "_score":0.55140024,
            "_source":{
               "name":"Abd MIT Abd"
            }
         }
      ]

Why does Adam SMITH, which is only a partial match, rank first? And why does it have exactly the same score as the second row?
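A rough way to look into this is to imitate the two analyzers outside Elasticsearch. The Python sketch below is only an approximation under stated assumptions: the helper names are made up for illustration, asciifolding is skipped because the sample names are plain ASCII, and the standard analyzer is reduced to "lowercase and split on non-alphanumerics". It suggests that the search side turns (mit) into the bare term mit, and that the n-gram filter produces mit for SMITH, (MIT) and MIT alike, so the index cannot tell the three cases apart.

```python
import re

def ngrams(token, min_gram=2, max_gram=14):
    """All substrings of token with length min_gram..max_gram, like the ngram filter."""
    return {token[i:i + n]
            for n in range(min_gram, max_gram + 1)
            for i in range(len(token) - n + 1)}

def index_terms(text):
    """Approximate the 'autocomplete' analyzer: whitespace tokenizer, lowercase, n-grams."""
    # Assumption: asciifolding is omitted because the sample data is already ASCII.
    terms = set()
    for token in text.lower().split():
        terms |= ngrams(token)
    return terms

def search_terms(query):
    """Approximate the 'standard' search analyzer: lowercase, drop punctuation."""
    return [t for t in re.split(r"[^0-9a-z]+", query.lower()) if t]

docs = ["Adam SMITH", "Abd MIT Abd", "Massachusetts (MIT)", "MI TOO"]

query = search_terms("(mit)")  # the parentheses do not survive analysis
matches = [d for d in docs if any(t in index_terms(d) for t in query)]
word_counts = {d: len(d.split()) for d in matches}

print(query)
print(matches)
print(word_counts)
```

In this sketch "Adam SMITH" and "Massachusetts (MIT)" each contain mit once and consist of two whitespace tokens, while "Abd MIT Abd" has three. If the similarity counts the n-grams of one word as a single position (which I believe is the default behaviour, but treat that as an assumption), the two-word names would get the same length normalization and therefore the same score, and the three-word name a slightly lower one.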

How can I get the results in the following order, with a different score for each row?

Massachusetts (MIT)
Abd MIT Abd
Adam SMITH
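One possible direction for getting this order, sketched in Python as a request body builder, is a bool query whose should clauses add up for closer matches. This is not a tested solution. It assumes two extra fields that do not exist in the mapping above: search_by_name_exact (hypothetical, analyzed with a whitespace tokenizer plus lowercase at both index and search time, so (mit) keeps its parentheses) and search_by_name_word (hypothetical, analyzed with the standard analyzer, so mit only matches whole words). Both would have to be added as further copy_to targets. The function name and the boost values are arbitrary choices for illustration.

```python
import json

def build_ranked_query(text):
    """Build a bool query in which closer matches satisfy more should clauses."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Hypothetical field: whitespace + lowercase, keeps "(mit)" intact.
                    {"match": {"search_by_name_exact": {"query": text, "boost": 3.0}}},
                    # Hypothetical field: standard analyzer, whole-word "mit" only.
                    {"match": {"search_by_name_word": {"query": text, "boost": 2.0}}},
                    # Existing n-gram field from the mapping: partial matches.
                    {"match": {"search_by_name": {"query": text}}},
                ],
                # At least one clause must match for a document to be returned.
                "minimum_should_match": 1,
            }
        }
    }

body = build_ranked_query("(mit)")
print(json.dumps(body, indent=2))
```

Under these assumptions "Massachusetts (MIT)" would satisfy all three clauses, "Abd MIT Abd" the last two, and "Adam SMITH" only the n-gram clause, so the summed scores should differ in the wanted order.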

Thanks.

system · November 20, 2020, 4:09am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
