Elasticsearch text similarity search

What is the best way to match both "San Jose" and "SanJose" in Elasticsearch? The field is stored as text, and the correct value in the index is "San Jose".
Sample query that does not work:

{
  "match" : {
    "cssot_city" : {
      "query" : "SanJose",
      "operator" : "OR",
      "fuzziness" : "AUTO",
      "prefix_length" : 0,
      "max_expansions" : 50,
      "fuzzy_transpositions" : true,
      "lenient" : true,
      "zero_terms_query" : "all",
      "auto_generate_synonyms_phrase_query" : true,
      "boost" : 1.0
    }
  }
}

Hi @gkumargaur !

I found two options:

1 - In this post, the user created an analyzer that indexes the words without the whitespace.

2 - You can use a suggester to offer the user the closest matching term, like a "did you mean" feature.

Example for option 2:

PUT idx_did_you_mean
{
  "settings": {
    "analysis": {
      "filter": {
        "shingle_filter": {
          "max_shingle_size": "3",
          "min_shingle_size": "2",
          "type": "shingle"
        }
      },
      "analyzer": {
        "shingle_analyzer": {
          "filter": [
            "lowercase",
            "shingle_filter"
          ],
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "suggest": {
            "type": "text",
            "analyzer": "shingle_analyzer"
          }
        }
      }
    }
  }
}


POST idx_did_you_mean/_doc
{
  "title":"San Jose"
}

GET idx_did_you_mean/_search
{
  "suggest": {
    "text": "sanjose city",
    "did_you_mean": {
      "phrase": {
        "field": "title.suggest",
        "size": 5
      }
    }
  }
}
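
For option 1, here is a minimal sketch of what such an analyzer could look like. It is not taken from the linked post, which may do this differently. It assumes a `pattern_replace` char filter that strips whitespace before a `keyword` tokenizer, so that both "San Jose" and "SanJose" are reduced to the single token `sanjose`. The index name `idx_city_nospace`, the sub-field `nospace`, and the names `remove_whitespace` and `no_space_analyzer` are made up for illustration; only `cssot_city` comes from the question.

```
PUT idx_city_nospace
{
  "settings": {
    "analysis": {
      "char_filter": {
        "remove_whitespace": {
          "type": "pattern_replace",
          "pattern": "\\s+",
          "replacement": ""
        }
      },
      "analyzer": {
        "no_space_analyzer": {
          "type": "custom",
          "char_filter": [
            "remove_whitespace"
          ],
          "tokenizer": "keyword",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "cssot_city": {
        "type": "text",
        "fields": {
          "nospace": {
            "type": "text",
            "analyzer": "no_space_analyzer"
          }
        }
      }
    }
  }
}

POST idx_city_nospace/_doc
{
  "cssot_city": "San Jose"
}

GET idx_city_nospace/_search
{
  "query": {
    "match": {
      "cssot_city.nospace": "SanJose"
    }
  }
}
```

Because the same analyzer runs at index time and at search time, a query for either "San Jose" or "SanJose" is compared against the same `sanjose` token. Keeping the original `cssot_city` field with its default analyzer means existing queries on that field are unaffected.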
