Reverse word search in Elastic

Hello,

I want a search in Elasticsearch to match regardless of word order. For example, "Tpp Info" and "Info Tpp" should return the same result, where the document in my index has "Tpp Infotech" in the Title field.

My query is:

{
  "from": 0,
  "size": 1000,
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "info tpp",
            "fields": [
            "Title^6","Title._2gram","Title._index_prefix^12"
            ],
            "type": "bool_prefix",
            "operator": "and"
          }
        }
      ]
    }
  }
}

Hi @Rakhshunda_Noorein_J

Did you try the edge_ngram token filter?

GET _analyze
{
  "tokenizer": "standard",
  "filter": [
    {
      "type": "edge_ngram",
      "min_gram": 2,
      "max_gram": 4
    }
  ],
  "text": [
    "Tpp Infotech"
  ]
}

It gives me this output:

{
    "tokens": [
        {
            "token": "tp",
            "start_offset": 0,
            "end_offset": 3,
            "type": "<ALPHANUM>",
            "position": 0
        },
        {
            "token": "tpp",
            "start_offset": 0,
            "end_offset": 3,
            "type": "<ALPHANUM>",
            "position": 0
        },
        {
            "token": "In",
            "start_offset": 4,
            "end_offset": 12,
            "type": "<ALPHANUM>",
            "position": 1
        },
        {
            "token": "Inf",
            "start_offset": 4,
            "end_offset": 12,
            "type": "<ALPHANUM>",
            "position": 1
        },
        {
            "token": "Info",
            "start_offset": 4,
            "end_offset": 12,
            "type": "<ALPHANUM>",
            "position": 1
        }
    ]
}

I don't understand what I should do with this result.

What is the index mapping for the field involved in the query?

The output of the _analyze API shows which tokens are indexed in Elasticsearch. As you can see, the words are tokenized individually, and the prefixes of each word that are 2 to 4 characters long are indexed, which should give the result you wanted.
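
To make that concrete, here is a minimal Python sketch of what an edge n-gram filter with min_gram 2 and max_gram 4 emits for a single word. It only models the idea and is not Elasticsearch code; the function name edge_ngrams is made up for illustration.

```python
def edge_ngrams(word, min_gram=2, max_gram=4):
    """Return the prefixes of `word` that are min_gram..max_gram characters long."""
    longest = min(max_gram, len(word))
    # Words shorter than min_gram produce no tokens at all.
    return [word[:size] for size in range(min_gram, longest + 1)]


print(edge_ngrams("Infotech"))  # ['In', 'Inf', 'Info']
print(edge_ngrams("Tpp"))       # ['Tp', 'Tpp']
```

Because max_gram is 4, nothing longer than "Info" is indexed for "Infotech", so only the first four characters of a word can take part in a match.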

The data type of the field is search_as_you_type.

The field's actual value is: Tpp Infotech

When I search for Tpp Infotech or Infotech Tpp, I get the same result.

But when I type Tpp Info, I get the result, while Info Tpp does not return it.

That is, Infotech Tpp <-> Tpp Infotech, but Tpp Info !<-> Info Tpp.

Please show the full mapping as well as a sample document so others can recreate the issue.

My document:

{
  "Id": 401,
  "Title": "Tpp Infotech Ltd.",
  "Keys": "485rT",
  "@timestamp": "2023-05-31T06:35:18.033239100Z"
}

Index mapping:

{
    "my_index_001": {
        "mappings": {
            "properties": {
                "@timestamp": {
                    "type": "date"
                },
                "Id": {
                    "type": "long"
                },
                "Keys": {
                    "type": "search_as_you_type",
                    "doc_values": false,
                    "max_shingle_size": 3
                },
                "Title": {
                    "type": "search_as_you_type",
                    "doc_values": false,
                    "max_shingle_size": 3
                }            
            }
        }
    }
}

My Query:

{
  "from": 0,
  "size": 1000,
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "info tpp",
            "fields": [
            "Title^6","Title._2gram","Title._index_prefix^12"
            ],
            "type": "bool_prefix",
            "operator": "and"
          }
        }
      ]
    }
  }
}
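
A likely reason this query misses "info tpp": as documented for match_bool_prefix, the input is analyzed, every term except the last becomes an exact term query, and only the last term becomes a prefix query. The Python sketch below models that rule for a single field with operator "and". It is a simplification, not Elasticsearch code: the function name bool_prefix_matches is made up, and the token list assumes the default standard analyzer on Title.

```python
def bool_prefix_matches(query, doc_tokens):
    """Simplified model of match_bool_prefix with operator=and on one field."""
    terms = query.lower().split()
    if not terms:
        return False
    *exact_terms, last_term = terms
    # Every term except the last must equal an indexed token.
    exact_ok = all(term in doc_tokens for term in exact_terms)
    # The last term only needs to be a prefix of some indexed token.
    prefix_ok = any(token.startswith(last_term) for token in doc_tokens)
    return exact_ok and prefix_ok


# Assumed tokens of the standard analyzer for "Tpp Infotech Ltd."
title_tokens = ["tpp", "infotech", "ltd"]

for query in ("tpp info", "info tpp", "infotech tpp", "tpp infotech"):
    print(query, "->", bool_prefix_matches(query, title_tokens))
# tpp info -> True
# info tpp -> False
# infotech tpp -> True
# tpp infotech -> True
```

In "info tpp" the word "info" is not the last term, so it has to match a whole token, and the document only contains "infotech". That matches the behaviour described above.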

Below is an example that returns the results you expect. I used an edge_ngram token filter.

PUT idx_test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "ngram": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "ngram_filter"
          ]
        }
      },
      "filter": {
        "ngram_filter": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 4
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "ngram": {
            "type": "text",
            "analyzer": "ngram"
          }
        }
      }
    }
  }
}

POST idx_test/_doc
{
  "title": "Tpp Infotech Ltd."
}

GET idx_test/_search
{
  "query": {
    "match": {
      "title.ngram": "info tpp"
    }
  }
}
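
Why this matches in either word order can be sketched in a few lines of Python. The sketch assumes that, with no search_analyzer set, the match query analyzes "info tpp" with the same ngram analyzer as the field, and that match uses its default OR operator. The helper names analyze and match_any are made up for illustration, and the word splitting is only a rough stand-in for the standard tokenizer.

```python
def analyze(text, min_gram=2, max_gram=4):
    """Model of the custom analyzer: split into words, lowercase, emit edge n-grams."""
    tokens = set()
    for word in text.lower().replace(".", " ").split():
        for size in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.add(word[:size])
    return tokens


def match_any(query, document):
    """Model of a match query with the OR operator: one shared token is enough."""
    return bool(analyze(query) & analyze(document))


doc = "Tpp Infotech Ltd."
print(match_any("info tpp", doc))  # True
print(match_any("tpp info", doc))  # True
print(match_any("inside", doc))    # True, because "in" is shared
```

Both orders match because every query word is reduced to the same prefixes that were indexed. The last line shows the trade-off in this model: with OR and a two-character min_gram, an unrelated word that shares a two-letter prefix also matches, so you may want to raise min_gram or set "operator": "and" depending on your data.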