Searching for a URL with special characters does not work in Elasticsearch 5.2.2


(Abhisek DasGupta) #1

I am not able to search for URLs that contain special characters, such as http://example.sample.com/guide/Analyzers_Terms_and_Analysis_(ABC)_Guide

Suppose I have indexed four noteText values:

  1. http://example.sample.com/guide/Analyzers_Terms_and_Analysis_%28ABC%29_Guide

  2. example

  3. Terms

  4. Analysis

Expected result: when I search with the full URL (option 1), only the document containing that exact URL should be returned, not partial matches against the other indexed values such as example, Terms, and Analysis.

Index Settings -

PUT my_index
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "my_ngram": {
            "type": "nGram",
            "min_gram": 1,
            "max_gram": 50
          }
        },
        "char_filter": {
          "whitespace_mapping": {
            "mappings": [
              "\\u00A0=>\\u0020"
            ],
            "type": "mapping"
          }
        },
        "analyzer": {
          "default": {
            "type": "custom",
            "char_filter": [
              "whitespace_mapping"
            ],
            "filter": [
              "lowercase",
              "asciifolding",
              "stop",
              "my_ngram",
              "kstem"
            ],
            "tokenizer": "whitespace"
          },
          "default_search": {
            "type": "custom",
            "char_filter": [
              "whitespace_mapping"
            ],
            "filter": [
              "lowercase",
              "asciifolding",
              "kstem"
            ],
            "tokenizer": "whitespace"
          },
          "match_phrase": {
            "type": "custom",
            "char_filter": [
              "whitespace_mapping"
            ],
            "filter": [
              "lowercase"
            ],
            "tokenizer": "whitespace"
          },
          "match_phrase_search": {
            "type": "custom",
            "char_filter": [
              "whitespace_mapping"
            ],
            "filter": [
              "lowercase",
              "stop"
            ],
            "tokenizer": "whitespace"
          }
        }
      }
    }
  }
}
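
To make the effect of the my_ngram filter (min_gram 1, max_gram 50) concrete, here is a rough Python sketch of what a whitespace tokenizer followed by lowercasing and n-gramming would emit for the URL. It is a simplification that ignores the asciifolding, stop, and kstem filters, and the function names are made up for illustration; it is not Elasticsearch's own implementation.

```python
def whitespace_tokenize(text):
    # Approximates the "whitespace" tokenizer: split on runs of whitespace only.
    return text.split()


def ngrams(token, min_gram=1, max_gram=50):
    # Every substring of the token whose length is between min_gram and max_gram.
    grams = set()
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(token):
                break
            grams.add(token[start:end])
    return grams


url = "http://example.sample.com/guide/Analyzers_Terms_and_Analysis_%28ABC%29_Guide"

indexed = set()
for token in whitespace_tokenize(url.lower()):
    indexed |= ngrams(token)

# Short words embedded in the URL become indexed terms of their own.
print("example" in indexed, "terms" in indexed, "analysis" in indexed)

# The URL is longer than max_gram, so it never appears as a single gram.
print(url.lower() in indexed)
```

Under these assumptions, a field analyzed this way shares terms such as example, terms, and analysis with the three short notes, which is one plausible source of the partial matches.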

Mapping -


PUT my_index/_mapping/notes
{
    "properties": {
      "userId": {
        "type": "long"
      },
      "noteText": {
        "analyzer": "match_phrase",
        "term_vector": "with_positions_offsets",
        "type": "text",
        "fields": {
          "ngrammed": {
            "term_vector": "with_positions_offsets",
            "type": "text"
          }
        }
      }
    }
}

Search Query -

POST /my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "constant_score": {
            "query": {
              "query_string": {
                "query": "http://example.sample.com/guide/Analyzers_Terms_and_Analysis_%28ABC%29_Guide",
                "fields": [
                  "noteText.ngrammed"
                ],
                "analyzer": "match_phrase_search"
              }
            },
            "boost": 5
          }
        },
        {
          "query_string": {
            "query": "http://example.sample.com/guide/Analyzers_Terms_and_Analysis_%28ABC%29_Guide",
            "fields": [
              "noteText.ngrammed"
            ]
          }
        }
      ]
    }
  }
}
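
One thing worth checking with a query like this: in query_string syntax, characters such as : / ( ) are reserved, so an unescaped URL may be parsed as a field prefix (http:) or a regular expression rather than as plain text. The Python sketch below shows one way to backslash-escape the reserved characters before building the request body. The helper name escape_query_string is hypothetical, and nothing in this thread confirms that escaping alone resolves the problem.

```python
import json
import re

# Reserved in query_string syntax: + - = && || ! ( ) { } [ ] ^ " ~ * ? : \ /
# The characters < and > cannot be escaped at all, so they are removed instead.
_RESERVED = re.compile(r'(&&|\|\||[+\-=!(){}\[\]^"~*?:\\/])')


def escape_query_string(text):
    # Hypothetical helper: prefix each reserved character with a backslash.
    text = text.replace("<", "").replace(">", "")
    return _RESERVED.sub(r"\\\1", text)


url = "http://example.sample.com/guide/Analyzers_Terms_and_Analysis_%28ABC%29_Guide"

body = {
    "query": {
        "query_string": {
            "query": escape_query_string(url),
            "fields": ["noteText.ngrammed"],
        }
    }
}

# json.dumps doubles the backslashes, as the JSON request body requires.
print(json.dumps(body, indent=2))
```

Wrapping the text in double quotes inside the query string, or using a match_phrase or term query that bypasses the query_string parser, are alternatives to consider.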

(David Pilato) #2

Please don't post pictures but code.

And format it using the </> icon, as explained in this guide. It will make your post more readable.

Or use markdown style like:

```
CODE
```

(Abhisek DasGupta) #3

Done, I changed the pictures to code. Thanks!

