How do I do a case-sensitive match query in Elasticsearch?

I'm trying to do a case-sensitive match query that only returns documents where the html field contains the exact string. Here is the query I have so far:

POST data/_search
{
  "track_total_hits": true, 
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "html": "test_html_query"
          }
        }
      ]
    }
  }
}

I only want to return documents that contain the exact "test_html_query" string as a substring, not documents that merely resemble this string or have some relevance to it. When I run this query, I get more results than expected. How do I make it case sensitive?

Hi @DataStorageMuse
Could you share more information about your mapping? Does the "html" field have an analyzer, and what is its type?

What do you mean by mapping? I didn't put any analyzer on the html field.

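Since no analyzer is set on the field, it is most likely using the default standard analyzer, which lowercases the terms. One option is to add a sub-field that uses the whitespace analyzer, which only splits the text into terms and keeps the original case.
The example below shows what I mean.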



PUT idx_test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "raw_analyzer": {
          "type": "whitespace"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "html": {
        "type": "text",
        "fields": {
          "raw": {
            "type": "text",
            "analyzer": "raw_analyzer"
          }
        }
      }
    }
  }
}

POST idx_test/_bulk
{"index":{}}
{"html": "test html query test_html_query"}
{"index":{}}
{"html": "test html query Test_html_query"}

GET idx_test/_analyze
{
  "field": "html",  
  "text": ["test html query Test_html_query"]
}

GET idx_test/_search
{
  "query": {
    "match": {
      "html.raw": {
        "query": "test_html_query"
      }
    }
  }
}
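
To see why the two fields behave differently, the _analyze request above can be repeated against the html.raw sub-field. This is only a small sketch, and it assumes the idx_test index from this example already exists:

```
GET idx_test/_analyze
{
  "field": "html.raw",
  "text": ["test html query Test_html_query"]
}
```

The whitespace analyzer only splits on whitespace, so the last token should come back as Test_html_query with its capital T intact, while the same request on html returns lowercased tokens. That is why the match query on html.raw is expected to return only the first document.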

The mapping is the schema for the index, i.e. the data types and other specifications for your data. It may be worth taking a look at the docs, as it is a key concept.

Text analysis is also a key concept here. Above, @RabBit_BR showed how to use a simple whitespace analyzer to tokenize your data so that your search is case sensitive, but you will not get other features such as stemming.

It is worth noting that all fields have a mapping, which defines how they are indexed. If you do not specify a mapping through an index template or when the index is created, the field will be mapped using dynamic mapping rules.
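
If you want to see what dynamic mapping produced for your own index, you can fetch the mapping. A minimal sketch, assuming the index is called data as in the original query:

```
GET data/_mapping
```

With the default dynamic mapping rules, a string field such as html will typically show up as type text with a keyword sub-field and no analyzer entry, which means the standard analyzer is in use.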
