Mapping with type keyword is still analyzed

Elasticsearch version: 5.2.2

Plugins installed: [none]

JVM version:

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

OS version:
CentOS 7

Description of the problem including expected versus actual behavior:
Fields mapped with type: keyword should not be analyzed, but this does not seem to work: I still find the record when searching for only part of the string.

Steps to reproduce:
Create a new index. The message field in this index should not be analyzed (not_analyzed before v5, type: keyword since v5):

curl -X PUT 'http://10.2.5.230:9200/twitter' -d '{
  "mappings": {
    "tweet": {
      "properties": {
        "message": {
          "type": "keyword"
        }
      }
    }
  }
}'

Get the mapping to check it:

curl -X GET 'http://10.2.5.230:9200/twitter/_mapping/tweet?pretty'
{
  "twitter" : {
    "mappings" : {
      "tweet" : {
        "properties" : {
          "message" : {
            "type" : "keyword"
          }
        }
      }
    }
  }
}

Seems good, add record:

curl -X PUT 'http://10.2.5.230:9200/twitter/tweet/xyz?pretty' -d '{
  "foo" : "2",
  "message" : "trying out Elasticsearch"
}'

Now search for part of the string "trying out Elasticsearch". This should return zero hits:

curl -X POST 'http://10.2.5.230:9200/twitter/tweet/_search?pretty' -d '{
  "query": { "query_string": { "query" : "out" } }
}'

But it has been found:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "tweet",
        "_id" : "xyz",
        "_score" : 0.2876821,
        "_source" : {
          "foo" : "2",
          "message" : "trying out Elasticsearch"
        }
      }
    ]
  }
}

Is it a bug or am I too stupid to use it?

Okay, I got it...
The query has to be in the following format:

curl -X POST 'http://10.2.5.230:9200/twitter/tweet/_search?pretty' -d '{
    "query" : {
        "constant_score" : {
            "filter" : {
                "term" : {
                    "message" : "trying out Elasticsearch"
                }
            }
        }
    }
}'

But I don't get it. Why can it still be found with "query_string"?

What is the mapping after you inserted your doc?

BTW it's strange that a keyword analyzer is defined on the keyword type. Only normalizers can be defined.

After adding a record the mapping looks like this:

{
  "twitter" : {
    "mappings" : {
      "tweet" : {
        "properties" : {
          "foo" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "message" : {
            "type" : "keyword"
          }
        }
      }
    }
  }
}

Isn't it strange that the record can be found with "query_string"?
The idea of "type: keyword" is NOT to index the individual words of the string, so why can it still be found?
Would it be a good idea to make "type: keyword" the default for all fields, and then run exact searches with "constant_score" and full-text searches with "query_string"?

Actually I missed one part:

curl -X POST 'http://10.2.5.230:9200/twitter/tweet/_search?pretty' -d '{
  "query": { "query_string": { "query" : "out" } }
}'

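Because you did not specify a field, this query searches the _all field. In 5.x, _all contains the values of all fields, including message, and by default it is analyzed with the standard analyzer. So "out" is indexed there as a separate term, even though message itself is stored as one exact term.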
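To make the difference concrete, here is a rough shell simulation of which terms end up in each field. It is only a sketch: the lowercase-and-split step is a crude stand-in for the standard analyzer, not Elasticsearch's real analysis chain, and has_term is a helper made up for this illustration.

```shell
# Rough simulation of how one value becomes index terms.
# This is NOT Elasticsearch code, only an illustration.
value='trying out Elasticsearch'

# keyword field (message): the whole string is kept as a single term, unchanged
keyword_terms="$value"

# _all field with the standard analyzer, approximated here by
# lowercasing and splitting on spaces: one term per line
all_terms=$(printf '%s\n' "$value" | tr '[:upper:]' '[:lower:]' | tr ' ' '\n')

# has_term TERMS TERM: succeeds only if TERM equals one whole term (line) in TERMS
has_term() {
  printf '%s\n' "$1" | grep -qxF -- "$2"
}

if has_term "$all_terms" 'out'; then all_hit=yes; else all_hit=no; fi
if has_term "$keyword_terms" 'out'; then message_hit=yes; else message_hit=no; fi
if has_term "$keyword_terms" 'trying out Elasticsearch'; then exact_hit=yes; else exact_hit=no; fi

echo "_all contains term 'out': $all_hit"
echo "message contains term 'out': $message_hit"
echo "message contains the full string: $exact_hit"
```

In this simulation "out" is a term of its own only in the analyzed copy, which mirrors why the unscoped query_string search hits while a term query on message needs the full string.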

Try:

curl -X POST 'http://10.2.5.230:9200/twitter/tweet/_search?pretty' -d '{
  "query": { "query_string": { "query" : "message:out" } }
}'

Or

curl 'http://10.2.5.230:9200/twitter/tweet/_search?q=message:out'
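If you do not want an analyzed copy of your values at all, one option in 5.x is to disable _all when creating the index. The following is a minimal sketch, not tested against your cluster. ES_URL is a placeholder variable I made up for this example; leave it unset to only print the request body. As far as I know, the _all setting has to be chosen when the index is created, so the existing index would need to be deleted or recreated under another name first.

```shell
# Assumption: ES_URL is a hypothetical variable, e.g. ES_URL='http://10.2.5.230:9200'.
# When it is unset, the script only prints the body and sends nothing.

# Mapping body for Elasticsearch 5.x: _all disabled, message kept as one exact term
mapping_body='{
  "mappings": {
    "tweet": {
      "_all": { "enabled": false },
      "properties": {
        "message": { "type": "keyword" }
      }
    }
  }
}'

if [ -n "${ES_URL:-}" ]; then
  # Creates the index; this fails if an index named twitter already exists
  curl -X PUT "$ES_URL/twitter" -d "$mapping_body"
else
  printf '%s\n' "$mapping_body"
fi
```

My understanding is that with _all disabled, a query_string without a field falls back to querying the individual fields, where message only matches on the full string. Please verify that on your version before relying on it.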

Ah, okay, I think I got it! Thank you for your time! :)
