How to test an analyzer properly to debug mapping issues?


(Charles Patton) #1

I have some very simple settings for an index that I'm using to test my analyzer:

{
	"settings": {
		"number_of_shards": 1,
		"number_of_replicas": 0
	},
	"mappings": {
		"applog": {
			"properties": {
				"user_id": {
					"type": "string",
					"index": "not_analyzed"
				}
			}
		}
	}
}

I simply want to confirm that my user_id property for my applog document type will not get tokenized.

However, after confirming that the mappings are indeed in place:

curl localhost:9200/applog-test/_mapping?pretty
{
  "applog-test" : {
    "mappings" : {
      "applog" : {
        "properties" : {
          "user_id" : {
            "type" : "string",
            "index" : "not_analyzed"
          }
        }
      }
    }
  }
}

My analyzer appears to still be tokenizing the user_id... hopefully this is user-error?

curl -XGET "localhost:9200/applog-test/_analyze?user_id&pretty" -d "this is a test"
{
  "tokens" : [ {
    "token" : "this",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "<ALPHANUM>",
    "position" : 0
  }, {
    "token" : "is",
    "start_offset" : 5,
    "end_offset" : 7,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "a",
    "start_offset" : 8,
    "end_offset" : 9,
    "type" : "<ALPHANUM>",
    "position" : 2
  }, {
    "token" : "test",
    "start_offset" : 10,
    "end_offset" : 14,
    "type" : "<ALPHANUM>",
    "position" : 3
  } ]
}

My expectation was that this would all be in a single token?

Bonus points: how do I create the desired behavior in a dynamic_mapping? I ran into the same issue with the dynamic mapping and/or default mappings.


(Charles Patton) #2

After indexing a single record containing only the user_id field, I can see in Kibana that it is in fact not being analyzed (my desired behavior). So my follow-up question is: how do I test the analyzer (indexer?) before inserting records, since testing it this way gave me multiple tokens?


(tri-man) #3

When you set "index" : "not_analyzed" in the mapping, it's equivalent to indexing the field with the "keyword" analyzer (see the link below for more info):

https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-keyword-analyzer.html
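To make the difference concrete, here is a rough Python sketch of the two behaviours. It is only an illustration, not Elasticsearch code: keyword_tokens and word_tokens are made-up helper names, and the word splitter is a crude stand-in for the default analyzer (lowercase, split on non-word characters), which in reality follows more elaborate rules.

```python
import re


def keyword_tokens(text):
    """Mimic the keyword analyzer: the whole input becomes a single token."""
    return [{"token": text, "start_offset": 0, "end_offset": len(text), "position": 0}]


def word_tokens(text):
    """Crude stand-in for the default analyzer (an assumption, not the real rules):
    lowercase each run of word characters and record its offsets."""
    return [
        {
            "token": match.group().lower(),
            "start_offset": match.start(),
            "end_offset": match.end(),
            "position": position,
        }
        for position, match in enumerate(re.finditer(r"\w+", text))
    ]


if __name__ == "__main__":
    sample = "this is a test"
    print(len(word_tokens(sample)))     # four tokens, as in the _analyze output in post #1
    print(len(keyword_tokens(sample)))  # one token, which is what not_analyzed stores
```

The single-token result is what ends up in the index for user_id, which would match what was seen in Kibana in post #2.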

To test a different analyzer, you can do the following:
http://localhost:9200/applog-test/_analyze?analyzer=analyzer-name&text=this+is+a+test
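The request in post #1 used "?user_id&pretty", which does not name a field or an analyzer, so the index's default analyzer was most likely applied. On the older Elasticsearch versions that still have "type": "string", the _analyze API also accepts a field=... query parameter that picks up the analyzer from that field's mapping. Below is a small Python sketch that only builds both kinds of URL so they can be pasted into curl or a browser. The helper name analyze_url and the host are assumptions for illustration, and newer versions expect these parameters in a JSON body instead.

```python
from urllib.parse import urlencode


def analyze_url(index, text, analyzer=None, field=None, host="http://localhost:9200"):
    """Build an _analyze URL (hypothetical helper; host and index are assumptions).

    Pass analyzer= to test a named analyzer, or field= to use the analyzer
    configured for a mapped field.
    """
    params = {}
    if analyzer is not None:
        params["analyzer"] = analyzer
    if field is not None:
        params["field"] = field
    params["text"] = text
    # urlencode uses '+' for spaces, like the example URL above.
    return f"{host}/{index}/_analyze?{urlencode(params)}"


if __name__ == "__main__":
    # Test a named analyzer directly.
    print(analyze_url("applog-test", "this is a test", analyzer="keyword"))
    # Test whatever analysis the user_id mapping implies.
    print(analyze_url("applog-test", "this is a test", field="user_id"))
```

With field=user_id the response should come back as a single token if the mapping is applied as intended, so the mapping can be checked before any documents are indexed.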


(Charles Patton) #4

Thank you so much! Could you please point me in the direction of how you found out that the "index": "not_analyzed" was equivalent to the keyword analyzer? I feel I might need that resource in the future... unless it's expected to just happen upon the analyzer page you referenced. :wink:


(tri-man) #5

The information is in the first link above (the keyword analyzer page).

