Hi,
I've got an index which has been configured to use the snowball
analyzer (English) as both index_analyzer and search_analyzer. The
problem is that it doesn't appear to be applied to any search queries,
but works perfectly at indexing time.
My analyzer in elasticsearch.json:
  "index":{
    "analysis":{
      "analyzer":{
        "snowball_en":{
          "type":"snowball",
          "language":"English"
        }
      }
    }
  }
According to _cluster/state, I've successfully configured a template
that assigns that analyzer to any new index whose name ends in "_en":
  "templates" : {
    "english_index" : {
      "template" : "*_en",
      "order" : 0,
      "settings" : {
      },
      "mappings" : {
        "webpage" : {
          "index_analyzer" : "snowball_en",
          "search_analyzer" : "snowball_en"
        }
      }
    }
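To spell out which index names I expect the template to pick up, here is a small Python sketch. It assumes the "*_en" pattern behaves like a simple shell-style wildcard, which is my reading rather than confirmed elasticsearch behaviour; the names "test_sv" and "en_test" are hypothetical and only there for contrast.

```python
from fnmatch import fnmatchcase

# Pattern copied from the "template" entry above.
TEMPLATE_PATTERN = "*_en"

# "test_en" is my real index; the other two names are made up for contrast.
candidates = ["test_en", "test_sv", "en_test"]

# Assumption: template patterns act like simple shell-style wildcards.
matching = [name for name in candidates if fnmatchcase(name, TEMPLATE_PATTERN)]

print(matching)
```

Under that assumption only "test_en" matches, which agrees with the mapping that shows up for that index.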
Again from _cluster/state, the English index:
  "indices" : {
    "test_en" : {
      "state" : "open",
      "settings" : {
        "index.number_of_shards" : "5",
        "index.number_of_replicas" : "1"
      },
      "mappings" : {
        "webpage" : {
          "_source" : {
            "compress" : true
          },
          "dynamic_templates" : [ {
            "everything_else" : {
              "mapping" : {
                "include_in_all" : false,
                "index" : "not_analyzed",
                "type" : "string"
              },
              "match_mapping_type" : "string",
              "match" : "*"
            }
          } ],
          "analyzer" : "snowball_en",
          "properties" : {
            "id" : {
              "include_in_all" : false,
              "index" : "not_analyzed",
              "type" : "string"
            },
            "title" : {
              "include_in_all" : true,
              "type" : "string"
            },
            "text" : {
              "include_in_all" : true,
              "type" : "string"
            },
            "language" : {
              "include_in_all" : false,
              "index" : "not_analyzed",
              "type" : "string"
            },
            "url" : {
              "include_in_all" : true,
              "index" : "not_analyzed",
              "type" : "string"
            },
            "fields" : {
              "dynamic" : "true",
              "properties" : {
                "tags" : {
                  "include_in_all" : false,
                  "index" : "not_analyzed",
                  "type" : "string"
                }
              }
            }
          }
        }
      },
      "aliases" : [ ]
    }
  }
}
What I get when I run a test analysis against that index:
  curl -XGET 'localhost:9200/test_en/_analyze' -d 'getting started'
  {"tokens":[{"token":"getting","start_offset":0,"end_offset":7,"type":"","position":1},{"token":"started","start_offset":8,"end_offset":15,"type":"","position":2}]}
If I explicitly specify the analyzer:
  curl -XGET 'localhost:9200/test_en/_analyze?analyzer=snowball_en' -d 'getting started'
  {"tokens":[{"token":"get","start_offset":0,"end_offset":7,"type":"","position":1},{"token":"start","start_offset":8,"end_offset":15,"type":"","position":2}]}
My understanding was that specifying 'search_analyzer' would cause
elasticsearch to analyze the query string with that analyzer, so the
two requests above should return the same tokens. Is that not the case?
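To make the difference concrete, here is a small Python sketch that parses the two _analyze responses quoted above and compares the tokens. The response bodies are copied from my output; the helper name token_texts is just something I made up for this illustration.

```python
import json

# Response from the _analyze call without an explicit analyzer.
default_response = (
    '{"tokens":[{"token":"getting","start_offset":0,"end_offset":7,'
    '"type":"","position":1},{"token":"started","start_offset":8,'
    '"end_offset":15,"type":"","position":2}]}'
)

# Response from the _analyze call with analyzer=snowball_en.
explicit_response = (
    '{"tokens":[{"token":"get","start_offset":0,"end_offset":7,'
    '"type":"","position":1},{"token":"start","start_offset":8,'
    '"end_offset":15,"type":"","position":2}]}'
)


def token_texts(response_body):
    """Return the token strings from an _analyze response body, in order."""
    return [entry["token"] for entry in json.loads(response_body)["tokens"]]


default_tokens = token_texts(default_response)
explicit_tokens = token_texts(explicit_response)

print(default_tokens)
print(explicit_tokens)
print(default_tokens == explicit_tokens)
```

The first list is unstemmed ("getting", "started") and the second is stemmed ("get", "start"), so the comparison prints False, which is exactly the mismatch I'm asking about.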
Best regards
Mattias