Search analyzer not being applied

Hi,

I've got an index which has been configured to use the snowball
analyzer (English) as both index_analyzer and search_analyzer. The
problem is that it doesn't appear to be applied to any search queries,
but works perfectly at indexing time.

My analyzer in elasticsearch.json:
"index":{
  "analysis":{
    "analyzer":{
      "snowball_en":{
        "type":"snowball",
        "language":"English"
      }
    }
  }
}

Looking at _cluster/state, I've successfully configured a template
that will assign that analyzer to any new index ending in "_en".
"templates" : {
  "english_index" : {
    "template" : "*_en",
    "order" : 0,
    "settings" : {
    },
    "mappings" : {
      "webpage" : {
        "index_analyzer" : "snowball_en",
        "search_analyzer" : "snowball_en"
      }
    }
  }
}

Again in _cluster/state, the english index:
"indices" : {
  "test_en" : {
    "state" : "open",
    "settings" : {
      "index.number_of_shards" : "5",
      "index.number_of_replicas" : "1"
    },
    "mappings" : {
      "webpage" : {
        "_source" : {
          "compress" : true
        },
        "dynamic_templates" : [ {
          "everything_else" : {
            "mapping" : {
              "include_in_all" : false,
              "index" : "not_analyzed",
              "type" : "string"
            },
            "match_mapping_type" : "string",
            "match" : "*"
          }
        } ],
        "analyzer" : "snowball_en",
        "properties" : {
          "id" : {
            "include_in_all" : false,
            "index" : "not_analyzed",
            "type" : "string"
          },
          "title" : {
            "include_in_all" : true,
            "type" : "string"
          },
          "text" : {
            "include_in_all" : true,
            "type" : "string"
          },
          "language" : {
            "include_in_all" : false,
            "index" : "not_analyzed",
            "type" : "string"
          },
          "url" : {
            "include_in_all" : true,
            "index" : "not_analyzed",
            "type" : "string"
          },
          "fields" : {
            "dynamic" : "true",
            "properties" : {
              "tags" : {
                "include_in_all" : false,
                "index" : "not_analyzed",
                "type" : "string"
              }
            }
          }
        }
      }
    },
    "aliases" : [ ]
  }
}
}

What I get when I run a test analysis against that index:
curl -XGET 'localhost:9200/test_en/_analyze' -d 'getting started'
{"tokens":[{"token":"getting","start_offset":0,"end_offset":7,"type":"","position":1},{"token":"started","start_offset":8,"end_offset":15,"type":"","position":2}]}

If I explicitly specify the analyzer:
curl -XGET 'localhost:9200/test_en/_analyze?analyzer=snowball_en' -d 'getting started'
{"tokens":[{"token":"get","start_offset":0,"end_offset":7,"type":"","position":1},{"token":"start","start_offset":8,"end_offset":15,"type":"","position":2}]}

My understanding was that specifying 'search_analyzer' would cause
elasticsearch to analyze the query string, so the two requests above
should return the same result. Is that not the case?

Best regards
Mattias

You set it at the mapping level for webpage, which has no concept of
index_analyzer or search_analyzer. Do you want to set it as the default
analyzer for the index? If so, in your analysis config, simply configure an
analyzer named "default" that has the snowball properties.
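
As a rough illustration of that suggestion, here is a minimal Python sketch that builds the same analysis settings as in the original post, but with the analyzer registered under the name "default" instead of "snowball_en". The helper name default_analyzer_settings is hypothetical, and the sketch only builds and prints the settings; how they are loaded (elasticsearch.json, index creation, or a template) is left open.

```python
import json

def default_analyzer_settings(language):
    """Build index settings whose analyzer named "default" is a snowball analyzer.

    An analyzer registered under the name "default" is the one the index falls
    back to when a field or request does not name an analyzer explicitly.
    """
    return {
        "index": {
            "analysis": {
                "analyzer": {
                    # Same properties as "snowball_en" in the original post,
                    # only the analyzer name changes.
                    "default": {"type": "snowball", "language": language}
                }
            }
        }
    }

settings = default_analyzer_settings("English")
print(json.dumps(settings, indent=2))
```

With settings like these in place, the plain _analyze request (without ?analyzer=snowball_en) would be expected to stem the input the same way the explicit one does.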

On Wed, Aug 31, 2011 at 4:43 PM, Mattias Nordberg <
mattias.nordberg@gmail.com> wrote:

Hi,

I've got an index which has been configured to use the snowball
analyzer (English) as both index_analyzer and search_analyzer. The
problem is that it doesn't appear to be applied to any search queries,
but works perfectly at indexing time.

Well, yes, but I want the analyzer to depend on the language of the
index, which I indicate with the language-code suffix on the index name.
I've got two templates, one for English indices and one for Swedish
indices: _en should get snowball_en and _sv should get snowball_sv.
Can I set the analyzer in the template? I tried, but as you say it
ended up on the mapping :)

Many thanks
Mattias

I got it working by passing in the module settings in the template, as
the documentation mentioned :) Thanks for your help.
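
The exact template is not shown in the thread, so the following Python sketch is only one plausible reading of "passing in the module settings in the template": the analysis settings move into each template's "settings" section, with a "default" analyzer per language. The helper language_template, the LANGUAGES table, and the "Swedish" language string are assumptions for illustration; the nested "index" / "analysis" layout mirrors the elasticsearch.json snippet from the original post.

```python
import json

# Index-name suffix -> snowball language. The "sv" entry is inferred from the
# "snowball_sv" analyzer mentioned in the thread; the language string
# "Swedish" is an assumption.
LANGUAGES = {"en": "English", "sv": "Swedish"}

def language_template(suffix, language):
    """Build a template body for indices whose names end in _<suffix>."""
    return {
        "template": "*_" + suffix,
        "order": 0,
        # Analysis config lives in the template's settings rather than in the
        # mapping, so each new matching index gets its own default analyzer.
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "default": {"type": "snowball", "language": language}
                    }
                }
            }
        },
    }

templates = {s: language_template(s, lang) for s, lang in LANGUAGES.items()}
for suffix, body in templates.items():
    print(suffix, json.dumps(body))
```

Each printed body could then be registered as its own index template, so that an index such as test_en picks up the English analyzer at creation time without any analyzer being named in the mapping.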
