Dynamic template multi-field w/ _all combo analyzer issue?


(asanderson) #1

I'm seeing duplicate concatenated values when using the combo analyzer for
_all using a multi-field defined in a dynamic template.

e.g. Instead of seeing "Foo Bar" when listing the _all terms aggregation,
I'm seeing "Foo Bar Foo Bar" for the token because my multi-field defines 2
sub-fields. If the multi-field is defined with 4 sub-fields, then "Foo Bar"
is concatenated 4 times.

My setup is below.

Elasticsearch 1.0.0 on CentOS 6.4 with Java 1.7.0_51.

$ES_HOME/config/default-mapping.json:
{
  "default": {
    "_all": {
      "enabled": true,
      "analyzer": "combo",
      "store": false
    },
    "dynamic_templates": {
      "string_multifield_template": {
        "match": "*",
        "match_mapping_type": "string",
        "mapping": {
          "include_in_all": false,
          "fields": {
            "{name}": {
              "index": "not_analyzed",
              "store": true,
              "type": "string"
            },
            "lowercase": {
              "analyzer": "lowercase",
              "index": "analyzed",
              "store": false,
              "type": "string"
            }
          }
        }
      }
    }
  }
}

$ES_HOME/config/elasticsearch.yml:
...
index.analysis.analyzer.lowercase.type: custom
index.analysis.analyzer.lowercase.tokenizer: keyword
index.analysis.analyzer.lowercase.filter: [ lowercase ]

index.analysis.analyzer.combo.type: custom
index.analysis.analyzer.combo.sub_analyzers: [ keyword, lowercase ]
index.analysis.analyzer.combo.deduplication: true
index.analysis.analyzer.combo.tokenstream_reuse: false
...

The aggregation query I use is the following:
{
  "aggs": {
    "_all": {
      "terms": {
        "field": "_all"
      }
    }
  }
}
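
For anyone trying to reproduce this, here is a rough Python sketch of how the duplicated terms could be spotted in the aggregation response. It only assumes the standard terms aggregation response shape (`aggregations` -> aggregation name -> `buckets` with `key` and `doc_count`). The helper names `repeated_unit` and `find_duplicated_terms` and the sample response are made up for illustration; they are not actual output from my cluster.

```python
import json

# The terms aggregation from this post, as a Python dict.
query = {"aggs": {"_all": {"terms": {"field": "_all"}}}}


def repeated_unit(key):
    """Return (unit, count) if key is one phrase repeated count >= 2 times,
    separated by single spaces; otherwise return None."""
    words = key.split(" ")
    n = len(words)
    # Try every candidate phrase length that divides the word count evenly.
    for size in range(1, n // 2 + 1):
        if n % size:
            continue
        unit = words[:size]
        if all(words[i:i + size] == unit for i in range(size, n, size)):
            return " ".join(unit), n // size
    return None


def find_duplicated_terms(response, agg_name="_all"):
    """Map each bucket key that looks like a repeated phrase to (unit, count)."""
    found = {}
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hit = repeated_unit(bucket["key"])
        if hit:
            found[bucket["key"]] = hit
    return found


# Hypothetical response illustrating the symptom described above.
sample_response = {
    "aggregations": {
        "_all": {
            "buckets": [
                {"key": "Foo Bar Foo Bar", "doc_count": 1},
                {"key": "foo bar", "doc_count": 1},
            ]
        }
    }
}

print(json.dumps(query))
print(find_duplicated_terms(sample_response))
```

The check is purely textual, so a value that legitimately repeats a phrase (e.g. "New York New York") would also be flagged; it is only meant as a quick way to see whether the repeat count tracks the number of sub-fields.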

Thoughts?



(system) #2