We are using Elasticsearch 2.1.1 with a 3-node cluster. Recently, Elasticsearch has been constantly initializing many shards across multiple indexes and getting stuck in that state. The cluster health stays red, and we have found no obvious way to recover other than deleting all the indexes and restarting the Elasticsearch cluster.
From the logs and from searching online, we found that this is probably caused by the data types of certain fields in the indexes changing dynamically, which leads to a mapping discrepancy between primary and replica shards. Here are some of the exceptions reported in the ES logs:
MergeMappingException[Merge failed with failures {[mapper [cpuPercent] of different type, current_type [long], merged_type [double]]}]
IllegalArgumentExceptions, which occur in the 'mona' mapping:
java.lang.IllegalArgumentException: Mapper for [response] conflicts with existing mapping in other types [Can't merge a non object mapping [response.headers] with an object mapping [response.headers]]
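To see how the conflicting fields are currently mapped and which shards are stuck, requests along the following lines can help. This is a minimal sketch, not something taken from our logs: it assumes the same host and index name used elsewhere in this post, and it only prints the commands unless RUN=1 is set. The helper names `field_mapping_url` and `es_get` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: inspect cluster health, shard states and the mapping of the
# fields named in the exceptions above.
ES_URL="http://elk:9200"          # host used elsewhere in this post
INDEX="logstash-2016.06.02"       # one of the daily Logstash indexes
RUN="${RUN:-0}"                   # set RUN=1 to query a live cluster

# Build the get-field-mapping URL for a single field (hypothetical helper).
field_mapping_url() {
  printf '%s/%s/_mapping/field/%s?pretty' "$ES_URL" "$INDEX" "$1"
}

# Run the GET request, or just print it when in dry-run mode (the default).
es_get() {
  if [ "$RUN" = "1" ]; then
    curl -s "$1"
  else
    echo "curl -s '$1'"
  fi
}

es_get "$ES_URL/_cluster/health?pretty"          # overall status: red/yellow/green
es_get "$ES_URL/_cat/shards?v"                   # per-shard state, e.g. INITIALIZING
es_get "$(field_mapping_url cpuPercent)"         # type currently mapped for cpuPercent
es_get "$(field_mapping_url response.headers)"   # object vs. non-object mapping
```

Comparing the mapped type of a field with the type of the values actually being sent is one way to confirm which documents trigger the conflict.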
We subsequently found that ES allows ignore_malformed to be set to true at the index level for all mappings. With this setting, ES ignores the 'bad' field values and accepts all the other 'normal' fields of a document, so that the cluster does not fall into the red state where it is unable to allocate shards.
However, we are only able to set this at the index level, after the daily index has been created (via Logstash). This is the REST call we use to set it:
curl -XPUT 'http://elk:9200/logstash-2016.06.02/_settings' -d '{
  "index" : {
    "mapping.ignore_malformed" : true
  }
}'
Since Logstash creates a new index every day, changing the settings by hand on each index is not a workable approach.
We tried modifying elasticsearch.yml as follows:
......
#################################### Index ####################################
# You can set a number of options (such as shard/replica options, mapping
# or analyzer definitions, translog settings, ...) for indices globally,
# in this file.
# Note, that it makes more sense to configure index settings specifically for
# a certain index, either when creating it or by using the index templates API.
# See http://elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules.html and
# http://elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html
# for more information.
index.mapping.ignore_malformed: true
# Set the number of shards (splits) of an index (5 by default):
index.number_of_shards: 5
# Set the number of replicas (additional copies) of an index (1 by default):
index.number_of_replicas: 1
......
But this does not seem to take effect. A REST query of the settings of a newly created index shows only the following:
{
  "logstash-2016.06.02": {
    "settings": {
      "index": {
        "creation_date": "1464874876688",
        "refresh_interval": "5s",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "7nXLQdWQQc6Bue-XhODcMA",
        "version": {
          "created": "2010199"
        }
      }
    }
  }
}
So the question is: is there a global way of setting ignore_malformed? And if not, how can we solve this situation?
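For reference, the comment in elasticsearch.yml points at the index templates API, so one candidate would be a template that applies the setting to every new logstash-* index. The following is only a sketch that we have not verified on our cluster: the template name is hypothetical, the "order" value assumes the Logstash-managed template uses the default order of 0, and the script only prints the request unless RUN=1 is set.

```shell
#!/usr/bin/env bash
# Sketch only, untested: register an index template so that each newly
# created logstash-* index gets ignore_malformed enabled.
ES_URL="http://elk:9200"                    # host used elsewhere in this post
TEMPLATE_NAME="logstash_ignore_malformed"   # hypothetical template name
RUN="${RUN:-0}"                             # set RUN=1 to send the request

# "template" is the index-name pattern (ES 2.x template syntax).
# "order": 1 is an assumption, meant to layer these settings on top of an
# order-0 template such as the one Logstash manages.
TEMPLATE_BODY='{
  "template": "logstash-*",
  "order": 1,
  "settings": {
    "index.mapping.ignore_malformed": true
  }
}'

if [ "$RUN" = "1" ]; then
  curl -s -XPUT "$ES_URL/_template/$TEMPLATE_NAME" -d "$TEMPLATE_BODY"
else
  # Dry run: show what would be sent.
  echo "PUT $ES_URL/_template/$TEMPLATE_NAME"
  echo "$TEMPLATE_BODY"
fi
```

A template is applied only when an index is created, so existing indexes would be unaffected. As far as we understand, ignore_malformed concerns values that cannot be parsed into the mapped type, so it may not help with the object versus non-object conflict on response.headers.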