After the AWS hardware failure, we've been encountering an error that we've never seen before on our elastic.co cloud cluster.
Reference: https://discuss.elastic.co/t/problem-cluster-is-under-maintenance-recovery-for-the-last-hour/76090
I can blow away our mapping and template and rebuild my indexes from scratch, but then I appear to be encountering sporadic index corruption, even with the dual-data-center capability now turned on.
The issue manifests itself as follows. This is on the same cluster, cluster ID "a02c49", as in the forum topic referenced above.
The query:
curl -XPOST 'https://user:pw@host.us-west-1.aws.found.io:port/silver_jobs-*/_search' -d '{"query":{"bool":{"should":[{"term":{"consolidated_status":"complete"}},{"term":{"consolidated_status":"passed"}},{"term":{"consolidated_status":"failed"}},{"term":{"consolidated_status":"errored"}}],"filter":[{"range":{"at0_creation_time":{"gte":1487592000000,"lte":1488218400000,"format":"epoch_millis"}}}],"minimum_number_should_match":1}},"aggs":{"byDate":{"date_histogram":{"field":"at0_creation_time","interval":"6h"},"aggs":{"byStatus":{"terms":{"field":"consolidated_status"}}}}},"size":0,"sort":{"at0_creation_time":{"order":"desc"}}}'
The output:
{
"took": 1208,
"timed_out": false,
"_shards": {
"total": 23,
"successful": 13,
"failed": 10,
"failures": [
{
"shard": 0,
"index": "silver_jobs-2017-02-26",
"node": "jkQeLJ8VQ2WDFfSBdZpkew",
"reason": {
"type": "illegal_argument_exception",
"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [consolidated_status] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."
}
}
]
},
"hits": {
"total": 8049211,
"max_score": 0,
"hits": []
},
"aggregations": { ...
The reason this looks like sporadic corruption to me is that the field being complained about, consolidated_status, is a keyword field as defined in the mapping:
"consolidated_status": { "type": "keyword", "copy_to": "ft" },
The ft field is analyzed, but not consolidated_status itself.
I had thought that sporadic index corruption was left behind in the 2.4.1 series, but we appear to be seeing it with Elasticsearch 5.1.1 as well.
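Since the query targets silver_jobs-* and only some shards fail, one check that may help narrow this down is to compare how consolidated_status is actually mapped in each daily index against the keyword definition above. The following is a minimal sketch, not a diagnosis: it assumes the JSON returned by the get field mapping API (GET silver_jobs-*/_mapping/field/consolidated_status) has already been parsed into a dict, and that the response nests as index, then "mappings", then document type, then field. The function name, the index names, and the "job" type in the sample are hypothetical and do not reflect the real cluster.

```python
def indices_with_unexpected_type(field_mappings, field, expected_type):
    """Return {index: [actual types]} for every index in which `field`
    is mapped to something other than `expected_type`.

    `field_mappings` is assumed to be the parsed body of a get field
    mapping response: index -> "mappings" -> doc type -> field.
    """
    leaf = field.split(".")[-1]  # the inner "mapping" object is keyed by leaf name
    mismatches = {}
    for index, body in field_mappings.items():
        for _doc_type, fields in body.get("mappings", {}).items():
            entry = fields.get(field)
            if entry is None:
                continue  # field not mapped for this type; nothing to compare
            actual = entry.get("mapping", {}).get(leaf, {}).get("type")
            if actual != expected_type:
                mismatches.setdefault(index, []).append(actual)
    return mismatches


# Hypothetical sample response, for illustration only.
sample = {
    "silver_jobs-example-a": {
        "mappings": {
            "job": {
                "consolidated_status": {
                    "full_name": "consolidated_status",
                    "mapping": {"consolidated_status": {"type": "keyword"}},
                }
            }
        }
    },
    "silver_jobs-example-b": {
        "mappings": {
            "job": {
                "consolidated_status": {
                    "full_name": "consolidated_status",
                    "mapping": {"consolidated_status": {"type": "text"}},
                }
            }
        }
    },
}

print(indices_with_unexpected_type(sample, "consolidated_status", "keyword"))
```

If a check like this reports nothing for the real response, every index carries the keyword mapping and the mismatch lies elsewhere; if it lists some indices, those are the ones whose mapping differs from the definition quoted above.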