Hi, we are running Elasticsearch 2.3.0.
This issue is 100% reproducible.
We have a 12-node cluster with 20GB of RAM per node, so 240GB in total.
There are two indexes: Index1 has about 35,000,000 records and Index2 about 6,000,000. The two indexes are "identical" except for the mapping difference noted below (we attempted a new mapping).
Index1 has a mapping of...
"myDate": {
  "format": "dateOptionalTime",
  "type": "date"
},
Index2 has a mapping of...
"myDate": {
  "type": "long"
},
The document type is the same on both indexes. The documents are inserted with myDate as yyyyMMdd (days only, no time component).
Index1 has...
8,000,000 documents for 20160101
6,000,000 documents for 20160102
7,000,000 documents for 20160103
8,000,000 documents for 20160104
Index2 has...
6,000,000 documents for 20160407
When we run the query below, the cluster crashes. We lose nodes...
GET index*/myType/_search
{
  "size" : 0,
  "aggregations" : {
    "Date" : {
      "date_histogram" : {
        "field" : "myDate",
        "interval" : "1d"
      },
      "aggregations" : {
        "Record Count" : {
          "value_count" : {
            "field" : "myId"
          }
        }
      }
    }
  },
  "query" : {
    "bool" : {
      "must" : {
        "match" : {
          "myUser" : {
            "type" : "phrase",
            "query" : "user1"
          }
        }
      }
    }
  }
}
If we run the same agg on each index individually, without the wildcard, it seems to work. However, we have noticed that on Index1 the agg returns a couple thousand records for each "day", whereas on Index2 the agg returns a bucket with a doc count.
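One possible reading of the Index2 result, offered as an assumption rather than a confirmed diagnosis: if date_histogram treats the value of a long field as milliseconds since the Unix epoch, then 20160407 is only a few hours into 1970-01-01, so every document would land in the same daily bucket. The Python sketch below only illustrates that arithmetic; the helper name as_epoch_millis is hypothetical and nothing here calls Elasticsearch.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch in UTC, used as the reference point.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def as_epoch_millis(value):
    """Interpret an integer as milliseconds since the Unix epoch (UTC)."""
    return EPOCH + timedelta(milliseconds=value)


# The yyyyMMdd value stored in Index2, read as a millisecond count
# (assumption: this mirrors how a long field would be bucketed).
moment = as_epoch_millis(20160407)
print(moment.date().isoformat())
```

Under that assumption the printed date is in 1970, not 2016, which would be consistent with the one bucket seen on Index2.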
The problem occurs when we run the agg with a wildcard across both indexes: that is when we lose the cluster. From an application standpoint, we are trying to rectify the issue by revising the mapping and the data we insert.
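As a rough illustration of the data-side change, and assuming the revision means sending myDate as an ISO 8601 date string (for example 2016-01-01) instead of a bare yyyyMMdd value, the conversion could be sketched in Python as follows. The helper name to_iso_date is hypothetical and is not taken from the application.

```python
from datetime import datetime


def to_iso_date(yyyymmdd):
    """Convert a yyyyMMdd value (int or str) to an ISO 8601 date string."""
    # strptime raises ValueError if the input is not a valid yyyyMMdd date.
    parsed = datetime.strptime(str(yyyymmdd), "%Y%m%d")
    return parsed.date().isoformat()


print(to_iso_date(20160101))
print(to_iso_date("20160407"))
```

Sending one unambiguous format to both indexes, together with the same date mapping on each, should at least remove the mismatch between the two myDate definitions.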
We just wanted to let you know that the above combination wreaks havoc on Elasticsearch. Hopefully it is something you can reproduce and fix to avoid this kind of crash.
Thanks