We started using aggregations on Elasticsearch 2.3.3 and we are seeing a spike in fielddata usage. I have a few basic questions regarding fielddata.
We are only doing a cardinality aggregation (with the default precision settings) on a single field, which is not analyzed. The field's mapping looks like this:
....
"name" : {
  "type" : "string",
  "index" : "not_analyzed",
  "fields" : {
    "lowercase" : {
      "type" : "string",
      "analyzer" : "lower_keyword"
    }
  }
}
....
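For reference, the aggregation itself is essentially the following (the index and aggregation names here are illustrative, not our real ones):

    POST /our_index/_search
    {
      "size" : 0,
      "aggs" : {
        "distinct_names" : {
          "cardinality" : {
            "field" : "name"
          }
        }
      }
    }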
According to the documentation, since the field is not_analyzed, doc values should be used instead of fielddata. However, looking at our metrics, fielddata for this field is consuming a whopping ~17 GB of heap on each of our hosts (see below). Is there any reason why this can happen?
"indices" : {
"fielddata" : {
"memory_size_in_bytes" : 18575882472,
"evictions" : 0,
"fields" : {
"name" : {
"memory_size_in_bytes" : 18575882472
}
}
}
}
}
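For completeness, we pulled the numbers above from the node stats API with the per-field fielddata breakdown, roughly like this (the fields parameter is standard; the field names are ours):

    GET /_nodes/stats/indices/fielddata?fields=name,name.lowercase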