Terms aggregation returns enormous invalid values

Hello,

When using ES 1.7.2, the top-20 list of terms returned by a terms aggregation contains values such as 4639024275540410000, although the actual highest value is 199.

Is this a known issue with 1.7.2, and should we migrate to a newer release?

Here's the offending query:

GET metrics-2016-04/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "jvm.thread-states.count"
          }
        }
      ]
    }
  },
  "aggs": {
    "3": {
      "terms": {
        "field": "value",
        "size": 10
      }
    }
  }
}

and the result contains these enormous values:

{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 16,
      "successful": 16,
      "failed": 0
   },
   "hits": {
      "total": 9394,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "3": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 1511,
         "buckets": [
            {
               "key": 50,
               "doc_count": 1376
            },
            {
               "key": 138,
               "doc_count": 1344
            },
            {
               "key": 137,
               "doc_count": 1299
            },
            {
               "key": 144,
               "doc_count": 681
            },
            {
               "key": 143,
               "doc_count": 649
            },
            {
               "key": 141,
               "doc_count": 611
            },
            {
               "key": 142,
               "doc_count": 603
            },
            {
               "key": 4639024275540410000,
               "doc_count": 448
            },
            {
               "key": 4632233691727266000,
               "doc_count": 445
            },
            {
               "key": 4639059459912499000,
               "doc_count": 427
            }
         ]
      }
   }
}

When I perform a regular bool query with a range condition covering the highest value, I get exactly that one result and definitely nothing higher.

Query:

GET metrics-2016-04/_search
{
  "size": 10000,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "value": {
              "gte": 199,
              "lte": 1000000000000000000000
            }
          }
        },
        {
          "match": {
            "name": "jvm.thread-states.count"
          }
        }
      ]
    }
  },
  "fields": ["value"]
}

Result:

{
   "took": 12,
   "timed_out": false,
   "_shards": {
      "total": 16,
      "successful": 16,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 5.197636,
      "hits": [
         {
            "_index": "metrics-2016-04",
            "_type": "gauge",
            "_id": "AVRivuU1roeuo1W0WbLL",
            "_score": 5.197636,
            "fields": {
               "value": [
                  199
               ]
            }
         }
      ]
   }
}

Hi,

"key": 4639024275540410000,

those keys look like timestamps. Could you have two types defined for the metrics-2016-04 index with conflicting mappings for the value field? What's the output of GET metrics-2016-04/_mappings?
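Another quick check: if floating-point values were ever indexed into a field that is read back as a long, the oversized keys would be the raw IEEE-754 bit patterns of the original doubles rather than timestamps. A sketch in Python (the keys are copied from the aggregation response above; note they have already been rounded to ~16 significant digits by JSON serialization, so the decoded doubles are only approximately equal to the original values):

```python
import struct

def bits_to_double(bits):
    """Reinterpret a 64-bit integer as an IEEE-754 double."""
    return struct.unpack('>d', struct.pack('>q', bits))[0]

# Suspicious keys from the terms aggregation response above.
for key in (4639024275540410000, 4632233691727266000, 4639059459912499000):
    print(key, '->', bits_to_double(key))
# The decoded values land almost exactly on 137.0, 50.0 and 138.0 --
# the same neighborhood as the legitimate bucket keys, which would be
# consistent with a long/double mapping conflict.

# Reverse direction: the bit pattern of 137.0 has the same magnitude
# as the suspicious keys.
print(struct.unpack('>q', struct.pack('>d', 137.0))[0])
```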

Hmm, interesting idea. I think the mapping looks OK, though:

{
   "metrics-2016-04": {
      "mappings": {
         "_default_": {
            "_all": {
               "enabled": false
            },
            "properties": {
               "name": {
                  "type": "string",
                  "index": "not_analyzed"
               }
            }
         },
         "gauge": {
            "_all": {
               "enabled": false
            },
            "properties": {
               "@timestamp": {
                  "type": "date",
                  "format": "dateOptionalTime"
               },
               "instance_name": {
                  "type": "string"
               },
               "name": {
                  "type": "string",
                  "index": "not_analyzed"
               },
               "value": {
                  "type": "long"
               }
            }
         }
      }
   }
}

Now, with the new month, our monitoring tool has created a new index, and although the mapping is the same, the problem hasn't occurred there yet...