Wrong results for nested aggregation on numerical fields


(Georgiana) #1

I have an ES 2.0 index with a custom mapping on which I run some aggregations. The problem is that my aggregations on fields of type 'double' return values that do not appear when I run the same query on the MongoDB collection the data came from, even though the number of results (i.e. 'doc_count') is the same.
The fields I have checked so far that show this problem are amount.value.amount, amount.value.x_amountEur and tender.value.x_amountEur. My mappings, as returned by curl -XGET 'http://localhost:9200/index_name/_mappings?pretty&human', are in this gist. The aggregation that I used is here.
For the sake of brevity, I will paste only one of the results:


"aggregations":
 {"entities":
   {"doc_count":24300,
    "procuring_entity_names":
     {"doc_count_error_upper_bound":0,
      "sum_other_doc_count":0,
      "buckets":
       [{"key":"vsia-bernu-kliniska-universitates-slimnica",
         "doc_count":1360,
         "suppliers":
          {"doc_count":1360,
           "suppliers_names":
            {"doc_count_error_upper_bound":0,
             "sum_other_doc_count":0,
             "buckets":
              [{"key":"recipe-plus-as",
                "doc_count":388,
                "awards":
                 {"doc_count":388,
                  "award_amounts":
                   {"doc_count_error_upper_bound":0,
                    "sum_other_doc_count":0,
                    "buckets":
                     [{"key":3679.08661250056, "doc_count":372},
                      {"key":0.0, "doc_count":13},
                      {"key":80472.0, "doc_count":1},
                      {"key":331636.6, "doc_count":1},
                      {"key":342348.935935837, "doc_count":1}]}}}

You'll notice that, besides the expected value, the last bucket list contains four other values for that field. A MongoDB query for the same field and the same procuring entity/supplier pair returns 388 documents that all have the value 3679.08661250056, for which ES finds only 372. MongoDB shows no trace of the other values.
I also tried mapping the fields as 'float', 'long' and 'string'. With 'float' and 'string' the results were the same, and with 'long' the incorrect values were even more numerous. I have a record of them if someone thinks that would help.
I have performed the same operations on a different machine with a clean ES and the results were the same.
I've been investigating this for a while and I have run out of ideas about what it could be, so any help would be greatly appreciated.
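
To make the mismatch concrete, here is a small Python check of the numbers pasted above. It only uses the bucket keys and counts from the "award_amounts" result; the variable names are made up for illustration.

```python
# Bucket keys and doc counts copied from the "award_amounts" result above.
award_buckets = {
    3679.08661250056: 372,
    0.0: 13,
    80472.0: 1,
    331636.6: 1,
    342348.935935837: 1,
}

# doc_count reported by the parent "awards" aggregation, which matches
# the number of documents MongoDB returns for the same pair.
parent_doc_count = 388

# Value that MongoDB reports for all 388 documents.
expected_value = 3679.08661250056

total = sum(award_buckets.values())
misplaced = total - award_buckets[expected_value]

print("bucket counts add up to parent doc_count:", total == parent_doc_count)
print("documents reported under an unexpected value:", misplaced)
```

So the totals agree (388 in both systems), but 16 documents end up in buckets for values that MongoDB does not show for this pair.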


(Georgiana) #2

After further investigation, I found that the problem was in how I was retrieving the values. Using a reverse_nested aggregation for all the nested aggregations involved, except for the first one, gives the correct response.
I'll leave the aggregation that returns the correct result in this gist in case anyone who encounters the same problem wants to have a look.
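
For readers who cannot open the gist, here is a minimal sketch of the general pattern, written as a Python dict that can be serialized to a search request body. It is not the content of the gist. The aggregation names are taken from the result pasted in the first post, but the nested path and the field names are assumptions, since the real mapping is not shown in this thread. It covers only the first two levels: a nested aggregation, then a reverse_nested aggregation that joins back to the root document before the next terms aggregation.

```python
import json

# Hypothetical names: the thread does not show the real mapping or field paths.
ENTITY_PATH = "entities"               # assumed nested object path
ENTITY_NAME_FIELD = "entities.name"    # assumed field inside the nested object
SUPPLIER_NAME_FIELD = "supplier_name"  # assumed field on the root document


def build_aggregation():
    """Build a nested -> terms -> reverse_nested -> terms request body."""
    return {
        "size": 0,  # return only aggregation results, no hits
        "aggs": {
            "entities": {
                # First level: step into the nested documents.
                "nested": {"path": ENTITY_PATH},
                "aggs": {
                    "procuring_entity_names": {
                        "terms": {"field": ENTITY_NAME_FIELD},
                        "aggs": {
                            "suppliers": {
                                # Join back to the root document instead of
                                # opening another nested aggregation.
                                "reverse_nested": {},
                                "aggs": {
                                    "suppliers_names": {
                                        "terms": {"field": SUPPLIER_NAME_FIELD}
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_aggregation(), indent=2))
```

The printed JSON could be sent as the body of a _search request against the index. Whether the deeper levels ("awards" and "award_amounts") need the same treatment depends on the actual mapping.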

