ES 6.1.0 Aggregations Empty


(Shreyas Karnik) #1

I have about 400K documents in ES (3-node cluster). The documents have the following format:

{
    "@timestamp": "2017-11-21T06:49:39.800Z",
    "message": "xxxx",
    "username": "xxxx",
    "geoip": {
        "country_name": "United States",
        "region_name": "Washington",
        "city_name": "Seattle"
    }
}

I am performing nested aggregations using the following query:

{
    "size": 0,
    "query": {
        "bool": {
            "must": [
                {
                    "range": {
                        "@timestamp": {
                            "gte": "now-90d/d",
                            "lt": "now/d",
                            "time_zone": "US/Pacific"
                        }
                    }
                }
            ]
        }
    },
    "sort": [
        {
            "@timestamp": {
                "order": "desc"
            }
        }
    ],
    "aggs": {
        "countries": {
            "terms": {
                "size": 1000000,
                "field": "geoip.country_name.keyword"
            },
            "aggs": {
                "regions": {
                    "terms": {
                        "size": 1000000,
                        "field": "geoip.region_name.keyword"
                    },
                    "aggs": {
                        "cities": {
                            "terms": {
                                "size": 1000000,
                                "field": "geoip.city_name.keyword"
                            },
                            "aggs": {
                                "users": {
                                    "terms": {
                                        "size": 1000000,
                                        "field": "username.keyword"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

I do get results back for this query and the aggregations are computed. However, if I change the time range to target only the most recent documents, for example:

{
    "range": {
        "@timestamp": {
            "gte": "now-2d/d",
            "lt": "now/d",
            "time_zone": "US/Pacific"
        }
    }
}

I get hits back but no aggregations. I can confirm that there are documents matching the time range I provided, and the data format has not changed either.

The only thing I can think of is that I recently upgraded from ES 6.0.0 to ES 6.1.0.
I find it odd that ES computes aggregations for a slice containing older documents but not for newer documents.

Can anyone point me in the right direction to solve this issue?


(David Pilato) #2

What if you remove

"size": 1000000

from each agg?


(Shreyas Karnik) #3

I just tried that and still get empty aggregations:

{
  "took": 196,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 20266844,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "countries": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}

(David Pilato) #4

Are you running the query on multiple indices, like index-*?

I wonder if the mapping is different for the most recent indices?

Could you check the mapping you have for a recent index?
And if it looks ok to you, could you reproduce the problem with a small script like:

DELETE index
PUT index
{
  "settings": {},
  "mappings": {}
}
PUT index/doc/1
{
    "@timestamp": "2017-11-21T06:49:39.800Z",
    "message": "xxxx",
    "username": "xxxx",
    "geoip": {
        "country_name": "United States",
        "region_name": "Washington",
        "city_name": "Seattle"
    }
}
GET index/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gte": "now-2d/d",
              "lt": "now/d",
              "time_zone": "US/Pacific"
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "@timestamp": {
        "order": "desc"
      }
    }
  ],
  "aggs": {
    "countries": {
      "terms": {
        "size": 1000000,
        "field": "geoip.country_name.keyword"
      },
      "aggs": {
        "regions": {
          "terms": {
            "size": 1000000,
            "field": "geoip.region_name.keyword"
          },
          "aggs": {
            "cities": {
              "terms": {
                "size": 1000000,
                "field": "geoip.city_name.keyword"
              },
              "aggs": {
                "users": {
                  "terms": {
                    "size": 1000000,
                    "field": "username.keyword"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
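
A complementary check to the reproduction script above could be to count how many documents in the recent time range actually carry the keyword sub-field that the terms aggregation targets. The following is only a minimal sketch: it builds the search body in Python using the standard library, the helper name recent_docs_with_field is hypothetical, and the body would be sent to GET <index>/_search with whatever client is at hand. A hits.total of 0 for this body, while the plain range query returns hits, would suggest the sub-field is absent from the recent documents.

```python
import json


def recent_docs_with_field(field, gte="now-2d/d", lt="now/d", time_zone="US/Pacific"):
    """Build a search body matching documents in the time range that have `field`.

    Hypothetical helper: only the range values come from the thread.
    """
    return {
        "size": 0,  # only hits.total is of interest here
        "query": {
            "bool": {
                "filter": [
                    {"range": {"@timestamp": {"gte": gte, "lt": lt, "time_zone": time_zone}}},
                    # Matches only documents that have an indexed value for the field.
                    {"exists": {"field": field}},
                ]
            }
        },
    }


body = recent_docs_with_field("geoip.country_name.keyword")
print(json.dumps(body, indent=2))
```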

(Shreyas Karnik) #5

No, I am not; all of the data is in a single index.

Here is the mapping from the index (from /app-access/_mappings):

{
    "app-access": {
        "mappings": {
            "doc": { 
              // ES 6.0 onwards
                "_all": {
                    "enabled": true
                },
                "properties": {
                    "@timestamp": {
                        "type": "date"
                    },
                    "geoip": {
                        "properties": {
                            "city_name": {
                                "type": "text"
                            },
                            "country_name": {
                                "type": "text"
                            },
                            "region_name": {
                                "type": "text"
                            }
                        }
                    }
                },
                "app-access": {
                   // mapping ES 5.6.X
                    "_all": {
                        "enabled": true
                    },
                    "properties": {
                        "@timestamp": {
                            "type": "date"
                        },
                        "geoip": {
                            "properties": {
                                "city_name": {
                                    "type": "text",
                                    "fields": {
                                        "keyword": {
                                            "type": "keyword",
                                            "ignore_above": 256
                                        }
                                    }
                                },
                                "country_name": {
                                    "type": "text",
                                    "fields": {
                                        "keyword": {
                                            "type": "keyword",
                                            "ignore_above": 256
                                        }
                                    }
                                },
                                "region_name": {
                                    "type": "text",
                                    "fields": {
                                        "keyword": {
                                            "type": "keyword",
                                            "ignore_above": 256
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

Upon close inspection I see that when the upgrade to ES 6.0 took place, a new mapping was created under the type doc. The earlier mapping defines the keyword sub-fields ("type": "keyword"), but the new mapping, which applies to documents indexed after the upgrade to ES 6.X, does not. Since the aggregations target the .keyword sub-fields, they come back empty for the latest documents (I guess).
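
To make the difference between the two mappings concrete, here is a minimal Python sketch (standard library only) that walks one type's mapping and reports which aggregation fields are not defined in it. The function name missing_fields and the trimmed sample mappings are illustrative assumptions; in practice the input would be the output of the _mappings call shown above.

```python
def missing_fields(type_mapping, field_paths):
    """Return the dotted field paths that are not defined in one type's mapping."""

    def defined(path):
        parts = path.split(".")
        node = type_mapping
        for i, part in enumerate(parts):
            props = node.get("properties", {})
            if part in props:
                node = props[part]
                continue
            # Not an object property: it can only be a multi-field such as
            # "keyword", and a multi-field must be the last path segment.
            return part in node.get("fields", {}) and i == len(parts) - 1
        return True

    return [p for p in field_paths if not defined(p)]


# Trimmed copies of the two mappings from the thread (one field each).
old_type = {"properties": {"geoip": {"properties": {"country_name": {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}}}}}
new_type = {"properties": {"geoip": {"properties": {"country_name": {"type": "text"}}}}}

agg_fields = ["geoip.country_name.keyword"]
print("old mapping is missing:", missing_fields(old_type, agg_fields))
print("new mapping is missing:", missing_fields(new_type, agg_fields))
```

A terms aggregation on a field that is missing from the mapping returns an empty bucket list rather than an error, which matches the response shown earlier in the thread.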

To provide more context: I am using the logstash-output-elasticsearch plugin to ingest the logs. Since multiple _types are no longer supported as of ES 6.X, I removed the explicit document_type => "access_logs" from my Logstash config, and I can see why this mapping issue happened.

So at this point, should I use the reindex API to fix this issue, or is there a prescribed way to solve this?

I am thankful for the great pointer about checking the mapping for this index.


(Shreyas Karnik) #6

Thanks @dadoonet for pointing me towards the mappings. I was able to use the reindex API to fix this issue.
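
The exact requests used are not shown in this thread. As a rough sketch of what such a reindex could look like, the Python below (standard library only) builds two request bodies: one that creates a destination index with an explicit mapping restoring the keyword sub-fields, and one for POST _reindex. The destination index name app-access-v2, the helper text_with_keyword, and the username mapping are assumptions for illustration; the script that sets ctx._type follows the pattern documented for collapsing several types into one in ES 6.x.

```python
import json

SOURCE_INDEX = "app-access"
DEST_INDEX = "app-access-v2"  # hypothetical name, not from the thread


def text_with_keyword():
    """Text field with the keyword sub-field that the aggregations rely on."""
    return {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }


# Body for: PUT app-access-v2
create_index_body = {
    "mappings": {
        "doc": {
            "properties": {
                "@timestamp": {"type": "date"},
                # Assumed: username is not shown in the posted mapping.
                "username": text_with_keyword(),
                "geoip": {
                    "properties": {
                        "city_name": text_with_keyword(),
                        "country_name": text_with_keyword(),
                        "region_name": text_with_keyword(),
                    }
                },
            }
        }
    }
}

# Body for: POST _reindex
reindex_body = {
    "source": {"index": SOURCE_INDEX},
    "dest": {"index": DEST_INDEX},
    # Put every document under the single "doc" type in the new index.
    "script": {"lang": "painless", "source": "ctx._type = 'doc'"},
}

print("PUT " + DEST_INDEX)
print(json.dumps(create_index_body, indent=2))
print("POST _reindex")
print(json.dumps(reindex_body, indent=2))
```

Once the new index is verified, queries and the Logstash output would need to point at it, for example through an alias.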

