Histogram aggregation on collapse result

Hi,

I need to find the first login of each user and then apply a date_histogram aggregation to those first-login timestamps. I tried the top_hits aggregation, the min aggregation, and collapse. All of them return each user's first login, but I don't know how to apply a histogram aggregation to the top_hits or min result. As for collapse, the docs say "The collapsing is applied to the top hits only and does not affect aggregations." Is there a way to do this?

Thank you so much.

Example data:

      {
        "_id" : "d78f4a88",
        "@timestamp" : "2019-08-23T20:03:13.297608",
        "eventType" : "login",
        "userId" : "9784a0008cf2"
      },
      {
        "_id" : "78852d56",
        "@timestamp" : "2019-08-27T18:13:58.963763",
        "eventType" : "login",
        "userId" : "9784a0008cf2"
      },
      {
        "_id" : "6a7b9406",
        "@timestamp" : "2019-08-28T03:47:04.704077",
        "eventType" : "login",
        "userId" : "3b3be93b0751"
      },
      {
        "_id" : "23490d4",
        "@timestamp" : "2019-08-28T23:54:23.704586",
        "eventType" : "login",
        "userId" : "3b3be93b0751"
      }

Thanks.

This sort of behavioural analysis is expensive - especially if the data is distributed across many nodes and the cardinality of userId is high.
You need to bring the related data physically closer together using an entity-centric index keyed on userId. The new data frame transforms feature would allow you to pivot your data in this way to make the analysis possible. You would group on userId and record each user's start date using the min aggregation on the @timestamp field. A date_histogram over that start date in the new index then gives the number of first logins per interval.
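To make the two steps concrete, here is a minimal sketch in plain Python that applies the same logic to the sample documents from the question. It does not call Elasticsearch; the function names `first_logins` and `daily_histogram` are made up for illustration, and a one-day interval is assumed.

```python
from collections import Counter
from datetime import datetime

# Sample login events copied from the question.
events = [
    {"@timestamp": "2019-08-23T20:03:13.297608", "eventType": "login", "userId": "9784a0008cf2"},
    {"@timestamp": "2019-08-27T18:13:58.963763", "eventType": "login", "userId": "9784a0008cf2"},
    {"@timestamp": "2019-08-28T03:47:04.704077", "eventType": "login", "userId": "3b3be93b0751"},
    {"@timestamp": "2019-08-28T23:54:23.704586", "eventType": "login", "userId": "3b3be93b0751"},
]


def first_logins(events):
    """Pivot step: one entry per userId holding the earliest login timestamp."""
    first = {}
    for event in events:
        if event["eventType"] != "login":
            continue
        ts = datetime.strptime(event["@timestamp"], "%Y-%m-%dT%H:%M:%S.%f")
        uid = event["userId"]
        if uid not in first or ts < first[uid]:
            first[uid] = ts
    return first


def daily_histogram(first):
    """Histogram step: count users by the calendar day of their first login."""
    counts = Counter(ts.date().isoformat() for ts in first.values())
    return dict(sorted(counts.items()))


first = first_logins(events)
histogram = daily_histogram(first)
print(histogram)  # {'2019-08-23': 1, '2019-08-28': 1}
```

The entity-centric index plays the role of the `first` dictionary here: once there is one document per user, the histogram is an ordinary aggregation over a single field.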


Hi @Mark_Harwood,

I see. Thank you so much. Data frame transforms are exactly what I need, but they are a new feature in version 7.2+ and we are on 6.7. Is there a way to do this without data frame transforms, even if it's expensive? We need a working demo soon, before we can upgrade to 7.2.

Thank you so much.

Try this script-based approach
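The linked approach is not reproduced here. As a separate option that may work on 6.x, the sketch below pages through a composite aggregation (one bucket per userId with a min sub-aggregation on @timestamp) and builds the daily histogram on the client. Treat it as a sketch under assumptions: `search` is a hypothetical callable that sends a request body to your index and returns the response as a dict, `userId` and `eventType` are assumed to be keyword fields (use the `.keyword` sub-field otherwise), and `fake_search` returns a hand-written response used only to exercise the code.

```python
from collections import Counter
from datetime import datetime, timezone


def build_request(after_key=None, page_size=1000):
    """Search body: one composite bucket per userId with its earliest login."""
    composite = {
        "size": page_size,
        "sources": [{"userId": {"terms": {"field": "userId"}}}],
    }
    if after_key is not None:
        composite["after"] = after_key  # resume after the previous page
    return {
        "size": 0,
        "query": {"term": {"eventType": "login"}},
        "aggs": {
            "users": {
                "composite": composite,
                "aggs": {"first_login": {"min": {"field": "@timestamp"}}},
            }
        },
    }


def add_page(histogram, response):
    """Fold one response page into the histogram; return the next after_key or None."""
    users = response["aggregations"]["users"]
    for bucket in users["buckets"]:
        millis = bucket["first_login"]["value"]  # min on a date field is epoch millis
        day = datetime.fromtimestamp(millis / 1000, tz=timezone.utc).date().isoformat()
        histogram[day] += 1
    return users.get("after_key") if users["buckets"] else None


def first_login_histogram(search, page_size=1000):
    """Page through all users and count first logins per UTC day."""
    histogram = Counter()
    after_key = None
    while True:
        response = search(build_request(after_key, page_size))
        after_key = add_page(histogram, response)
        if after_key is None:
            return dict(sorted(histogram.items()))


def fake_search(body):
    """Hand-written stand-in for a real search call (assumption, not real output)."""
    if "after" in body["aggs"]["users"]["composite"]:
        return {"aggregations": {"users": {"buckets": []}}}
    return {"aggregations": {"users": {
        "after_key": {"userId": "9784a0008cf2"},
        "buckets": [
            {"key": {"userId": "3b3be93b0751"}, "doc_count": 2,
             "first_login": {"value": 1566964024704.0}},
            {"key": {"userId": "9784a0008cf2"}, "doc_count": 2,
             "first_login": {"value": 1566590593297.0}},
        ],
    }}}


result = first_login_histogram(fake_search)
print(result)
```

This still touches every user once, so it stays expensive when the cardinality of userId is high, and the bucketing happens in the client rather than in a date_histogram aggregation.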

