Hi all!
I have event logs from my system that describe the actions that users take. I am trying to devise the best method for storing those events in a user entity model.
For example, my event data looks like this:
{time: 2021-01-03, customerId: 1234-5678, action: "logged-in"}
{time: 2021-01-03, customerId: 1234-5678, action: "viewed-item", item-id: 222}
{time: 2021-01-07, customerId: 1234-5678, action: "logged-in"}
{time: 2021-01-07, customerId: 1234-5678, action: "viewed-item", item-id: 444}
{time: 2021-01-07, customerId: 1234-5678, action: "viewed-item", item-id: 555}
{time: 2021-01-11, customerId: 1234-5678, action: "logged-in"}
{time: 2021-01-11, customerId: 1234-5678, action: "viewed-item", item-id: 444}
And my desired output is something like this:
{
customerId: 1234-5678,
productsViewed: [{time: 2021-01-03, item-id: 222},{time: 2021-01-07, item-id: 444},{time: 2021-01-07, item-id: 555},{time: 2021-01-11, item-id: 444}],
logins: [{time: 2021-01-03},{time: 2021-01-07},{time: 2021-01-11}]
}
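To make the mapping concrete, here is a minimal Python sketch of the grouping I have in mind. It only illustrates the desired result outside of Elasticsearch and is not an Elastic stack solution. The function name `build_entities` is a placeholder I made up, and I am assuming the events arrive as plain dictionaries with string timestamps.

```python
from collections import defaultdict


def build_entities(events):
    """Group raw events into one entity document per customerId (illustrative only)."""
    entities = defaultdict(lambda: {"productsViewed": [], "logins": []})

    # Sort by time so the nested arrays come out in chronological order.
    # sorted() is stable, so events with the same date keep their input order.
    for event in sorted(events, key=lambda e: e["time"]):
        entity = entities[event["customerId"]]
        if event["action"] == "logged-in":
            # Keep only the abbreviated fields needed for a login.
            entity["logins"].append({"time": event["time"]})
        elif event["action"] == "viewed-item":
            entity["productsViewed"].append(
                {"time": event["time"], "item-id": event["item-id"]}
            )

    return [{"customerId": cid, **fields} for cid, fields in entities.items()]


# The sample events from above, written as Python dictionaries.
events = [
    {"time": "2021-01-03", "customerId": "1234-5678", "action": "logged-in"},
    {"time": "2021-01-03", "customerId": "1234-5678", "action": "viewed-item", "item-id": 222},
    {"time": "2021-01-07", "customerId": "1234-5678", "action": "logged-in"},
    {"time": "2021-01-07", "customerId": "1234-5678", "action": "viewed-item", "item-id": 444},
    {"time": "2021-01-07", "customerId": "1234-5678", "action": "viewed-item", "item-id": 555},
    {"time": "2021-01-11", "customerId": "1234-5678", "action": "logged-in"},
    {"time": "2021-01-11", "customerId": "1234-5678", "action": "viewed-item", "item-id": 444},
]

entities = build_entities(events)
```

What I am looking for is a way to get the same per-customer grouping, with the abbreviated nested arrays, maintained inside the Elastic stack as new events arrive.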
Is there a recommended approach using the Elastic stack to transform my event logs into entity-centric models that contain nested (and abbreviated) event data like this?
Elasticsearch transforms do not support aggregations that return hits/documents (e.g. top_hits), so I do not see a way to use transforms to output a nested array of abbreviated events. I do not believe ingest pipelines are the solution either, as they seem to modify individual documents rather than build entities from multiple documents.