Hi everyone, I would love to hear some tips on how to avoid data redundancy in ES and Kibana.
We are using ES and Kibana to collect and visualize videogame analytics. The game client indexes an "event" - a document with lots of fields like "EventTimestamp", "EventName", "UserID", "SessionID", "ScreenResolution" etc. - whenever the player performs a significant action such as launching the game, loading the gameplay, dying, or returning to the menu. All events up to the point of quitting the game are considered one "session", identified by a unique "SessionID".
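To make this concrete, here is a simplified sketch of what two events from the same session might look like. All values are made up for illustration, and our real documents have many more fields; the helper `repeated_fields` is hypothetical and only there to show which fields are identical in both events.

```python
# Two hypothetical events from one session; values are invented for illustration.
session_start = {
    "EventTimestamp": "2024-01-01T12:00:00Z",
    "EventName": "SessionStart",
    "UserID": "user-123",
    "SessionID": "session-abc",
    "ScreenResolution": "1920x1080",
    "GPU_Name": "Example GPU 1000",
}

game_crash = {
    "EventTimestamp": "2024-01-01T12:30:00Z",
    "EventName": "GameCrash",
    "UserID": "user-123",
    "SessionID": "session-abc",
    "ScreenResolution": "1920x1080",
    "GPU_Name": "Example GPU 1000",
}


def repeated_fields(first, second):
    """Return the names of fields that carry the same value in both events."""
    return sorted(
        name for name in first
        if name in second and first[name] == second[name]
    )


# Everything except the timestamp and the event name is duplicated.
print(repeated_fields(session_start, game_crash))
```

In this toy case four of the six fields are repeated; in our real events the proportion of session-constant fields is much higher.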
The problem is that there are dozens of fields (like "GPU_Name") whose values stay constant throughout the whole session, but we still send them with every event in the session. That significantly increases the size of each individual event document and makes the index grow quickly. It feels wasteful to spend storage and memory on that much redundant information.
But we send this information with every individual event so that we can use it in Kibana. If I want to visualize, for example, the number of crashes by "GPU_Name", it's very straightforward when the GPU field is on the "GameCrash" event document itself, but as far as I know it is very difficult or impossible when the GPU field is only on the "SessionStart" event document sent half an hour earlier.
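For reference, this is roughly the kind of request that the "crashes by GPU" case boils down to when the field is on the crash document. It is only a sketch of a search body with a terms aggregation, written as a Python dict; it assumes "EventName" and "GPU_Name" are mapped as keyword fields, and the function name, aggregation name and bucket count are placeholders of mine.

```python
import json


def crashes_by_gpu_query(max_buckets=20):
    """Build a search body that counts GameCrash events per GPU_Name.

    Assumes EventName and GPU_Name are keyword fields on the event document.
    """
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"term": {"EventName": "GameCrash"}},
        "aggs": {
            "crashes_by_gpu": {
                "terms": {"field": "GPU_Name", "size": max_buckets}
            }
        },
    }


# Print the body as JSON, e.g. to paste into the Kibana Dev Tools console.
print(json.dumps(crashes_by_gpu_query(), indent=2))
```

The filter and the grouping both operate on fields of the same document, which is exactly what stops working if "GPU_Name" exists only on the "SessionStart" event.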
So, my question is: should we just accept that redundant fields take up our storage space? Or is there some way to structure our data differently so that we avoid duplicating information across the events in a session, but can still use it in Kibana's visualizations?