Hello,
I am working with an Elasticsearch deployment that I use to monitor and search logs from a game server. The data includes a reputation/punishment system based on voting, used for example to kick a cheater from the game.
Example document:
{
  "id": "example-doc-id",
  "user_id": 7656xxxxxxxxxxx74,
  "display_name": "mr_cheater",
  "user_bio": "I am a user with a bad reputation.",
  "votes": {
    "positive": [
      7656xxxxxxxxxxx59,
      7656xxxxxxxxxxx71
    ],
    "negative": [
      7656xxxxxxxxxxx12,
      7656xxxxxxxxxxx63,
      7656xxxxxxxxxxx00,
      7656xxxxxxxxxxx84
    ]
  }
}
To clarify, each vote option holds an array of player IDs used by the game engine. A vote could have more options than just "positive" and "negative", which is why the options are nested under "votes" rather than stored as separate top-level fields. My concern is this: a theoretically unlimited number of participants can vote for a given option. The average casual player (the majority) may have one or two ratings on their profile, whereas competitive players will have many more.
I am aware that large documents gradually slow Elasticsearch down, and that the default HTTP request size limit (http.max_content_length) is 100MB, so I'm wondering how many votes a document could have before there is any noticeable effect on query speed. A few hundred? A few thousand? Or would we not see much difference until documents reach the megabyte range?
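For a rough sense of scale, here is a minimal Python sketch that estimates the JSON size of a profile document for a given number of votes. It assumes 17-digit numeric player IDs like the ones in the example; `build_profile`, `source_size_bytes`, and the placeholder `base_id` are hypothetical names made up for illustration. It only measures the size of the serialized document, not index size or query latency, so it is a starting point for a benchmark rather than an answer.

```python
import json


def build_profile(num_positive, num_negative):
    """Build a document shaped like the example, filled with placeholder voter IDs."""
    base_id = 76560000000000000  # hypothetical 17-digit ID, not a real account
    return {
        "id": "example-doc-id",
        "user_id": base_id,
        "display_name": "mr_cheater",
        "user_bio": "I am a user with a bad reputation.",
        "votes": {
            "positive": [base_id + 1 + i for i in range(num_positive)],
            "negative": [base_id + 1 + num_positive + i for i in range(num_negative)],
        },
    }


def source_size_bytes(doc):
    """Byte length of the compact JSON encoding of the document."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


# Print the serialized size for a few vote counts, split evenly between options.
for total in (10, 1_000, 100_000):
    doc = build_profile(total // 2, total - total // 2)
    print(f"{total:>7} votes -> {source_size_bytes(doc):>9} bytes")
```

Under these assumptions each additional vote costs 18 bytes of JSON (17 digits plus a comma), so a few thousand votes stay in the tens of kilobytes and it takes on the order of tens of thousands of votes to reach a megabyte.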
For reference, our cluster has about 12GB of RAM currently, which we are able to scale up. I'd like to get some opinions on whether this is good/bad practice before implementing it in production.
Thank you in advance for your assistance,
Matthias