How to denormalize an array with a huge number of values in Elasticsearch?


We're building an app that uses Elasticsearch. Each user will have followers (an array of longs: the ids of the users who follow this user). Over time the number of followers might grow to hundreds of thousands, so I guess that simply storing them in a followers array ("followers": [1,2,3,...]) will slow the system down significantly and may cause memory issues, perhaps during replication.

What is the best way to handle this?

Is it a good approach to create a new index named follow, where each follow document would have the format

   {
     "follower_id": 1,
     "following_id": 2
   }

I would need to get followers_count, and in some cases the followers themselves, when I fetch a user profile. followers_count could be added as a property in the users index, but again, is that a good approach?
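For illustration, the one-document-per-follow idea might look roughly like the sketch below. It only builds request bodies as plain Python dicts and does not talk to a cluster. The helper names (follow_doc_id, follow_doc, followers_count_query) and the "follower-following" id scheme are assumptions made for this example, not something Elasticsearch prescribes; the index and field names come from the format above.

```python
import json

FOLLOW_INDEX = "follow"  # index name proposed in the question


def follow_doc_id(follower_id: int, following_id: int) -> str:
    # Hypothetical id scheme: a deterministic _id means indexing the same
    # follow relation twice overwrites the document instead of duplicating it.
    return f"{follower_id}-{following_id}"


def follow_doc(follower_id: int, following_id: int) -> dict:
    # One small document per follow relation, as in the format above.
    return {"follower_id": follower_id, "following_id": following_id}


def followers_count_query(user_id: int) -> dict:
    # Request body for the _count API on the follow index:
    # counts the documents whose following_id equals user_id.
    return {"query": {"term": {"following_id": user_id}}}


if __name__ == "__main__":
    # User 1 follows user 2.
    print(follow_doc_id(1, 2), json.dumps(follow_doc(1, 2)))
    # Body that would be sent to /follow/_count to count user 2's followers.
    print(json.dumps(followers_count_query(2)))
```

With this layout a follow or unfollow touches a single small document, and the follower count is obtained by counting matching documents rather than by reading a huge array.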

Thanks in advance


Constantly updating a doc containing thousands of ids is not going to be handled efficiently: every update makes Elasticsearch reindex the whole document, however small the change.
Your multiple-doc approach will be more efficient at write time but could make certain types of analysis harder at query time, e.g. sorting people by number of followers.
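To make the query-time trade-off concrete, here is a hedged sketch of two ways the "sort people by number of followers" case could be approached. It only builds request bodies as Python dicts. The function names and the aggregation name top_followed are made up for this example, and the second helper assumes the users documents already carry a numeric followers_count field initialised to 0.

```python
import json


def top_followed_query(size: int = 10) -> dict:
    # Search body for the follow index: a terms aggregation buckets the
    # follow documents by following_id. Buckets come back ordered by
    # document count (descending) by default; on a multi-shard index
    # those counts can be approximate.
    return {
        "size": 0,  # only the aggregation is wanted, not the hits
        "aggs": {
            "top_followed": {
                "terms": {"field": "following_id", "size": size}
            }
        },
    }


def followers_count_update(delta: int) -> dict:
    # Body for the update API on a users document: a Painless script that
    # adjusts a denormalised counter (+1 on follow, -1 on unfollow).
    # Assumes followers_count already exists on the document.
    return {
        "script": {
            "lang": "painless",
            "source": "ctx._source.followers_count += params.delta",
            "params": {"delta": delta},
        }
    }


if __name__ == "__main__":
    print(json.dumps(top_followed_query(5)))
    print(json.dumps(followers_count_update(1)))
```

The aggregation keeps writes cheap but does the counting at query time. The denormalised counter makes sorting a plain sort on users.followers_count, at the cost of one extra small update per follow or unfollow.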