How to denormalize an array with a huge number of values in Elasticsearch?

Hi,

We're building an app that uses Elasticsearch. Each user will have followers (an array of longs: the ids of the users who follow them). Over time the number of followers might grow to hundreds of thousands, so I guess that simply storing them in a followers array ("followers": [1,2,3,...]) will slow the system down significantly and maybe cause memory issues on replication.

What is the best way to handle this?

Is it a good approach to create a new index named follow, where each follow document would have the format

{
   "follower_id": 1,
   "following_id" : 2
}
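
Something like this is what I have in mind for the mapping, just a sketch on my part (assuming the typeless mapping syntax of Elasticsearch 7+; the field names are the ones from my example above):

PUT follow
{
   "mappings": {
      "properties": {
         "follower_id": { "type": "long" },
         "following_id": { "type": "long" }
      }
   }
}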

I would need to get followers_count, and in some cases the actual followers, when fetching a user's profile. followers_count could be added as a property on the users index, but again, is that a good approach?
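
For followers_count, I assume I could also compute it on demand from the follow index with the _count API. A sketch, counting the followers of the user with id 2 from my example:

GET follow/_count
{
   "query": {
      "term": { "following_id": 2 }
   }
}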

Thanks in advance

Anyone?

Constantly updating a doc containing thousands of ids is not going to be handled efficiently.
Your multiple-doc approach will be more efficient at write time but could make certain types of analysis harder at query time, e.g. sorting people by number of followers (see the sketch below).
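
For example, ranking users by follower count over the follow docs would take a terms aggregation, roughly like this sketch (the index and field names are assumed from the post above; terms aggregations order buckets by doc count by default, and counts can be approximate across shards):

GET follow/_search
{
   "size": 0,
   "aggs": {
      "most_followed": {
         "terms": { "field": "following_id", "size": 10 }
      }
   }
}

Each bucket key is a following_id and its doc_count is that user's follower count, so this is the kind of query you'd reach for instead of a simple sort on a stored field.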
