Modelling Social search to work at scale

My question is mainly about how to model my data so that it can satisfy all kinds of queries at scale.
I am trying to model search that takes social factors into consideration.

Example use case:

1 - A user can post a status with a #hashtag.
2 - A user can follow another user.
3 - A user can like/comment on a status.

I would like to answer queries such as:

1 - Find all posts containing a hashtag starting with #health (#health, #healthIsLife, i.e. all tags with the prefix #health). In this search, posts that are liked or commented on by users I follow should come on top or have a better score.

2 - Find all users whose name starts with Steve. Users who are followed by people I follow should come on top of the search results.

3 - All users whom a user follows.

4 - All users following a user.

I tried modelling the above with a parent-child relationship. However, it did not cover everything.

Creating a new user.

tags:[fitness , health , Illionios]
location: xyz

tags:[GymFreak ,Austin]
location: abc

When a user follows another user.

With the above, however, I am confused: if user2 follows other users and they lie on the same shard, how can this be achieved, given that a child cannot have multiple parents? And if I give every child a unique ID on insert instead of the userID, then I cannot answer my #4.
Also, when searching for #2, I can get the users but not the attributes (which of my friends follow them), since I cannot get child attributes in a has_child query.
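
On the child-attributes point, one option that may be worth checking is the `inner_hits` option of the `has_child` query, which returns the matching child documents alongside each parent hit. The sketch below only builds such a request body for query #2; the child type name `follower`, the fields `name` and `follower_id`, and the helper `users_by_prefix_query` are assumptions made for illustration, not names from the mapping above.

```python
from typing import Any, Dict, List


def users_by_prefix_query(name_prefix: str, my_followees: List[str]) -> Dict[str, Any]:
    """Build a search body for query #2 (hypothetical field and type names)."""
    return {
        "query": {
            "bool": {
                # Assumes `name` is indexed lowercased and not tokenised.
                "must": [{"prefix": {"name": name_prefix.lower()}}],
                # Optional clause: parents with matching children score higher.
                "should": [
                    {
                        "has_child": {
                            "type": "follower",  # assumed child type
                            # Add up the scores of the matching children.
                            "score_mode": "sum",
                            "query": {"terms": {"follower_id": my_followees}},
                            # Return the matching child documents per parent.
                            "inner_hits": {},
                        }
                    }
                ],
            }
        }
    }


body = users_by_prefix_query("Steve", ["user7", "user9"])
```

With `score_mode` set to `sum`, a user followed by more of the people I follow should rank higher, and the `inner_hits` section of each hit should list which of them matched. This does not remove the cost of keeping millions of child documents under one parent.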

How can I achieve my 1-4 efficiently with the above modelling?

Note - A user can be followed by millions of users and can follow several thousand users.

Similarly, a post can be liked by millions of users if it is a celebrity post.

You won't be able to do all of this in a single document structure. Look to set up an event-driven format for tweets, then another one for users and their relationships.
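
One way to read this suggestion, as a rough sketch rather than a tested design: store each follow as its own small document in a separate relationships index, so that no user document grows with its follower count. The field names `follower_id` and `followee_id`, the ID scheme, and the helper functions below are assumptions made for illustration; they only build documents and request bodies and do not talk to a cluster.

```python
from typing import Any, Dict


def follow_edge(follower_id: str, followee_id: str) -> Dict[str, Any]:
    """Build one relationship document: one document per follow edge."""
    return {
        # Deterministic ID, so indexing the same follow twice overwrites it.
        "_id": f"{follower_id}:{followee_id}",
        "_source": {"follower_id": follower_id, "followee_id": followee_id},
    }


def following_query(user_id: str) -> Dict[str, Any]:
    """Query #3: every edge where user_id is the follower."""
    return {"query": {"bool": {"filter": [{"term": {"follower_id": user_id}}]}}}


def followers_query(user_id: str) -> Dict[str, Any]:
    """Query #4: every edge where user_id is the one being followed."""
    return {"query": {"bool": {"filter": [{"term": {"followee_id": user_id}}]}}}


edge = follow_edge("user2", "user1")   # user2 follows user1
q3 = following_query("user2")          # whom does user2 follow?
q4 = followers_query("user1")          # who follows user1?
```

Because each edge is an independent document, a user can follow any number of others without the single-parent restriction, and queries #3 and #4 become plain filters on one field each. The ranking parts of #1 and #2 would still need a second step, for example fetching the IDs of the people I follow first and passing them into the search.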

Check out for an example.
