Modelling Social search to work at scale

jalaj · November 18, 2017, 3:50pm

My question is mostly about how to model my data so that it can serve all of the queries below at scale.
I am trying to model search that takes social factors into account.

Example use-case:-

1 - A user can post a status with a #hashtag.
2 - A user can follow another user.
3 - A user can like/comment on a status.

I would like to answer queries such as:

1 - Find all posts containing a hashtag starting with #health (#health, #healthIsLife, i.e. all tags with the prefix #health). In this search, posts that are liked or commented on by users I follow should come out on top or get a better score.

2 - Find all users whose name starts with Steve. Users who are followed by people I follow should come at the top of the search results.

3 - All users whom a given user follows.

4 - All users following a user.

I tried modelling the above with a parent-child relationship, but could not achieve everything I need.

Creating a new user.

PUT /users/user/user1
{
  "user_id": "user1",
  "tags": ["fitness", "health", "Illionios"],
  "location": "xyz"
}

PUT /users/user/user2
{
  "user_id": "user2",
  "tags": ["GymFreak", "Austin"],
  "location": "abc"
}

When a user follows another user:

PUT /users/follow/user2?parent=user1
{
  "user_id": "user2"
}

With the above, however, I am confused: if user2 follows other users and they lie on the same shard, how would this work, given that a child cannot have multiple parents? And if I give every child a unique ID instead of the user ID, then I cannot answer query 4.
Also, when searching for query 2, I can get the users but not the attributes (which of my friends follow them), since I cannot get child attributes in a has_child query.

How can I achieve queries 1-4 efficiently with the above modelling?

Note - A user can be followed by millions of users and can follow several thousand users.

Similarly, a post can be liked by millions of users if it is a celebrity post.

warkolm · November 20, 2017, 9:22am

You won't be able to do all of this in a single document structure. Look to set up an event-driven format for tweets, then another one for users and their relationships.
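
As a rough sketch of what the relationship side could look like, assuming one flat document per follow relationship rather than a parent-child pair: the field names (`follower_id`, `followee_id`, `followed_at`) and the ID scheme are illustrative assumptions, not anything Elasticsearch requires, and both ID fields are assumed to be mapped as `keyword`. The helpers only build plain request bodies, so no client library is needed to read them.

```python
# Hypothetical sketch: one flat "follow" document per (follower, followee) pair.
# Field names and the ID scheme are assumptions made for illustration.

def follow_event(follower_id, followee_id, followed_at):
    """Return (doc_id, body) for a single follow relationship."""
    # Deterministic ID: indexing the same pair twice overwrites instead of duplicating.
    doc_id = f"{follower_id}->{followee_id}"
    body = {
        "follower_id": follower_id,
        "followee_id": followee_id,
        "followed_at": followed_at,
    }
    return doc_id, body


def following_query(user_id):
    """Query 3: all users that `user_id` follows."""
    return {"query": {"bool": {"filter": [{"term": {"follower_id": user_id}}]}}}


def followers_query(user_id):
    """Query 4: all users following `user_id`."""
    return {"query": {"bool": {"filter": [{"term": {"followee_id": user_id}}]}}}


if __name__ == "__main__":
    doc_id, body = follow_event("user1", "user2", "2017-11-18T15:50:00Z")
    print(doc_id, body)
    print(following_query("user1"))
    print(followers_query("user2"))
```

Because each relationship is its own document, a user can appear as a followee any number of times, which sidesteps the single-parent restriction. For accounts with millions of followers the result set would still need to be paged rather than fetched in one request.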

Check out https://www.elastic.co/elasticon/2015/sf/building-entity-centric-indexes for an example.
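
A minimal sketch of the general idea, under stated assumptions: like/comment events are folded into one summary per post, and the search then boosts posts that your followees engaged with. The event shape (`type`, `post_id`, `actor_id`), the `engaged_by` and `hashtags` fields, and the boost value are all hypothetical; hashtags are assumed to be indexed lowercased, without the `#`, in a `keyword` field. This is one possible reading of the approach, not the method from the talk.

```python
from collections import defaultdict

# Hypothetical sketch: fold engagement events into a per-post summary,
# then build a query body that boosts posts engaged with by followees.


def fold_events(events):
    """Fold like/comment events into one summary per post."""
    engaged = defaultdict(set)
    for event in events:
        if event["type"] in ("like", "comment"):
            engaged[event["post_id"]].add(event["actor_id"])
    # Sorted lists keep the output deterministic and JSON-serialisable.
    return {post_id: {"engaged_by": sorted(actors)} for post_id, actors in engaged.items()}


def hashtag_prefix_query(prefix, followee_ids, boost=2.0):
    """Query 1: posts with a hashtag starting with `prefix`, boosted by followee engagement."""
    return {
        "query": {
            "bool": {
                # Required: the hashtag must start with the normalised prefix.
                "must": [{"prefix": {"hashtags": prefix.lstrip("#").lower()}}],
                # Optional: matching here only raises the score.
                "should": [{"terms": {"engaged_by": list(followee_ids), "boost": boost}}],
            }
        }
    }


if __name__ == "__main__":
    events = [
        {"type": "like", "post_id": "p1", "actor_id": "user2"},
        {"type": "comment", "post_id": "p1", "actor_id": "user3"},
    ]
    print(fold_events(events))
    print(hashtag_prefix_query("#health", ["user2"]))
```

The sketch ignores the celebrity case: a post liked by millions of users would carry a very large `engaged_by` array, so a real design would need to cap or restructure that field.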
