Best design for managing 'likes' on each comment?

Hi, I'm new to Elasticsearch and struggling with the design.

I have two indices. One is the 'comment' index, which contains comments written by users; the other is 'ex_comment', which also contains comments, but from crawled data.

Here is what I want to build.
As with Facebook's 'like', I want to show how many users liked each comment.
Since a user should also be able to see whether they have already liked a given comment,
just adding a 'counter' field is not enough, so I thought there should be mapping tables as below.

(indices)
like_comment, like_ex_comment

(fields - same for both)
"comment_id" // id from parent index
"user_id" // user who hits like on the comment

These two indices would be children of their respective parent indices.

I've heard that a child in Elasticsearch can't have multiple parents, so I thought I'd need two indices, one for each parent.

And I'll be using the 'has_child' query if I decide to go with this design.

I'm not sure if this is the best design for managing 'likes'. I'm also worried about performance, since the 'like' action will be executed frequently.

Thank you.

Hi Jenny,
Much depends on the types of query you want to do.
If you only need to know if a particular user has liked a particular comment (e.g. to colour a comment's heart icon accordingly when rendering a web page) then you only need a regular index with comment_id and user_id and query that alone.
If you want to get fancy and query properties of both comments and users in the same request (e.g. find users from London who liked comments about cheese) then you'll need something more: either denormalising the data or using parent/child, which routes related data to the same machines.

Best to consider the questions you want to ask of the data up-front.
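
To make the first option concrete, here is a minimal sketch of what one "like" document and the "has this user liked this comment?" lookup could look like. The index name `likes`, the `comment_id:user_id` document ID scheme and the helper names are assumptions for illustration, not something taken from the thread. The request bodies are plain Python dicts in Elasticsearch query DSL, to be sent with whatever client you use.

```python
import json

LIKES_INDEX = "likes"  # assumed index name, not from the thread


def like_doc_id(comment_id: str, user_id: str) -> str:
    """Deterministic _id (an assumed scheme): one doc per (comment, user) pair."""
    return f"{comment_id}:{user_id}"


def like_doc(comment_id: str, user_id: str) -> dict:
    """Body of one 'like' document; both fields are assumed to be keyword fields."""
    return {"comment_id": comment_id, "user_id": user_id}


def has_liked_query(comment_id: str, user_id: str) -> dict:
    """Search body matching only like docs for this (comment, user) pair."""
    return {
        "size": 0,  # only the hit count matters, not the documents
        "query": {
            "bool": {
                "filter": [
                    {"term": {"comment_id": comment_id}},
                    {"term": {"user_id": user_id}},
                ]
            }
        },
    }


if __name__ == "__main__":
    print(like_doc_id("c42", "jenny"))
    print(json.dumps(has_liked_query("c42", "jenny"), indent=2))
```

With an ID built this way, indexing the same like twice overwrites the existing document instead of creating a duplicate, and an "unlike" becomes a delete by that same ID.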

Hi Mark, thanks for the quick reply.

I want to show a list of comments, each with the number of users who have liked it, as below.

:heart: 999 like this comment (empty heart)
:heart: 999 like this comment (full heart if the user has liked this comment)

For now, I only need to get how many likes each comment has and whether the current user has liked it or not.

So is it okay to make just one index (comment_id, user_id) and calculate the sum for every single comment?

Since one page has 10 rows (comments), I think calculating the sums from the newly created index would not be so costly.

Thank you.

True. I put an example in this gist
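
The gist itself is not reproduced here. As a rough sketch of the kind of request that covers both needs for one page of comments (like count per comment, plus whether the current user liked it), something along these lines should work. The aggregation names `by_comment` and `liked_by_me`, the helper names and the sample response are all made up for illustration; only `comment_id` and `user_id` come from the thread.

```python
from typing import Dict, List


def page_likes_query(comment_ids: List[str], current_user_id: str) -> dict:
    """One search body for the likes index covering a whole page of comments."""
    return {
        "size": 0,  # we only want aggregation results
        "query": {"terms": {"comment_id": comment_ids}},
        "aggs": {
            "by_comment": {
                # one bucket per comment; doc_count is the number of likes
                "terms": {"field": "comment_id", "size": max(len(comment_ids), 1)},
                "aggs": {
                    # within each bucket, count likes by the current user (0 or 1)
                    "liked_by_me": {"filter": {"term": {"user_id": current_user_id}}}
                },
            }
        },
    }


def parse_page_likes(response: dict, comment_ids: List[str]) -> Dict[str, dict]:
    """Turn the aggregation buckets into {comment_id: {likes, liked_by_me}}."""
    buckets = response["aggregations"]["by_comment"]["buckets"]
    by_id = {b["key"]: b for b in buckets}
    result = {}
    for cid in comment_ids:
        bucket = by_id.get(cid)  # comments with no likes have no bucket
        result[cid] = {
            "likes": bucket["doc_count"] if bucket else 0,
            "liked_by_me": bool(bucket and bucket["liked_by_me"]["doc_count"] > 0),
        }
    return result


# Hand-written sample in the shape of an aggregation response (not real output).
sample_response = {
    "aggregations": {
        "by_comment": {
            "buckets": [
                {"key": "c1", "doc_count": 3, "liked_by_me": {"doc_count": 1}},
                {"key": "c2", "doc_count": 1, "liked_by_me": {"doc_count": 0}},
            ]
        }
    }
}
page = parse_page_likes(sample_response, ["c1", "c2", "c3"])
```

Here `page["c1"]` would drive a full heart with a count of 3, while `c3`, which has no like docs at all, falls back to 0 and an empty heart.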

I really appreciate that you wrote an example script which covers cases I've mentioned perfectly. I got a lot out of it.
Thank you :slight_smile:

No problem. Glad it was useful.
Sometimes it's the simple things that are hard. Makes me marvel at how things like Twitter scale, counting likes for all tweets and remembering which tweets we liked, going back years.

So true. Understanding the architecture of high-scale systems is marvelous. Anyway, thank you again, and I hope you stay safe and well.

Hi, thanks to the design you suggested for comment likes, I was able to build it exactly as I expected.

But recently I've got stuck on another problem, and it's hard for me to figure out.
User comments also need to be shown in descending order of likes.

As you suggested earlier, I made 'likes_index' to save the user likes.

I queried 'comment_index/_search' and 'likes_index/_search' separately, and then merged the results to show them as below (fetching 10 rows per page).

Comment1
OO liked this comment

Comment2
OO liked this comment
..

But how can I get the results in descending order of likes? It's quite easy with an RDB, but I can't work out how to do it here.

I'm not sure whether I should think of another design or whether it can be solved with the current one.
When I googled it, I read about 'denormalization', but I'm unsure whether I need it here.

If you don't mind, could you give me some ideas for solving this problem?

Thank you.

That should happen naturally in the example I gave. Terms are sorted by doc_count descending, so the more "like" docs a comment has in the likes index, the higher it appears in the array of top comment terms in the results. I amended my gist with extra like docs to show this effect.
It's also possible to sort aggregation terms by other things like max date (to get the latest first) but the default sort order of number of docs should be working in your favour here.
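
As a small sketch of the two orderings just described: a terms aggregation with no `order` falls back to its default of doc_count descending, while ordering by a `max` sub-aggregation puts the most recently liked comments first. The date field `liked_at` and the aggregation names `top_comments` and `latest_like` are hypothetical; the thread only mentions `comment_id` and `user_id`.

```python
def top_comments_agg(order_by_latest: bool = False, size: int = 10) -> dict:
    """Search body for the likes index returning the top comment_id terms."""
    terms = {"field": "comment_id", "size": size}
    top_comments = {"terms": terms}
    if order_by_latest:
        # Order buckets by the newest like in each bucket instead of by count.
        # 'liked_at' is an assumed date field on each like document.
        terms["order"] = {"latest_like": "desc"}
        top_comments["aggs"] = {"latest_like": {"max": {"field": "liked_at"}}}
    return {"size": 0, "aggs": {"top_comments": top_comments}}


most_liked_first = top_comments_agg()
latest_liked_first = top_comments_agg(order_by_latest=True)
```

In both cases the buckets come back already sorted, so the application only has to walk the array in order.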

As far as I understand, sorting aggregation terms is only possible with fields of 'likes_index'. Is that right?
Because I've heard that I cannot do joins in Elasticsearch, and what I might need is to sort using fields from the comment index.

For now, I only need to search on the 'title' of comments.
e.g. a user searches for 'love' --> fetch the matching comments, in descending order of likes.

So I think just adding a 'title' field to likes_index could be a possible answer, so that I can search for comments that include specific keywords.
I guess this is the best way I can think of, but please tell me if there are better options.
And in the near future, I might need more features, such as the Elasticsearch query score (on comment_index).
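
For reference, the idea above could be sketched roughly as follows, assuming every document in likes_index carries a copy of the comment's 'title' (that is the denormalisation step) and assuming each comment's `_id` in comment_index equals its `comment_id`. The helper names and the aggregation name `most_liked` are made up for illustration. This is only one possible shape, not a recommendation from the thread.

```python
from typing import List


def search_liked_titles(keyword: str, page_size: int = 10) -> dict:
    """Body for likes_index/_search: comment ids whose title matches, most liked first."""
    return {
        "size": 0,
        "query": {"match": {"title": keyword}},  # 'title' copied onto each like doc
        "aggs": {"most_liked": {"terms": {"field": "comment_id", "size": page_size}}},
    }


def comments_by_ids(comment_ids: List[str]) -> dict:
    """Body for comment_index/_search: fetch the full comments for those ids."""
    return {"size": len(comment_ids), "query": {"ids": {"values": comment_ids}}}


def in_like_order(buckets: List[dict], hits: List[dict]) -> List[dict]:
    """Merge the two results, keeping the bucket order (likes descending)."""
    by_id = {h["_id"]: h["_source"] for h in hits}
    return [
        {"comment_id": b["key"], "likes": b["doc_count"], **by_id[b["key"]]}
        for b in buckets
        if b["key"] in by_id
    ]


# Hand-written sample data in the shape of real responses (not real output).
sample_buckets = [{"key": "c2", "doc_count": 5}, {"key": "c1", "doc_count": 2}]
sample_hits = [
    {"_id": "c1", "_source": {"title": "love it"}},
    {"_id": "c2", "_source": {"title": "love this"}},
]
merged = in_like_order(sample_buckets, sample_hits)
```

One consequence of searching likes_index this way is that a comment nobody has liked yet has no documents there, so it would not show up in the results at all.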