I would like to understand the numbers around how costly indexing is when there are updates. Are there any numbers that can be shared, based on your experience?
Specifically, let's say I have a blog posts index containing blog post documents, a comments index containing comments on those posts, and a likes index containing likes for comments and posts. In this case, both comments and likes are likely to be updated often; the blog posts themselves would not change.
Is the frequent liking/unliking of a comment or post, or the addition of new comments, a concern, given that each causes an update to ES and hence a reindex of the document? For now, say I have only 100,000 posts, 10,000 comments, and 10,000 likes, with no more than 500-1000 concurrent users.
Also, at what point should I consider alternatives, such as storing some of these fields in a relational database?
Check out Rally to benchmark your server and simulate your load and data.
You can also use Redis to absorb like/unlike churn before it reaches Elasticsearch. When somebody likes something, you store the like and its timestamp in Redis (or another cache tool); if they unlike it soon afterwards (less than x seconds), you know both timestamps, so the two actions cancel out and you never write to Elasticsearch at all. The cache acts as a buffer before you set the definitive data in Elasticsearch. For me, though, this is an edge case: your 500 users will not be liking and unliking frenetically all day long.
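To make the buffering idea concrete, here is a minimal sketch of that like/unlike buffer. It is illustrative only: a plain dict stands in for Redis, the grace period and function names are assumptions, and in production you would use redis-py with expiring keys instead.

```python
import time

# Sketch of the "like buffer": hold a like in a cache for GRACE_SECONDS
# before flushing it to Elasticsearch. If the user unlikes within the
# grace period, the like and unlike cancel out and Elasticsearch never
# sees either write. A dict stands in for Redis here (hypothetical names).

GRACE_SECONDS = 30

buffer = {}    # (user_id, post_id) -> timestamp of the buffered like
flushed = []   # writes that actually reached Elasticsearch

def like(user_id, post_id, now=None):
    buffer[(user_id, post_id)] = now if now is not None else time.time()

def unlike(user_id, post_id, now=None):
    now = now if now is not None else time.time()
    ts = buffer.pop((user_id, post_id), None)
    if ts is None or now - ts >= GRACE_SECONDS:
        # The like was already flushed (or never buffered): issue a delete.
        flushed.append(("delete_like", user_id, post_id))
    # else: like + unlike cancelled inside the grace period, no ES write.

def flush(now=None):
    """Move likes older than the grace period into Elasticsearch."""
    now = now if now is not None else time.time()
    for key, ts in list(buffer.items()):
        if now - ts >= GRACE_SECONDS:
            user_id, post_id = key
            flushed.append(("index_like", user_id, post_id))
            del buffer[key]

# A quick unlike within the grace period never touches Elasticsearch:
like("alice", "post-1", now=0)
unlike("alice", "post-1", now=5)
flush(now=60)
print(flushed)  # → []
```

The point is that the write amplification from flappy likes lands on the cache, which is cheap, rather than on the Lucene segments behind your index.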
If you store your data like logs (as explained in the other topic), you don't need relations, joins, or updates.
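A short sketch of what "store it like logs" means here, with illustrative field names: instead of updating a counter on the post document, you index one immutable event document per action, and the current like count is just an aggregation over those events.

```python
from collections import defaultdict

# Append-only event documents, as you might index them into Elasticsearch.
# Nothing is ever updated or deleted; an unlike is just another event.
events = [
    {"action": "like",   "user": "alice", "post": "post-1", "ts": 1},
    {"action": "like",   "user": "bob",   "post": "post-1", "ts": 2},
    {"action": "unlike", "user": "alice", "post": "post-1", "ts": 3},
]

def like_counts(events):
    """Net likes per post (likes minus unlikes), akin to a terms aggregation."""
    counts = defaultdict(int)
    for e in events:
        counts[e["post"]] += 1 if e["action"] == "like" else -1
    return dict(counts)

print(like_counts(events))  # → {'post-1': 1}
```

In Elasticsearch itself you would get the same number with a terms aggregation over `post`, so all writes stay append-only.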
Hope it helps.
Thanks for your response. It is super helpful. I will check out the benchmark tool and try that.
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.