How can ElasticSearch be used to implement a social search?


(Clay Wardell) #1

I’m trying to create a business search with social features using
ElasticSearch. I have a business directory, and users can interact
with those businesses in different ways: by reviewing them, checking
into them, etc.

When a user searches for a business, I'd like to be able to show them
the businesses that their friends have interacted with at the top of
the results (or filter based on those interactions). What's the best
way to set up my index to achieve this?

I can think of a few possible solutions, but I'm a beginner with ES
and I'm not sure what will cause problems:

I could use multi-tenancy and create a separate index for each user.
I've ruled this out because the number of users is much greater than
the number of businesses or the amount of user-specific content.

I could add a list of user/score pairs to each indexed business. Every
user who has interacted with the business would be in there, and the
score would represent the amount of interaction they'd had with the
business (this is good enough for my filtering/sorting purposes).
Every time they interact with the business, I would update the score
in the index. The problem with this is that I only care about my
friends' activity, so I would need to figure out some way to take into
account who my friends are when creating a composite score for the
business. I don't know how to do this in ES.

I could create a similar scheme, but instead of keeping score of my
interactions with a business, the score would reflect my friends'
interactions with the business. This takes away the need to model my
social graph in ElasticSearch, but it does mean that any time a person
interacts with a business, I would need to update all of their
friends' scores. It would also mean that the list of user/score pairs
for each business would be larger, since it'll need to include anybody
who has a friend who has interacted with the business.

The final solution I can think of is to keep track of every individual
interaction that happens to a business, and add it to the business’s
document in ES. This doesn’t seem realistic to me, since it combines the
problems from the other solutions. But it’s probably the most
straightforward approach in terms of keeping the index up to date.
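
To make the trade-off between the second and third options concrete, here is a minimal sketch in plain Python (no ElasticSearch involved). The sample data and the names `friends`, `interactions`, `own_scores`, `friend_scores` and `query_time_score` are made up for illustration. It assumes friendships are symmetric and that every interaction adds 1 to a score.

```python
from collections import defaultdict

# Hypothetical sample data: a symmetric friend graph and a log of interactions.
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann"},
    "cat": {"ann"},
}
interactions = [("bob", "cafe"), ("bob", "cafe"), ("cat", "diner"), ("ann", "bar")]

# Second option: store one score per (business, user) and combine at query time.
own_scores = defaultdict(lambda: defaultdict(int))
for user, business in interactions:
    own_scores[business][user] += 1

def query_time_score(viewer, business):
    """Sum the stored scores of the viewer's friends for one business."""
    return sum(own_scores[business].get(f, 0) for f in friends[viewer])

# Third option: fan out at write time, so every interaction updates the
# entry of each friend of the user who interacted.
friend_scores = defaultdict(lambda: defaultdict(int))
for user, business in interactions:
    for f in friends[user]:
        friend_scores[business][f] += 1

# Both schemes yield the same per-viewer score; they differ in where the work happens.
print(query_time_score("ann", "cafe"), friend_scores["cafe"]["ann"])
```

The second option does one write per interaction but needs the friend list at query time. The third option reads a single precomputed entry, but does one write per friend for every interaction.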

Thanks for your help!


(Shay Banon) #2

Heya,

This one is a bit tricky. Let's say that you only want to list businesses
that the user's friends interacted with. A simple option is to keep, on each
business, a list of the ids of users who interacted with it. Then, when
searching, go to the relevant user document, fetch that user's friends, and
filter the businesses (with a terms filter, for example) on those friends.
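
The search request for this option could be sketched as follows. It only builds the JSON body as a Python dict and does not call any client. The field names `name` and `interacted_user_ids` are assumptions, and `friend_ids` is assumed to have been read from the user document beforehand. The body uses the bool query with a `filter` clause and a terms query, which is how a terms filter is expressed in recent ElasticSearch versions; the exact syntax available when this thread was written may differ.

```python
import json

def friends_filter_query(search_text, friend_ids):
    """Build a search body that keeps only businesses a friend interacted with."""
    return {
        "query": {
            "bool": {
                # Full-text part of the search (hypothetical "name" field).
                "must": {"match": {"name": search_text}},
                # Non-scoring restriction: at least one friend id must appear
                # in the business's list of interacting users (hypothetical field).
                "filter": {"terms": {"interacted_user_ids": friend_ids}},
            }
        }
    }

body = friends_filter_query("pizza", ["u17", "u42"])
print(json.dumps(body, indent=2))
```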

One of the main questions is how changes to the list of friends are handled.

Storing the user ids on the business document means reindexing the business
doc each time there is a new interaction. It can work well, depending on the
frequency of interactions. Another option is to model each interaction as a
child document of the business, so each new interaction is indexed as a new
child document, and then use has_child to filter businesses based on that.
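
A has_child variant of the request might look like the sketch below. Again it only builds the request body as a Python dict. The child type name `interaction` and the fields `name` and `user_id` are assumptions, and the parent/child relation has to be declared in the mapping first (how that is done depends on the ElasticSearch version).

```python
import json

def friends_has_child_query(search_text, friend_ids):
    """Build a search body that keeps businesses with a child interaction by a friend."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"name": search_text}},
                "filter": {
                    "has_child": {
                        # Hypothetical child type holding one document per interaction.
                        "type": "interaction",
                        # A child matches when it was created by one of the friends.
                        "query": {"terms": {"user_id": friend_ids}},
                    }
                },
            }
        }
    }

body = friends_has_child_query("pizza", ["u17", "u42"])
print(json.dumps(body, indent=2))
```

With this layout a new interaction is a small new document, and the business document itself is not reindexed.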



(Clay Wardell) #3

Shay,

Thanks for getting back to me. Both of your solutions seem solid for
filtering on whether or not my friends have interacted with a
business. However, I'm looking for something a little more
sophisticated -- I want to capture how much my friends have
interacted with each business, and then have that score contribute
to the ranking of my query results. So a place where all my friends
eat every day should be more relevant than a place where one of my
friends ate once.

Any thoughts on how to go about this?

Thanks again for your help,
Clay
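
The thread does not record an answer to this follow-up. One possible direction, offered only as a sketch, is to let a has_child query contribute to the score instead of acting as a pure filter. This relies on the `score_mode` option of has_child in recent ElasticSearch versions and was not confirmed by anyone in this thread. The child type `interaction` and the fields `name` and `user_id` are assumptions. Each matching child is wrapped in `constant_score` so it counts as 1, and `"score_mode": "sum"` adds those up, so the clause's score grows with the number of interactions by friends.

```python
import json

def friend_weighted_query(search_text, friend_ids):
    """Build a search body whose score grows with the number of friend interactions."""
    interactions_by_friends = {
        "has_child": {
            "type": "interaction",   # hypothetical child type, one doc per interaction
            "score_mode": "sum",     # parent score = sum of matching child scores
            "query": {
                # Every interaction by a friend scores a constant 1.0.
                "constant_score": {
                    "filter": {"terms": {"user_id": friend_ids}}
                }
            },
        }
    }
    return {
        "query": {
            "bool": {
                # The text match is required; the social signal only boosts.
                "must": {"match": {"name": search_text}},
                "should": [interactions_by_friends],
            }
        }
    }

body = friend_weighted_query("pizza", ["u17", "u42"])
print(json.dumps(body, indent=2))
```

Because the clause sits under `should`, businesses with no friend activity still match the text search; they just rank lower. How strongly the social signal should weigh against text relevance would need tuning.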


