ElasticSearch as a recommender?


(sam.leroux) #1

Hello,

I'm developing a real-time news recommendation engine and was hoping to be
able to use ElasticSearch.
The idea is to store tags for each user in a database
("sport", "football", "tennis", "Olympics", "recommender systems", ...).
I would add the news articles to ElasticSearch and use the tags to
automatically query ES to generate the recommendations.

I still have a few questions:

  • Is ES able to do this in (near) real time?
  • Is ES able to handle big (huge) queries? I would like to send all the
    tags for the user to ES, each with its own weight, and would like to
    receive a list of news articles that match the user's interests.

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/0553ccec-66e1-4623-ac3d-380ba9a2c4a1%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Mark Walkom) #2

Yes to both. It'll depend on your data and the volumes, but ES can handle
storing and retrieving the tags.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com



(sam.leroux) #3

Thank you, Mark, for your quick response.
I was thinking of keeping the tags associated with each user (his
interests) in a separate database.
I would only store the news articles in ES.
When the user requests recommended articles, I would fetch his tags and
build a query containing all of these tags.
Would this be a decent way of doing it?
I was worried that ES couldn’t handle these huge queries (potentially
hundreds of tags), but apparently I was underestimating ES.

Thanks



(Binh Ly) #4

Have a look at filters, specifically the terms filter. They execute very
fast and are cached too.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-terms-filter.html
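
For illustration, a terms filter over a user's tags might be built roughly like this. It is only a sketch: the field name `tags` and the helper `build_terms_filter_query` are assumptions that do not come from this thread, and the `filtered` wrapper is the Elasticsearch 1.x-era syntax (later versions place filters inside a `bool` query's `filter` clause instead). The function only builds the request body as a Python dict; sending it to a cluster is left out.

```python
import json


def build_terms_filter_query(tags, field="tags"):
    """Build a search body matching articles whose `field` contains ANY of `tags`.

    `field` is a hypothetical field name; adjust it to the real mapping.
    Uses the 1.x-style `filtered` query with a `terms` filter.
    """
    return {
        "query": {
            "filtered": {
                # match_all: every document is a candidate before filtering
                "query": {"match_all": {}},
                # terms filter: keep documents that have at least one of the tags
                "filter": {"terms": {field: list(tags)}},
            }
        }
    }


# Example: print the body that would be sent to the _search endpoint.
body = build_terms_filter_query(["sport", "football", "tennis"])
print(json.dumps(body, indent=2))
```

Note that a filter only decides which articles match. It does not contribute to the relevance score, so per-tag weights would need a scoring query on top of it.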

On Friday, February 7, 2014 5:57:15 AM UTC-5, sam.l...@gmail.com wrote:

I was worried that ES couldn’t handle these huge queries (potentially
hundreds of tags) but apparently I was underestimating ES.

Thanks



(sam.leroux) #5

Thank you Binh, that looks promising.

Just to be sure, will I be able to build a query containing a list of tags,
each with its own weight, like this?
{"sport":0.25,"tennis":0.54862,"recommender systems":0.236,"search engine":0.59}
I would then like to retrieve the documents that match as many tags as
possible.
Is using boosts with OR the correct way of doing this?

Thank you
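
One way the "boosts with OR" idea above could be expressed is a `bool` query whose `should` clauses are `term` queries, each carrying the tag's weight as its `boost`. The sketch below is hedged accordingly: the field name `tags`, the helper `build_weighted_tag_query`, and the `size` default are assumptions for illustration. It also assumes the tag field is indexed without analysis, since a `term` query matches exact terms and a multi-word tag such as "recommender systems" would otherwise not match as a whole.

```python
import json


def build_weighted_tag_query(weights, field="tags", size=10):
    """Build a bool/should query from a {tag: weight} mapping.

    Each tag becomes an optional (OR) term clause boosted by its weight, so
    articles matching more tags, or more heavily weighted tags, tend to score
    higher. `field` and `size` are hypothetical defaults.
    """
    if not weights:
        raise ValueError("at least one tag weight is required")
    # Order clauses by descending weight purely for readability.
    ordered = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    should = [
        {"term": {field: {"value": tag, "boost": weight}}}
        for tag, weight in ordered
    ]
    return {
        "size": size,
        "query": {
            "bool": {
                "should": should,
                # An article must match at least one tag to be returned.
                "minimum_should_match": 1,
            }
        },
    }


user_tags = {
    "sport": 0.25,
    "tennis": 0.54862,
    "recommender systems": 0.236,
    "search engine": 0.59,
}
print(json.dumps(build_weighted_tag_query(user_tags), indent=2))
```

The boost scales each clause's contribution rather than fixing it, so the final score still depends on Elasticsearch's own relevance scoring and will not simply equal the sum of the matching weights.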

On Friday, 7 February 2014 13:09:58 UTC+1, Binh Ly wrote:

Have a look at filters, specifically the terms filter. They execute very
fast and are cached too.

http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-terms-filter.html


