I'm looking for a strategy for the following search problem
using Elasticsearch.
A simplified version of the problem is as follows:
Say I have two data sets:
- MyLikes
  - contains multiple users and multiple records for each user,
    with 2 fields: username & liketext
- ForSale
  - contains multiple product records,
    with 2 fields: product_title & description
I want to match MyLikes against ForSale and generate a third dataset
called Sales_I_Might_Like.
The strategies I have in mind are:
- Turn each ForSale entry into a tag soup by removing stopwords, then
  use these tags to query MyLikes and build a table of users
  for each product, ranked by number of hits.
- Do a more_like_this search for each ForSale record against the
  MyLikes dataset and use those results.
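To make the first strategy concrete, here is a minimal sketch in plain Python of what I mean by the tag soup approach. It does not touch Elasticsearch at all; the stopword list, the function names (tag_soup, rank_users) and the sample records are all made up for illustration, and a real version would lean on an Elasticsearch analyzer instead of a hand-rolled tokenizer.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real list would be much larger).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "with", "in", "on", "to", "is"}

def tag_soup(text):
    """Lowercase the text, split on non-alphanumerics, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}

def rank_users(product, likes):
    """Return (username, hits) pairs for one product, highest hit count first."""
    tags = tag_soup(product["product_title"] + " " + product["description"])
    hits = Counter()
    for like in likes:
        # Count how many product tags appear in this like record.
        overlap = tags & tag_soup(like["liketext"])
        if overlap:
            hits[like["username"]] += len(overlap)
    return hits.most_common()

# Hypothetical sample data using the field names from the question.
likes = [
    {"username": "alice", "liketext": "vintage road bikes"},
    {"username": "alice", "liketext": "steel bike frames"},
    {"username": "bob", "liketext": "the history of jazz"},
]
product = {
    "product_title": "Vintage steel road bike",
    "description": "A classic frame for the road",
}
ranking = rank_users(product, likes)
```

Note that without stemming, "bikes" in a like does not match "bike" in a product, which is one reason I suspect exact tag matching alone may be too brittle.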
Are there any other strategies that I could use, and would the
more_like_this strategy be effective?
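For reference, this is roughly the request body I have in mind for the second strategy, built as a Python dict per ForSale record. It is only a sketch: the helper name mlt_query is hypothetical, the "like" parameter is the name used in recent Elasticsearch versions (older releases used "like_text"), and the lowered min_term_freq and min_doc_freq values are my own guess for short product texts, not a recommendation. Actually sending the body to the MyLikes index with a client is left out.

```python
def mlt_query(product, max_terms=25):
    """Build a more_like_this request body for one ForSale record."""
    # Use the product title and description together as the "like" text.
    like_text = product["product_title"] + " " + product["description"]
    return {
        "query": {
            "more_like_this": {
                "fields": ["liketext"],        # field searched in MyLikes
                "like": like_text,
                "min_term_freq": 1,            # assumption: texts are short
                "min_doc_freq": 1,             # assumption: small dataset
                "max_query_terms": max_terms,
            }
        }
    }

# Hypothetical product record using the field names from the question.
body = mlt_query({
    "product_title": "Vintage steel road bike",
    "description": "A classic frame",
})
```

The hits for each product would then be grouped by username to build Sales_I_Might_Like.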