I'm still not sure why you need ES at all in this case. Can't you just get
a ranked list of jobs for that user from MySQL?
But if there's a reason why you need to combine relevance with an ES query
(e.g. for additional filtering), maybe the best way would be to have a
separate type called "relevance" with "user_id" and "score" fields, and set
the job type as its _parent.
Then do something like described here:
in the "Ordering" section.
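A rough sketch of what such a query body could look like, built as a plain Python dict. The child type name "relevance" and its "user_id" and "score" fields come from the suggestion above; the helper name, the optional job filter, and the "city" field in the usage line are made up for illustration. The clause names used here (has_child with score_mode, function_score with field_value_factor, bool with filter) are the ones from later Elasticsearch releases and are an assumption about your version; older releases spell some of these differently, so check against the docs for the version you run.

```python
def relevance_sorted_query(user_id, job_query=None):
    """Build a search body that scores each job by one user's stored relevance.

    Hypothetical helper: assumes a child type "relevance" with "user_id" and
    "score" fields, attached to job documents as their parent.
    """
    # Child query: pick this user's relevance document and replace the
    # query score with the value stored in its "score" field.
    child = {
        "function_score": {
            "query": {"term": {"user_id": user_id}},
            "field_value_factor": {"field": "score"},
            "boost_mode": "replace",
        }
    }
    # Parent query: a job matches if it has such a child, and takes the
    # child's score as its own.
    has_child = {
        "has_child": {"type": "relevance", "score_mode": "max", "query": child}
    }
    bool_query = {"must": [has_child]}
    if job_query is not None:
        # Extra criteria go in "filter" so they do not change the score.
        bool_query["filter"] = [job_query]
    return {"query": {"bool": bool_query}}


body = relevance_sorted_query(42, {"term": {"city": "stockholm"}})
```

Since results are sorted by score by default, jobs should come back ordered by that user's relevance. Note that with this shape, jobs that have no relevance document for the user are left out entirely.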
On Sunday, 7 October 2012 11:44:57 UTC+1, Joakim Ekström wrote:
Oh, sorry. The relevance data is precalculated and currently stored in
MySQL as (job_id, user_id, relevance).
My question is strictly about how to index this in the optimal way, so
that search results can be sorted by it.
On Sunday, 7 October 2012 12:03:21 UTC+2, Andrew Clegg wrote:
How exactly are you representing the relevance data?
I would go about this by doing something like: each user has a list of
keywords/phrases that represent their interests. This is stored alongside
their user profile in your database. Then you could just construct a match
or mlt query out of those words/phrases and query the jobs index with it.
Any additional criteria the user applies at search time could just be added
to that query.
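As a minimal sketch of that idea, here is a request body built as a plain Python dict. The helper name, the "description" field, and the "city" filter are all hypothetical; the only assumption about Elasticsearch itself is that it accepts a bool query containing match and term clauses.

```python
def build_interest_query(interests, extra_filters=None, field="description"):
    """Build a search body that matches jobs against a user's interests.

    Hypothetical helper: "field" is whichever job field holds the text.
    """
    # One match clause per interest phrase; at least one of them must hit.
    should = [{"match": {field: phrase}} for phrase in interests]
    query = {"bool": {"should": should, "minimum_should_match": 1}}
    if extra_filters:
        # Criteria chosen at search time must all hold.
        query["bool"]["must"] = list(extra_filters)
    return {"query": query}


body = build_interest_query(
    ["python developer", "search engines"],
    extra_filters=[{"term": {"city": "stockholm"}}],
)
```

Jobs matching more of the user's phrases would score higher, so the default sort by score gives a per-user ordering without storing anything per job.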
Or have you already pre-calculated job<->user relevance using a batch
process somewhere else? In that case you probably don't need ElasticSearch
for this part. A row in a key-value store with an ordered list of job
IDs for each user would do, right?
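To make the key-value idea concrete, here is a small sketch that turns precalculated (job_id, user_id, relevance) rows into one ordered list of job IDs per user. The function name and the sample rows are invented for illustration, and a plain dict stands in for whatever key-value store you would actually use.

```python
from collections import defaultdict


def ranked_jobs_by_user(rows):
    """Map each user_id to its job_ids, most relevant first."""
    per_user = defaultdict(list)
    for job_id, user_id, relevance in rows:
        per_user[user_id].append((relevance, job_id))
    # Sort by relevance descending; ties fall back to ascending job_id.
    return {
        user_id: [job_id for _, job_id in sorted(pairs, key=lambda p: (-p[0], p[1]))]
        for user_id, pairs in per_user.items()
    }


# Made-up sample rows in (job_id, user_id, relevance) order.
rows = [(101, 1, 0.2), (102, 1, 0.9), (103, 1, 0.5), (101, 2, 0.7)]
ranking = ranked_jobs_by_user(rows)
```

Each value could then be written under its user ID as a single row, and reading a user's ranked jobs becomes one key lookup.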
On Saturday, 6 October 2012 22:49:16 UTC+1, Joakim Ekström wrote:
I have a list of jobs, queryable by the user, that I want to be able to
sort according to their relevance for the logged-in user. There are roughly
100 000 jobs and more users than that. How should I approach this indexing? Is
attaching all user relevance data for a specific job to that job's document
such a good idea?
I'm also considering using persistent storage mode for the relevance data
and storing it only in ES. Would that affect the optimal setup?