Related Searches


(Aaron Rosenthal) #1

What is the best way to offer our users related search suggestions based on
the query they entered? Does Elasticsearch have a feature like this built in,
or would we have to build our own? Any suggestions would be great. We are
trying to plan this for our next release. Aaron

--


(Chris Male) #2

Would something along the lines of MoreLikeThis (
http://www.elasticsearch.org/guide/reference/query-dsl/mlt-query.html) be
sufficient? It can provide similar documents.
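
For reference, a minimal sketch of what such a request body might look like, built as a plain Python dict. The parameter names (`fields`, `like_text`, `min_term_freq`, `max_query_terms`) follow the mlt-query reference page linked above; newer Elasticsearch releases use `like` instead of `like_text`, so check the docs for your version. The field names `title` and `description` are hypothetical.

```python
import json


def build_mlt_query(like_text, fields, min_term_freq=1, max_query_terms=12):
    """Build a more_like_this request body as a plain dict.

    The field names passed in are assumptions about your mapping.
    """
    return {
        "query": {
            "more_like_this": {
                "fields": list(fields),
                "like_text": like_text,
                "min_term_freq": min_term_freq,
                "max_query_terms": max_query_terms,
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical fields; replace with fields from your own index.
    body = build_mlt_query("dell server rack", ["title", "description"])
    print(json.dumps(body, indent=2))
```

Note that this returns similar documents, not similar queries, so it only helps if the documents themselves can stand in for suggestions.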

--


(Aaron Rosenthal) #3

I was thinking along the lines of a typical eBay search.

keyword: server
Related Searches:
home server: http://www.ebay.com/sch/i.html?_sacat=0&_nkw=home+server&_frs=1
dell server: http://www.ebay.com/sch/i.html?_sacat=0&_nkw=dell+server&_frs=1
server rack: http://www.ebay.com/sch/i.html?_sacat=0&_nkw=server+rack&_frs=1
server tower: http://www.ebay.com/sch/i.html?_sacat=0&_nkw=server+tower&_frs=1
server quad core: http://www.ebay.com/sch/i.html?_sacat=0&_nkw=server+quad+core&_frs=1
hp server: http://www.ebay.com/sch/i.html?_sacat=0&_nkw=hp+server&_frs=1
server 2003: http://www.ebay.com/sch/i.html?_sacat=0&_nkw=server+2003&_frs=1
server 16gb ram: http://www.ebay.com/sch/i.html?_sacat=0&_nkw=server+16gb+ram&_frs=1
server warmer: http://www.ebay.com/sch/i.html?_sacat=0&_nkw=server+warmer&_frs=1

--


(Nick Dunn) #4

ES doesn't log what people have searched for, so if you want to return a genuine list of related/popular searches that other users made, this logic would be in your own application.

I know the ES developers have yet to implement "did you mean" functionality because it's going to be built into Lucene 4 (ES uses Lucene 3?), and they didn't want to write it from scratch.

However, you could potentially use a fuzzy match query against your index and set the result to return highlighted chunks of text, i.e. the words surrounding your term. It won't be perfect but might get you halfway there.
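
A rough sketch of the fuzzy-plus-highlight idea, under some assumptions: the field name `title` is hypothetical, the request body uses the `fuzzy` query and the `highlight` section of the search API, and the response is assumed to carry highlight fragments under `hits.hits[].highlight` wrapped in the default `<em>` tags. No request is sent here; the code only builds the body and parses a response-shaped dict.

```python
import re


def build_fuzzy_highlight_query(term, field):
    """Build a search body: fuzzy match on one field, with highlighting."""
    return {
        "query": {"fuzzy": {field: {"value": term}}},
        "highlight": {"fields": {field: {}}},
    }


def extract_fragments(response, field):
    """Collect highlighted fragments for `field`, with <em> tags stripped.

    Assumes the default highlighter tags (<em>...</em>).
    """
    fragments = []
    for hit in response.get("hits", {}).get("hits", []):
        for fragment in hit.get("highlight", {}).get(field, []):
            fragments.append(re.sub(r"</?em>", "", fragment))
    return fragments


if __name__ == "__main__":
    print(build_fuzzy_highlight_query("server", "title"))
    # Hand-written response fragment, for illustration only.
    fake_response = {
        "hits": {"hits": [{"highlight": {"title": ["dell <em>server</em> rack"]}}]}
    }
    print(extract_fragments(fake_response, "title"))
```

The fragments are chunks of document text rather than real user queries, so they would still need trimming before being shown as suggestions.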

Your best bet is to write your own logic: store each search term and the number of results it returns. Your "see also" list would be the distinct terms from this dataset, ordered either by their frequency (popularity) or by the number of results they yield (precision).
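
A minimal in-memory sketch of that logging idea, assuming the application calls `record()` after every search. The class and method names are hypothetical, "related" is naively taken to mean "contains the keyword as a whole word", and ordering by result count is read here as "fewest non-zero results first", which is only one possible interpretation.

```python
from collections import defaultdict


class SearchLog:
    """Toy query log: per-term search count and last seen result count."""

    def __init__(self):
        self._stats = defaultdict(lambda: {"count": 0, "results": 0})

    def record(self, term, result_count):
        entry = self._stats[term.strip().lower()]
        entry["count"] += 1
        entry["results"] = result_count  # keep the most recent result count

    def related(self, keyword, by="popularity", limit=10):
        keyword = keyword.strip().lower()
        # Assumption: a term is "related" if it contains the keyword as a word.
        candidates = [
            (term, stats)
            for term, stats in self._stats.items()
            if term != keyword and keyword in term.split()
        ]
        if by == "popularity":
            candidates.sort(key=lambda item: (-item[1]["count"], item[0]))
        elif by == "results":
            # Assumption: skip zero-result searches, fewest results first.
            candidates = [c for c in candidates if c[1]["results"] > 0]
            candidates.sort(key=lambda item: (item[1]["results"], item[0]))
        else:
            raise ValueError("by must be 'popularity' or 'results'")
        return [term for term, _ in candidates[:limit]]


if __name__ == "__main__":
    log = SearchLog()
    for term, hits in [("dell server", 120), ("dell server", 118),
                       ("server rack", 8), ("server", 5000), ("laptop", 40)]:
        log.record(term, hits)
    print(log.related("server"))
    print(log.related("server", by="results"))
```

In a real application the log would live in a database or in its own index rather than in process memory.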

--


(Nick Dunn) #5

What I failed to say is that you could periodically push an aggregated form of this logged dataset (all distinct terms) back into your ES index, so that the terms can be retrieved with whatever type of ES query best suits (fuzzy, wildcard, exact term, stemmed etc).
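
One way the periodic push might look, sketched as a function that turns aggregated term counts into a newline-delimited body for the bulk API. The index name `search_terms` and the document fields `term` and `count` are hypothetical, and older Elasticsearch versions also expect a `_type` in the action line, so treat this as a starting point rather than a drop-in payload.

```python
import json


def build_bulk_payload(term_counts, index_name="search_terms"):
    """Turn {term: count} into a bulk request body (one action + one doc per term).

    Using the term itself as the document id means re-running the job
    overwrites the previous count instead of creating duplicates.
    """
    lines = []
    for term, count in sorted(term_counts.items()):
        lines.append(json.dumps({"index": {"_index": index_name, "_id": term}}))
        lines.append(json.dumps({"term": term, "count": count}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_bulk_payload({"dell server": 2, "server rack": 1}), end="")
```

The resulting string would be sent to the `_bulk` endpoint by whatever HTTP client or scheduled job the application already uses.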

--


(Aaron Rosenthal) #6

Thanks for your input; we just didn't want to reinvent the wheel. -Aaron

--


(Otis Gospodnetić) #7

Hi Aaron,

We have something like that in the form of a Java library. It's not packaged
or polished, but it does work, assuming you have a good-sized query log.
If you are interested, please get in touch - http://sematext.com - and if
there is strong interest we could polish and package this up.

Otis

Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html

--


(BillyEm) #8

Hey Nick:

Where does the definition of precision as "or number of results they yield
(precision)" come from? It's not the TREC definition. In fact, the number of
results is nothing but the number of results, without a relevance judge
evaluating them.

re
b

--


(Nick Dunn) #9

I think what I was trying to say was that storing the number of results in
the log against the search term goes some way to allowing you to evaluate
whether that search was successful and/or useful to the user. You could set
a threshold and say that a search yielding 1-10 results is good, but with any
more than that, precision is poor. You could factor this into your algorithm to
decide which "related" searches to display to the user, thereby filtering
out the user searches that you deem unsuccessful.

In reality you'd want this to be more accurate and use other metrics,
perhaps tracking the number of clicks from search results pages (implying
accuracy), whether a search term yielded a bounced visit (not good!) and so
on.
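
A small sketch of how such a filter could be written, assuming each log entry already carries the result count, the number of times the term was searched, and how many of those searches led to at least one click. The 1-10 result window comes from the paragraph above; the entry layout and the 0.2 click-rate threshold are made up for illustration.

```python
def is_successful(entry, min_results=1, max_results=10, min_click_rate=0.2):
    """Decide whether a logged search looks useful enough to suggest.

    `entry` is a dict with hypothetical keys:
      results  - number of hits the term returned
      searches - how many times the term was searched
      clicks   - how many of those searches led to a click on a result
    """
    if entry["searches"] == 0:
        return False
    if not (min_results <= entry["results"] <= max_results):
        return False
    return entry["clicks"] / entry["searches"] >= min_click_rate


def filter_related(entries, **thresholds):
    """Keep only the terms whose searches pass the success thresholds."""
    return [e["term"] for e in entries if is_successful(e, **thresholds)]


if __name__ == "__main__":
    entries = [
        {"term": "server rack", "results": 8, "searches": 10, "clicks": 6},
        {"term": "server", "results": 5000, "searches": 50, "clicks": 30},
        {"term": "server warmer", "results": 3, "searches": 10, "clicks": 0},
    ]
    print(filter_related(entries))
```

The thresholds are keyword arguments so they can be tuned against real traffic instead of being fixed up front.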

--

