How to do MoreLikeTHESE in Elasticsearch?


(Alain Désilets) #1

I just LOVE the MoreLikeThis (MLT) feature of Elasticsearch and am finding
new and interesting ways of using it all the time.

However, these days I find myself needing a MoreLikeTHESE feature. In other
words, given a group of documents that belong together, I want to find other
documents that might belong in that group.

Is there a "native" way of doing that with the ES API? For example, is it
possible to do MoreLikeThis on an Aggregation? I looked in the API
documentation and didn't find anything.

If this is not supported natively, what would be the "best" way to
implement it using the existing building blocks? I can think of at least two:

== Approach 1: Concatenated Pseudo-Document ==

Create a pseudo-document whose content is the concatenation of the content
of all documents in the group. Add that "document" to the index, then do
the regular MoreLikeThis on that document.

The disadvantage of this approach is that the pseudo-document grows with the
size of the group, so it could become very large.
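
A minimal sketch of Approach 1 in Python, offered as an illustration rather
than a definitive implementation. It only builds the request body and does not
talk to a cluster. It assumes a version of Elasticsearch whose more_like_this
query accepts free text in its `like` parameter (older releases used
`like_text`), so the concatenated text is passed directly in the query instead
of being indexed first. The function name, the `max_chars` cap and the
parameter values are hypothetical choices made for the example.

```python
from typing import Dict, List


def build_concatenated_mlt_query(
    group_texts: List[str],
    fields: List[str],
    max_chars: int = 10000,
) -> Dict:
    """Build a more_like_this request body from a group of document texts."""
    # Concatenate the non-empty texts of the group into one pseudo-document.
    pseudo_document = "\n".join(
        text.strip() for text in group_texts if text.strip()
    )
    # Assumption: cap the size so that a large group does not produce
    # an unbounded query text.
    pseudo_document = pseudo_document[:max_chars]
    return {
        "query": {
            "more_like_this": {
                "fields": fields,
                "like": pseudo_document,
                "min_term_freq": 1,
                "max_query_terms": 25,
            }
        }
    }
```

The resulting dictionary would be sent as the body of a search request against
the index that holds the documents.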

== Approach 2: Multiterm Vector Pseudo-Document ==

Retrieve the term vectors of all the documents in the group (for example,
with the multi term vectors API). Create a pseudo-document that contains only
the most frequent or most "important" terms.

Do the regular MoreLikeThis on this pseudo-document.
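
A rough sketch of Approach 2 in Python, under stated assumptions. It assumes
the per-document terms arrive as term vector responses shaped like
`{"term_vectors": {<field>: {"terms": {<term>: {"term_freq": n}}}}}` (the shape
returned by the `_mtermvectors` endpoint), and it takes "most frequent" to mean
the highest summed term frequency across the group. The function names and the
`top_n` cutoff are hypothetical; a weighting that also uses document
frequencies would be another reasonable reading of "important".

```python
from collections import Counter
from typing import Dict, List


def top_terms(term_vector_docs: List[Dict], field: str, top_n: int = 10) -> List[str]:
    """Return the top_n terms by total term frequency across the group."""
    totals: Counter = Counter()
    for doc in term_vector_docs:
        # Documents without term vectors for this field contribute nothing.
        terms = doc.get("term_vectors", {}).get(field, {}).get("terms", {})
        for term, stats in terms.items():
            totals[term] += stats.get("term_freq", 0)
    # Sort by descending frequency, then alphabetically so ties are stable.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


def build_pseudo_document(term_vector_docs: List[Dict], field: str, top_n: int = 10) -> str:
    """Join the selected terms into a short text usable as MLT input."""
    return " ".join(top_terms(term_vector_docs, field, top_n))
```

The returned string stays bounded by `top_n` terms regardless of how many
documents are in the group, which avoids the size problem of Approach 1.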


