Hi
As far as I saw in the MLT (0.5) documentation, you can only query MLT
for a document that is already indexed.
We are looking for a slightly different implementation, where the input
document is a URL: the server extracts the most significant keywords
from it and returns the similar docs.
elasticsearch has the option to execute a moreLikeThis query, which is part
of the query dsl. When using the query, you just provide it with a text to
find docs that match it. So, in your case: fetch the doc, get the text from
it, and execute a moreLikeThis query.
-shay.banon
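
For readers following along, here is a minimal sketch of the request body such a text-based moreLikeThis query might use. It is an illustration, not taken from the thread: the field names `title` and `body` are hypothetical, the `min_term_freq` and `max_query_terms` values are arbitrary choices, and the key that carries the text has changed across releases (older versions used `like_text`, newer ones use `like`), so check the documentation for the version you run.

```python
import json


def build_mlt_query(text, fields, min_term_freq=1, max_query_terms=25):
    """Build a search body asking for documents similar to `text`."""
    return {
        "query": {
            "more_like_this": {
                "fields": list(fields),   # hypothetical field names
                "like": text,             # older releases: "like_text"
                "min_term_freq": min_term_freq,
                "max_query_terms": max_query_terms,
            }
        }
    }


# The text would come from the document you fetched yourself.
body = build_mlt_query("text extracted from the page", ["title", "body"])
print(json.dumps(body, indent=2))
```

The point of the sketch is that the document does not need to be indexed first: the text itself is the input.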
On Sun, Mar 28, 2010 at 11:09 AM, Ori Lahav olahav@gmail.com wrote:
Hi
As far as I saw in the MLT (0.5) documentation, you can only query MLT
for a document that is already indexed.
We are looking for a slightly different implementation, where the input
document is a URL: the server extracts the most significant keywords
from it and returns the similar docs.
Just FYI:
In Solr MLT you can supply a URL as a parameter to the request. Solr
will access this URL, interpret the response as the contents of a
document, tokenize it, extract the interesting words from it, and
use them to perform an MLT query.
I understand what it means. If you want this feature, then open a feature
request for it. In general, I prefer not to rely on external resources in
elasticsearch, and if I do, it should be done correctly. In this case, that
means using async io to fetch the doc rather than blocking a thread on an io
operation, and that still needs to be developed.
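
To make the async io point concrete, here is a toy Python illustration (not elasticsearch code, and not how the server would actually implement it). `fetch_document` is a hypothetical stand-in in which `asyncio.sleep` simulates the network wait; the URLs are placeholders. It only shows that a single thread can start several fetches and keep working while each one waits.

```python
import asyncio


async def fetch_document(url, delay, events):
    """Stand-in for a non-blocking fetch; the sleep simulates network wait."""
    events.append(f"start {url}")
    # Awaiting hands control back to the event loop, so the thread is
    # free to start or resume other fetches while this one waits.
    await asyncio.sleep(delay)
    events.append(f"done {url}")
    return f"<contents of {url}>"


async def fetch_all(urls, delay=0.01):
    """Run all fetches concurrently on one thread and collect the results."""
    events = []
    texts = await asyncio.gather(
        *(fetch_document(u, delay, events) for u in urls)
    )
    return texts, events


texts, events = asyncio.run(
    fetch_all(["http://example.com/a", "http://example.com/b"])
)
print(events)
```

Both fetches are started before either one finishes, which is the behaviour a blocking call on a single thread cannot give you.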
By the way, if you want to provide the text for the search, then you
probably want to use the search API with an mlt query. The mlt query allows
you to provide the text to do mlt on. In this case, you fetch the text from the
url on the client side and execute the search query with the text
populated. This makes more sense than fetching the text on each search shard.
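
The client-side flow described above could be sketched roughly as follows, using only the Python standard library. Everything named here is an assumption for illustration: the helper names, the base URL, the index name `articles`, the field `body`, and the `like` key (older releases used `like_text`). The HTML-to-text step is deliberately crude and simply drops tags, scripts, and styles.

```python
import json
import urllib.request
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Reduce an HTML string to whitespace-joined visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def fetch_page_text(url, timeout=10):
    """Download `url` on the client and reduce it to plain text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return html_to_text(resp.read().decode(charset, errors="replace"))


def build_search_request(es_base_url, index, text, fields):
    """Prepare (but do not send) a search request carrying an mlt query."""
    body = {"query": {"more_like_this": {"fields": list(fields), "like": text}}}
    return urllib.request.Request(
        f"{es_base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

In use, you would call `fetch_page_text(url)` once on the client, pass the result to `build_search_request("http://localhost:9200", "articles", text, ["body"])`, and send it with `urllib.request.urlopen`. The page is then downloaded a single time, rather than once per shard that executes the search.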