Similar document detection system

Hi ,

Is there a feature in ES where, when I give a document to Elasticsearch, it
tells me which documents already in ES are similar to the document I
inserted?
For instance, let there be a similarity score between documents
(from 1 to 100).

Whenever I add a document to ES, it should tell me which documents have
a similarity score above 70.
Also, I want ES to check similarity against the last N days only (not the whole
lot, if that takes too much processing power).

Thanks
Vineeth

Hi Vineeth

Is there a feature in ES where, when I give a document to
Elasticsearch, it tells me which documents already in ES are similar
to the document I inserted?
For instance, let there be a similarity score between
documents (from 1 to 100).

Look at:

Whenever I add a document to ES, it should tell me which
documents have a similarity score above 70.
Also, I want ES to check similarity against the last N days only (not the
whole lot, if that takes too much processing power).

And combining queries with one or more filters:

clint
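The two pointers above (a similarity query, plus a filter to restrict it to recent documents) could be combined roughly as follows. This is a minimal sketch that only builds the JSON search body and sends nothing. It assumes the `more_like_this`, `bool`/`filter`, `range` and `min_score` syntax of recent Elasticsearch releases, which differs from the syntax available in 2011. The function name and the field names `body` and `created_at` are hypothetical.

```python
import json


def build_similarity_search(text, days, fields=("body",),
                            date_field="created_at", min_score=None):
    """Build a search body: more_like_this on `text`, limited to the last `days` days.

    Field names are placeholders; adjust them to the actual mapping.
    """
    body = {
        "query": {
            "bool": {
                # Scoring part: find documents whose terms resemble `text`.
                "must": [{
                    "more_like_this": {
                        "fields": list(fields),
                        "like": text,
                        "min_term_freq": 1,
                        "min_doc_freq": 1,
                    }
                }],
                # Non-scoring part: only look at documents from the last N days.
                "filter": [{
                    "range": {date_field: {"gte": f"now-{days}d/d"}}
                }],
            }
        }
    }
    if min_score is not None:
        # Drop hits whose relevance score falls below the cutoff.
        body["min_score"] = min_score
    return body


print(json.dumps(build_similarity_search("text of the new document", days=7), indent=2))
```

Note that `_score` is a relevance score, not a percentage between 1 and 100, so a cutoff such as "70" would have to be translated into a `min_score` value tuned against real data.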

You should have a look at the more like this and fuzzy like this queries:

and also

http://www.phpriot.com/news/duplicates-detection-with-elasticsearch

But to speed these things up, you would probably have to use one of the
locality-sensitive hashing algorithms.
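To illustrate the locality-sensitive hashing idea, here is a minimal MinHash sketch in plain Python. It is not tied to Elasticsearch and is not necessarily the algorithm meant above. It is one common LSH scheme, chosen here as an assumption, and all function names and parameters (`num_hashes`, `bands`, the 3-word shingles) are illustrative.

```python
import hashlib


def shingles(text, k=3):
    """Split text into the set of overlapping k-word shingles."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, token):
    """Deterministic 64-bit hash of `token`, varied by `seed`."""
    digest = hashlib.sha1(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(text, num_hashes=64, k=3):
    """For each seeded hash function, keep the minimum hash over all shingles."""
    tokens = shingles(text, k)
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def lsh_buckets(signature, bands=16):
    """Cut the signature into bands; documents sharing any band are candidates."""
    rows = len(signature) // bands
    return {(b, tuple(signature[b * rows:(b + 1) * rows])) for b in range(bands)}


doc_a = "the quick brown fox jumps over the lazy dog near the river bank today"
doc_b = "the quick brown fox jumps over the lazy dog near the river bank yesterday"
sig_a, sig_b = minhash_signature(doc_a), minhash_signature(doc_b)

# Scale the estimate to the 1-100 range used in the question.
print(round(estimated_similarity(sig_a, sig_b) * 100))
# Only documents that share at least one bucket need a full comparison.
print(bool(lsh_buckets(sig_a) & lsh_buckets(sig_b)))
```

The signatures are small and can be stored alongside each document, so a new document only has to be compared against the candidates that share a bucket with it rather than against the whole index.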

(In reply to Vineeth Mohan's original post of 29 Sep., 08:31.)

Thanks, guys.
Elasticsearch simply rocks.

Thanks
Vineeth

(In reply to Karussell's message of Thu, Sep 29, 2011 at 12:56 PM.)

Just one more question.

Is it possible to do the same without actually inserting the document?
I need to decide whether the document should be inserted, based on whether
duplicates of it already exist.

Thanks
Vineeth

(Follow-up to Vineeth Mohan's message of Thu, Sep 29, 2011 at 3:02 PM.)

Yes, just query for it via mlt (more like this) or flt (fuzzy like this).

Running a query does not index anything.
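A check-before-insert flow along these lines might look like the sketch below. It assumes the `more_like_this` query accepts raw text via `like`, as in recent Elasticsearch releases, and that hits come back under `hits.hits` with `_id` and `_score`. The helper names, the `body` field, the cutoff of 10.0 and the hand-written response are all hypothetical stand-ins, and no request is actually sent.

```python
def build_duplicate_check(text, fields=("body",), size=10):
    """Search body that looks for documents similar to `text` without indexing it."""
    return {
        "size": size,
        "query": {
            "more_like_this": {
                "fields": list(fields),
                "like": text,  # raw text of the candidate document
                "min_term_freq": 1,
                "min_doc_freq": 1,
            }
        },
    }


def likely_duplicates(search_response, min_score):
    """Return the ids of hits whose relevance score reaches `min_score`."""
    hits = search_response["hits"]["hits"]
    return [hit["_id"] for hit in hits if hit["_score"] >= min_score]


# Hand-written stand-in for what a search call would return.
fake_response = {
    "hits": {
        "hits": [
            {"_id": "a1", "_score": 12.4},
            {"_id": "b7", "_score": 3.1},
        ]
    }
}

dupes = likely_duplicates(fake_response, min_score=10.0)
should_index = not dupes  # index the new document only if nothing similar exists
print(dupes, should_index)  # prints ['a1'] False
```

The cutoff is arbitrary here. Because `_score` depends on the query and the index contents, a usable threshold has to be found by experimenting with known duplicates.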

(In reply to Vineeth Mohan's message of 29 Sep., 11:33.)