Insights for an ES Newbie

zupeanut · February 22, 2011, 9:46pm

Hi,

I would like to cluster my documents. From what I understand of Lucene, that means fully replacing the term vector, and from my reading, this vector is hidden in ES. Is there a way to augment my results using some or all of the information I already have available?

Available Information:

Global term importance (i.e. word and importance relative to corpus)
Document vectors (i.e. word and importance relative to document)

Many thanks in advance!

Regards,
Andrew

kimchy · February 24, 2011, 5:08am

In general you can do it, but it's quite low-level Lucene (and elasticsearch) work. You can write your own analyzer that attaches your own "importance" as a payload to the terms the analyzer generates, and then use those payloads in a custom query that you implement.
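To make the payload idea concrete, here is a rough conceptual sketch in plain Python. It is not Lucene or elasticsearch API: the `term|weight` input format and the names `analyze`, `build_index`, and `search` are all assumptions made up for illustration. The "analyzer" attaches a per-document importance to each term, and the "custom query" combines that payload with a global term importance when scoring.

```python
from collections import defaultdict

def analyze(text):
    """Toy analyzer: split 'term|weight' tokens into (term, weight) pairs.

    The weight plays the role of a payload; it defaults to 1.0 when absent.
    """
    pairs = []
    for token in text.split():
        term, _, raw = token.partition("|")
        pairs.append((term.lower(), float(raw) if raw else 1.0))
    return pairs

def build_index(docs):
    """Toy inverted index: term -> {doc_id: payload weight}.

    If a term repeats within a document, the last weight wins.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, weight in analyze(text):
            index[term][doc_id] = weight
    return index

def search(index, query_terms, global_importance):
    """Toy custom query: score = sum of (global importance * payload weight)."""
    scores = defaultdict(float)
    for term in query_terms:
        g = global_importance.get(term, 1.0)
        for doc_id, weight in index.get(term, {}).items():
            scores[doc_id] += g * weight
    # Highest-scoring documents first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

docs = {
    "a": "lucene|0.9 search|0.2",
    "b": "lucene|0.1 search|0.8",
}
index = build_index(docs)
results = search(index, ["lucene"], {"lucene": 2.0})
```

In a real setup the weights would be stored as term payloads at index time and read back by a payload-aware query; the sketch only shows how the two kinds of importance from the question could feed into a score.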

I have been thinking hard about how to enable this simply in elasticsearch, but have not (yet) come up with a nice API-centric solution.