Insights for an ES Newbie

Hi,

I would like to try to cluster my documents. From what I understand of Lucene, this means fully replacing the term vector. From my reading, I have determined that this vector is hidden in ES. Is there a way to augment my results using some or all of the information I do have available?

Available Information:

  • Global term importance (i.e. each word and its importance relative to the corpus)
  • Document vectors (i.e. each word and its importance relative to the document)

Many thanks in advance!

Regards,
Andrew

In general you can do it, but it's quite low-level Lucene (and elasticsearch) work. You can write your own analyzer that attaches your own "importance" as a payload to each term the analyzer generates, and then use those payloads in a custom query that you implement.
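To make the payload idea a bit more concrete, here is a rough, library-free sketch in Python. It is not Lucene or elasticsearch code. It only illustrates the data flow: the document vector is written as `term|weight` tokens (the format that Lucene's `DelimitedPayloadTokenFilter` parses with its default `|` delimiter), and a payload-aware query would then combine those per-document weights with the global term importance. The function names and the scoring formula (a plain weighted sum) are assumptions made for illustration. Whether your elasticsearch version exposes such a filter is also something you would need to check.

```python
def to_delimited_payloads(doc_vector, delimiter="|"):
    """Render {term: weight} as space-separated 'term|weight' tokens.

    This is the text you would index; a payload-aware analyzer would
    split each token and store the weight as the term's payload.
    """
    return " ".join(
        f"{term}{delimiter}{weight:g}"
        for term, weight in sorted(doc_vector.items())
    )


def parse_delimited_payloads(text, delimiter="|"):
    """Inverse of to_delimited_payloads: recover {term: weight}."""
    vector = {}
    for token in text.split():
        term, sep, weight = token.rpartition(delimiter)
        if not sep:
            # Token carries no payload; skip it in this sketch.
            continue
        vector[term] = float(weight)
    return vector


def payload_score(query_terms, doc_vector, global_importance):
    """Hypothetical scoring: sum of document weight * global importance.

    A real custom query would read the weight from the term payload
    at search time instead of from a Python dict.
    """
    return sum(
        doc_vector.get(term, 0.0) * global_importance.get(term, 1.0)
        for term in query_terms
    )


if __name__ == "__main__":
    doc = {"lucene": 0.5, "cluster": 0.25}      # importance relative to document
    corpus = {"lucene": 2.0, "cluster": 4.0}    # importance relative to corpus
    field_text = to_delimited_payloads(doc)
    print(field_text)
    print(payload_score(["lucene", "cluster"],
                        parse_delimited_payloads(field_text), corpus))
```

The point of the round trip is that both pieces of information you already have fit this scheme: the document vector travels with the indexed terms as payloads, while the global importance can stay outside the index and be applied at query time.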

I have been thinking hard about how to enable this simply in elasticsearch, but have not (yet) come up with a nice API-centric solution.
(In reply to zupeanut's message of Tuesday, February 22, 2011 at 11:46 PM, which appears in full above.)
