How to get term_vector for a document


(Coady) #1

Is it possible to fetch the term vector for a document through the
API? I'd like to use the term statistics in the index to build a
weighted query profile. The TermFreqVector per doc would be ideal,
but also helpful would be the rewritten query from a more_like_this
search and the total docFreq per term.


(Loco Jay) #2

I had the same question today. Have a look at facets:

http://www.elasticsearch.org/guide/reference/api/search/facets/

This works for the complete index, a subset (search), or per doc.
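
For illustration, a terms facet request along these lines might look like the sketch below. It only builds the JSON body and does not contact a cluster. The field name "body", the document id "1", and the helper name terms_facet_request are hypothetical; the body shape follows the facets reference linked above and may need adjusting for a given Elasticsearch version.

```python
import json


def terms_facet_request(field, doc_id=None, size=50):
    """Build a search body carrying a terms facet on `field`.

    With doc_id=None the facet covers the whole index (match_all);
    with a doc_id it is restricted to that one document via an ids query.
    """
    if doc_id is None:
        query = {"match_all": {}}
    else:
        query = {"ids": {"values": [doc_id]}}
    return {
        "query": query,
        "size": 0,  # only the facet counts are wanted, not the hits
        "facets": {
            "terms_for_" + field: {"terms": {"field": field, "size": size}}
        },
    }


if __name__ == "__main__":
    # Hypothetical field name and document id, shown only to print the body.
    print(json.dumps(terms_facet_request("body", doc_id="1"), indent=2))
```

The printed body would be sent to the index's _search endpoint. Note that a terms facet counts documents per term, so for a single document every term comes back with a count of 1.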


(Coady) #3

Interesting... facets are technically the inverse of the term vectors
(doc freqs per term vs. term freqs per doc). But I see what you
mean. I can at least start with the total index doc freqs as well as
the terms per doc (all with a count of 1). Thx.
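
To make the distinction concrete, here is a toy sketch in plain Python (no Elasticsearch or Lucene involved). The two sample documents and the helper names are made up for illustration. A term vector counts occurrences of each term within one document, whereas a facet-style count records how many documents contain each term.

```python
from collections import Counter

# Hypothetical, already-tokenized documents.
docs = {
    "d1": "the quick brown fox jumps over the lazy dog".split(),
    "d2": "the lazy dog sleeps".split(),
}


def term_vector(tokens):
    """Term freqs per doc: how often each term occurs in one document."""
    return Counter(tokens)


def doc_freqs(docs):
    """Doc freqs per term: how many documents contain each term."""
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # each document counts once per term
    return df


if __name__ == "__main__":
    print(term_vector(docs["d1"]))
    print(doc_freqs(docs))
    # Restricted to a single document, every doc freq collapses to 1.
    print(doc_freqs({"d1": docs["d1"]}))
```

Restricting the document-frequency count to one document yields the set of terms in that document, which is why the per-doc facet gives every term a count of 1.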


(Shay Banon) #4

There is no option to get the term vectors, but what are you after
specifically? Is it to build your own query (a custom Lucene query)? In
that case, you can do it by writing a plugin that builds the query (and
also adds it to the query DSL). If that's the case, we can help with that.

-shay.banon


(Coady) #5

Yes, I'm looking to build a custom query based on user preferences,
e.g., likes or dislikes of documents in the index. Mathematically it
would be analogous to a more_like_this query, except that those are
based on only a single doc at a time.

A plugin may work, but I'd rather not recompute the custom query per
request. If I were using raw Lucene, I would pull the term_vector
for each relevant doc and do a weighted version of what the
more_like_this expansion does.
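
A rough sketch of that idea in plain Python, assuming the per-doc term vectors and the index doc freqs have already been obtained somehow. The function name weighted_profile, the +1/-1 preference weights, and the tf-idf style formula are assumptions for illustration, not what more_like_this actually computes.

```python
import math
from collections import Counter, defaultdict


def weighted_profile(term_vectors, preferences, doc_freq, num_docs, max_terms=25):
    """Combine several term vectors into one weighted term profile.

    term_vectors: doc id -> Counter of term freqs for that doc
    preferences:  doc id -> weight (e.g. +1.0 for a like, -1.0 for a dislike)
    doc_freq:     term -> number of docs in the index containing the term
    """
    scores = defaultdict(float)
    for doc_id, weight in preferences.items():
        for term, tf in term_vectors[doc_id].items():
            # Assumed idf form; rarer terms get a larger weight.
            idf = 1.0 + math.log(num_docs / (1 + doc_freq.get(term, 0)))
            scores[term] += weight * tf * idf
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Keep only terms with a net positive score, strongest first.
    return [(term, score) for term, score in ranked if score > 0][:max_terms]


if __name__ == "__main__":
    # Hypothetical data: one liked doc "a" and one disliked doc "b".
    vectors = {
        "a": Counter({"lucene": 3, "search": 1}),
        "b": Counter({"search": 2, "cooking": 4}),
    }
    prefs = {"a": 1.0, "b": -1.0}
    df = {"lucene": 10, "search": 50, "cooking": 5}
    print(weighted_profile(vectors, prefs, df, num_docs=100))
```

The resulting (term, weight) pairs could be stored per user and turned into a boosted boolean query, so the profile is recomputed only when the preferences change rather than on every request.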

