Termvector api fails for artificial document

murari_tikmani · January 21, 2021, 10:09am

I have a use case where I want to find the score of each word in an artificial document when it is compared against an existing corpus.

For that I am using the termvectors API on an artificial document.
The API fails when both of these hold: the request uses "filter", and the artificial document contains words that don't exist in the index.

Example:

{
  "doc": {
    "PROB_DESC": " My Name is Murari Tikmani "
  },
  "fields": ["PROB_DESC"],
  "term_statistics": true,
  "field_statistics": true,
  "positions": false,
  "offsets": false,
  "filter": {
    "min_term_freq": 0,
    "min_doc_freq": 2
  }
}

The above returns an error. If I remove the "filter" from the request, I get term statistics in which Murari has tf=1 and no df is reported at all.

The filter is a must for me, because I only get the score when I provide the filter.
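
One possible workaround, sketched here only as an illustration, is to send the request without "filter" and compute a score on the client side from the returned statistics. The sketch below assumes the usual termvectors response layout (term_vectors, then the field name, then field_statistics and terms) and assumes a classic TF-IDF style score, tf * (1 + ln((doc_count + 1) / (df + 1))); the formula Elasticsearch applies internally may differ. The function name score_terms, the sample response, and all numbers in it are hypothetical. A term with no doc_freq in the response is treated as df = 0.

```python
import math


def score_terms(response, field, min_doc_freq=0):
    """Compute a TF-IDF style score per term from a termvectors response.

    Assumption: the response was requested with term_statistics and
    field_statistics enabled and WITHOUT "filter". Terms that are
    missing from the index carry no "doc_freq", so we default it to 0.
    """
    field_tv = response["term_vectors"][field]
    doc_count = field_tv["field_statistics"]["doc_count"]

    scores = {}
    for term, stats in field_tv["terms"].items():
        df = stats.get("doc_freq", 0)  # absent for terms not in the index
        if df < min_doc_freq:
            continue  # client-side equivalent of the min_doc_freq filter
        # Assumed formula (classic TF-IDF); not confirmed to match the server.
        idf = 1.0 + math.log((doc_count + 1) / (df + 1))
        scores[term] = stats["term_freq"] * idf
    return scores


# Hypothetical response; the numbers are made up for illustration only.
sample = {
    "term_vectors": {
        "PROB_DESC": {
            "field_statistics": {"doc_count": 100, "sum_doc_freq": 900, "sum_ttf": 1500},
            "terms": {
                "name": {"term_freq": 1, "doc_freq": 5, "ttf": 7},
                "is": {"term_freq": 1, "doc_freq": 90, "ttf": 300},
                "murari": {"term_freq": 1},  # not in the index: no doc_freq
            },
        }
    }
}

scores = score_terms(sample, "PROB_DESC", min_doc_freq=2)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(term, round(score, 3))
```

With min_doc_freq=2 the unseen term is simply dropped instead of causing a failure; with min_doc_freq=0 it is kept and scored as a term that appears in no indexed document.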

system · February 18, 2021, 10:09am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
