Hi, I am using ES to do research on my learning-to-rank algorithm. Essentially, instead of using TF-IDF or BM25, my scoring function looks like:
TF(D, t) = model(D, t)
where t is each term in the document. Instead of just counting the frequency of the term, a model will predict the term score. Additionally, I will not use the IDF score.
So this should be compatible with the underlying inverted index: I just need to replace the TF/IDF score with my computed score for each term in the document. I already compute the term_score of each term in the doc before indexing time, so I want to store these results in ES in place of the TF term.
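To make the idea concrete, here is a minimal sketch in plain Python (not ES code) of an inverted index whose postings carry a precomputed term_score instead of a term frequency. The names `build_index` and `search` are hypothetical. Summing the term scores over the query terms is an assumption made here by analogy with how BM25 combines per-term scores; the scoring function above only defines the per-term part.

```python
from collections import defaultdict


def build_index(doc_term_scores):
    """Build an inverted index: term -> list of (doc_id, precomputed term_score)."""
    index = defaultdict(list)
    for doc_id, term_scores in doc_term_scores.items():
        for term, score in term_scores.items():
            index[term].append((doc_id, score))
    return index


def search(index, query_terms):
    """Score each document as the sum of its stored term_scores for the query terms.

    No IDF and no term-frequency counting: the stored score is used as-is.
    The sum over query terms is an assumption, not something ES prescribes.
    """
    totals = defaultdict(float)
    for term in query_terms:
        for doc_id, score in index.get(term, []):
            totals[doc_id] += score
    # Highest-scoring documents first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: in practice these scores would come from model(D, t).
docs = {
    "d1": {"fox": 0.5, "dog": 0.25},
    "d2": {"fox": 0.125, "cat": 0.75},
}
index = build_index(docs)
print(search(index, ["fox", "dog"]))
```

The point of the sketch is that lookup cost depends only on the posting lists of the query terms, which is the behaviour I would like to keep in ES.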
Right now I am using rank_feature as a workaround to achieve this. The speed was okay for a small index, but once the index grows beyond 50 million, it is much slower than a BM25 search on the text. Can you please give instructions on how to modify the term-to-doc scores directly? Thanks!
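For reference, a rank_feature workaround of this kind might look roughly like the sketch below. It only builds the request bodies as Python dicts and assumes a `rank_features` field with one feature per term, queried with one `rank_feature` clause per query term using the `linear` function so the stored value is used unchanged. The field name `term_scores` and the helper names are hypothetical, and this is not necessarily the exact setup described above.

```python
def build_mapping(field="term_scores"):
    """Index mapping with a rank_features field holding one score per term."""
    return {"mappings": {"properties": {field: {"type": "rank_features"}}}}


def build_document(term_scores, field="term_scores"):
    """Document body: term -> precomputed score.

    rank_features values must be strictly positive, so other values are dropped.
    """
    return {field: {t: s for t, s in term_scores.items() if s > 0}}


def build_query(query_terms, field="term_scores"):
    """One rank_feature clause per query term; bool/should sums the clause scores."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"rank_feature": {"field": f"{field}.{term}", "linear": {}}}
                    for term in query_terms
                ]
            }
        }
    }


print(build_mapping())
print(build_document({"fox": 0.5, "dog": 0.25}))
print(build_query(["fox", "dog"]))
```

These bodies would be sent to the index-creation, document-indexing, and search endpoints respectively. Terms containing a dot would probably need escaping or renaming, since the feature is addressed as `field.term`.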