Using Painless scripting to re-implement BM25 scoring

Hello everyone

I have indexed a few hundred million documents into Elasticsearch using the default values of b and k1. Reindexing the data is prohibitively expensive for me, but I would like to tune the BM25 parameters b and k1 for better scoring.
As I understand it, Painless scripting has some functions that can compute or fetch the tf and idf values of a token in a document.
Could you please show how to reproduce BM25 in Painless scripting so that I can tune the b and k1 parameters?

Thank you all in advance
Dimitris

In Elasticsearch, the module that implements textual scoring is called similarity. There is no need to write a Painless script, since b and k1 can be customized in the similarity settings. However, if you are intent on using Painless, you can write a scripted similarity.

Unfortunately this will not work.
As I mentioned, I do not want to reindex.
I have hundreds of millions of documents.
I just need to optimize the values of b and k1.
I just want to replicate BM25 with Painless scripting so that I can change the values of b and k1 as needed.

Thank you again

The similarity can be changed on an existing index; no reindexing is necessary. What did you see that implies reindexing would be needed?

I thought that "PUT /index ..." creates an index.
Don't I need to reindex the data once I change the mapping/settings?

While some settings cannot be changed (e.g. number_of_shards), many can be, including the configured similarity. Use the update index settings API.
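
As a rough sketch of what that looks like in practice, the Python snippet below only builds and prints the sequence of REST calls; it does not contact a cluster. It assumes your fields use the index's default similarity, and it follows the documented procedure of closing the index before changing similarity settings and reopening it afterwards (the index is unavailable while closed). The index name "my-index" and the values k1=1.6 and b=0.6 are made-up placeholders, not recommendations.

```python
import json


def similarity_update_requests(index, k1, b):
    """Return the (method, path, body) steps for changing the default BM25 parameters.

    Assumption: the fields in `index` use the default similarity, so
    overriding `index.similarity.default` is enough.
    """
    settings = {
        "index": {
            "similarity": {
                "default": {"type": "BM25", "k1": k1, "b": b}
            }
        }
    }
    return [
        ("POST", f"/{index}/_close", None),        # similarity settings need a closed index
        ("PUT", f"/{index}/_settings", settings),  # update index settings API
        ("POST", f"/{index}/_open", None),         # reopen so the index is searchable again
    ]


# "my-index", 1.6 and 0.6 are hypothetical placeholders.
for method, path, body in similarity_update_requests("my-index", k1=1.6, b=0.6):
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```

You can send the printed requests with whatever client you normally use. Try it on a small test index first to confirm that scores change as expected.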

Don't I need to reindex the data once I change the mapping/settings?

Since the similarity settings are not baked into the index, changing these parameters does not require reindexing.
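
To illustrate why, here is a minimal Python sketch of the textbook BM25 formula for a single term. The index only has to supply the statistics (term frequency, document length, average length, document counts), while k1 and b are ordinary arguments applied when the score is computed. This is an illustration under that assumption, not Lucene's actual code, which may differ in details such as constant factors and how field lengths are encoded. All the statistics in the example are made up.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, doc_count, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 score of one term in one document.

    freq, doc_len, avg_doc_len, doc_count and doc_freq come from index
    statistics; k1 and b are only used here, at scoring time.
    """
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    length_norm = 1 - b + b * doc_len / avg_doc_len
    return idf * freq * (k1 + 1) / (freq + k1 * length_norm)


# Made-up statistics for one term in one document.
stats = dict(freq=3, doc_len=1000, avg_doc_len=200, doc_count=1000, doc_freq=50)

# Same "indexed" statistics, different parameters, different scores.
for k1, b in [(1.2, 0.75), (1.6, 0.6), (1.2, 0.0)]:
    print(f"k1={k1} b={b} score={bm25_term_score(**stats, k1=k1, b=b):.4f}")
```

With b=0 the document length drops out of the score entirely, and with k1=0 the term frequency does, which is a quick way to sanity-check a new setting.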