Evaluating your search system's performance

Hi @Andrew_O_Brien,

I found this post from over four years ago while searching for topics related to ranking evaluation. We recently introduced a (still experimental) ranking evaluation API in Elasticsearch, and I'm looking for other metrics we might want to support. Did you end up implementing bpref? Did it work, and was it useful? What challenges did you run into? Was it necessary to scroll through all hits for a query until you had found all the "relevant" documents, or is it possible to use this measure on just a window of the top N hits?
Please ignore me if this topic is no longer of interest to you; otherwise, I'd be interested in your findings.
