Evaluating your search system's performance

What do you generally do to evaluate your search system's performance? Do
you use a metrics-based approach where you can compare how changes to
scoring, analysis, or similarities affect hits in a quantitative way? Or
something more manual?

Going through Intro to Information Retrieval (http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-ranked-retrieval-results-1.html),
I see they pay a lot of attention to this, and I can see the advantage of
having that kind of feedback loop, but I haven't heard of many cases of
this being used in practice.

For my own system, I've been looking to implement bpref (see Chapter 3.1 of
the TREC measures appendix, PDF: http://trec.nist.gov/pubs/trec16/appendices/measures.pdf),
since I have fairly incomplete knowledge of which documents are relevant or
irrelevant for my queries. It would also be helpful to be able to run a query,
pass some expected documents as parameters, and just get back the ranks of
those (I suppose I could implement this with a scan, but it would be nice
to avoid the traffic). Any similar experiences?
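
For concreteness, this is roughly how I picture computing bpref client-side from a ranked hit list plus partial judgments. It's only a sketch of the published definition, not something I have running yet; the function, the judgment sets, and the example ids are all illustrative and not part of any Elasticsearch API:

```python
# Sketch: bpref (Buckley & Voorhees 2004) over a ranked list with partial judgments.
def bpref(ranked_doc_ids, relevant, nonrelevant):
    """ranked_doc_ids: doc ids in rank order (best first).
    relevant / nonrelevant: sets of judged doc ids; unjudged docs are ignored."""
    R, N = len(relevant), len(nonrelevant)
    if R == 0:
        return 0.0
    denom = min(R, N)
    nonrel_above = 0   # judged non-relevant docs seen so far in the ranking
    score = 0.0
    for doc_id in ranked_doc_ids:
        if doc_id in nonrelevant:
            nonrel_above += 1
        elif doc_id in relevant:
            # each relevant doc is penalised by the judged non-relevant docs ranked above it,
            # counting at most R of them
            score += 1.0 if denom == 0 else 1.0 - min(nonrel_above, R) / denom
    return score / R

# e.g. bpref(["d3", "d7", "d1"], relevant={"d1", "d9"}, nonrelevant={"d3"})
```

The ranked_doc_ids list would just be the _id values of the hits for the query being evaluated, fetched with a large enough size or by scrolling.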


Maybe MLR (machine-learned ranking, i.e. learning to rank) is of some interest, if you
do not know much about your document relevancy.

I use Okapi BM25. For library catalogs, I have "document zones" like
subject headings, title, author, identifiers, and other supplemental texts
like abstracts. Fortunately, all searches are on very short fields. I am
surrounded by librarians who are very skeptical that Elasticsearch can find
"all the documents" they are looking for; they know what "relevancy" is. In
the future I want to extend the catalog with linked open data, a real
challenge for relevancy.

So BM25F ("BM25 / BM25F Scoring", elastic/elasticsearch issue #2388 on GitHub) would
be nice to have, to tune document-zone features and take the linkages into
account. For now, I use a bit of field boosting and document boosting, along the lines of the sketch below.
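
To give a rough idea, boosting the zones differently can look like this; the index fields, the query text, and the boost values here are made-up examples rather than my real catalog mapping:

```python
# Illustrative only: a multi_match query body with per-"zone" field boosts,
# to be sent to the Elasticsearch search API (field names and boosts are examples).
query_body = {
    "query": {
        "multi_match": {
            "query": "linked open data",
            "fields": [
                "title^3",             # title zone boosted highest
                "subject_headings^2",  # subject zone
                "author^2",
                "identifiers",
                "abstract",            # supplemental text, no boost
            ],
        }
    }
}
```

Document-level boosts then come on top of this, for example via a function_score query.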

Jörg



Hi @Andrew_O_Brien,

I just found this post from over four years ago while searching for topics related to ranking evaluation, since we recently introduced a (still experimental) ranking evaluation API in Elasticsearch and I'm looking for other metrics we might want to support. Did you end up implementing bpref? Did it work, and was it useful? What challenges did you run into, and was it necessary to scroll through all hits for a query until you had found all the "relevant" documents, or is it possible to use this measure only on a window of the top N hits?
Please ignore me if this topic is no longer of interest to you; otherwise I'd be interested in your findings.
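
For anyone who lands on this thread later, here is a rough sketch of what a request to the experimental ranking evaluation API can look like; the index name, query, document ids, and ratings below are invented purely for illustration:

```python
# Illustrative request body for the experimental _rank_eval endpoint
# (POST /<index>/_rank_eval); the ids, ratings, and query are made up.
rank_eval_body = {
    "requests": [
        {
            "id": "catalog_query_1",
            "request": {"query": {"match": {"title": "linked open data"}}},
            "ratings": [
                {"_index": "catalog", "_id": "doc_1", "rating": 1},
                {"_index": "catalog", "_id": "doc_7", "rating": 0},
            ],
        }
    ],
    # precision@k over the top 10 hits; other metrics can be plugged in here
    "metric": {"precision": {"k": 10, "relevant_rating_threshold": 1}},
}
```

The metric is computed only over the top-k hits of each request, which is part of why I'm curious how bpref would fit, since it assumes knowledge of judged non-relevant documents across the ranking.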
