Scoring and relevance report from Optimizely hosted cluster; packet interception using Wireshark


I work for a website that uses a hosted Elasticsearch instance, and my work focuses on improving the search experience. However, I frequently hit a wall when it comes to scoring and relevance. The report that Optimizely exposes through the apply.personalization API call is not comprehensive enough. I am looking for something much more granular: a way to reproduce how every element of a document was scored for a given search.

The documentation explains that Optimizely does not have a system in place for accessing the explain API, which would give me the information I am looking for. This has led me to explore packet interception with software like Wireshark. Has anyone had success with that approach, and if so, how did you go about retrieving the information? Or is the approach not possible at all in this case?
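To make the goal concrete, here is a minimal sketch of the kind of request body I would want to send if I had direct access to the cluster. It only builds the JSON and sends nothing; the field name `title` and the query text are hypothetical placeholders, and I am assuming a plain `match` query, which may not be what Optimizely actually generates.

```python
import json


def build_explain_search(field: str, query_text: str, size: int = 5) -> dict:
    """Build a _search request body that asks for a per-hit score explanation."""
    return {
        "size": size,
        "explain": True,  # ask Elasticsearch to attach an _explanation to each hit
        "query": {"match": {field: query_text}},  # assumed query type
    }


# Hypothetical field and query text, for illustration only.
body = build_explain_search("title", "red running shoes")
print(json.dumps(body, indent=2))
```

As far as I understand, each hit in the response to such a request would carry an `_explanation` tree breaking the score down term by term, which is the level of detail I am missing.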

My objective is to determine why a field with few exact matches scores so high. Items that have no terms in common with the original query score higher than items that do, and nothing in the mapped documents indicates that an edit distance was calculated. No visible distinctions are present, and manually building an inverted index has not made anything clearer. I want to see what went into building the index: although I know which tokenizers we use, a view of search from so high up is not useful when what breaks a search could be a single space.
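As an illustration of what I mean by a single space breaking a search, here is a toy sketch in plain Python. The two functions are my own simplified stand-ins for a keyword-style and a whitespace-style tokenizer, not Elasticsearch's real analyzers, and the sample strings are made up.

```python
def keyword_terms(text: str) -> list[str]:
    """Toy keyword-style tokenizer: the whole input is one term."""
    return [text]


def whitespace_terms(text: str) -> list[str]:
    """Toy whitespace-style tokenizer: split on runs of whitespace."""
    return text.split()


indexed_value = "running shoes"   # one space
query_value = "running  shoes"    # two spaces

# With a single-term field, the extra space yields a different term: no match.
keyword_match = keyword_terms(indexed_value) == keyword_terms(query_value)

# With whitespace splitting, both inputs yield the same two terms: match.
whitespace_match = whitespace_terms(indexed_value) == whitespace_terms(query_value)

print(keyword_match, whitespace_match)
```

Without seeing the actual terms stored in the index, I cannot tell which of these situations I am in.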

I am new to the Elastic Stack, so the approach I am suggesting may not be the right fit for the problem I have described. Please let me know if I am on the right track.

Let me know what you think!
