I'm trying to manually calculate the JLH score for a significant_terms aggregation, but I don't get the same score as the one in the response.
Steps:
query:
{
  "query": {
    "terms": {
      "${myFieldName}": [
        "hurricane"
      ]
    }
  },
  "aggregations": {
    "significant_storm_types": {
      "significant_terms": {
        "field": "${myFieldName}",
        "size": 100,
        "jlh": {}
      }
    }
  }
}
response:
...
"hits": {
  "total": 106,
...
"aggregations": {
  "significant_storm_types": {
    "doc_count": 106,
    "buckets": [
      {
        "key": "hurricane",
        "doc_count": 105,
        "score": 1407.7837073557366,
        "bg_count": 106
      }
The total number of documents in this type:
"hits": {
"total": 50956,
Now we have everything for the calculation (from the documentation):
jlh score = (foregroundPercent - backgroundPercent) * (foregroundPercent / backgroundPercent)
Putting the numbers inside:
jlh score = (105/106 - 106/50956) * ((105/106) / (106/50956)) = 470.699067
Which does not equal the score of the term in the bucket (1407.7837073557366).
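To rule out an arithmetic slip on my side, here is the same calculation as a small Python sketch. The function name jlh_score is just mine for illustration, and the code assumes the background set is the 50956 documents of this type, which may not be what the aggregation actually uses:

```python
def jlh_score(fg_count, fg_total, bg_count, bg_total):
    """JLH score as given in the formula above (hypothetical helper name)."""
    # Share of foreground (query-matching) documents containing the term.
    foreground_percent = fg_count / fg_total
    # Share of background documents containing the term.
    # Assumption: the background set is the 50956 documents of this type.
    background_percent = bg_count / bg_total
    return (foreground_percent - background_percent) * (
        foreground_percent / background_percent
    )


# Numbers taken from the response above.
score = jlh_score(fg_count=105, fg_total=106, bg_count=106, bg_total=50956)
print(score)
```

This gives the same value as my manual calculation, so the difference does not come from the arithmetic itself.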
Where might the issue be?
Also, in the response, the total hits (106) is different from the doc_count of the term's bucket (105). This surprised me, since I'm working with only one shard.
Thanks