Yes, the definition of subset and superset is clear.
My problem is that the value of _superset_size in a script is greater than the total number of documents in my index. I think there is a bug somewhere.
I run a nested significant terms aggregation inside a simple terms aggregation, similar to the example described here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-significantterms-aggregation.html
I need a custom score, so I tried to use "script_heuristic". The results looked strange to me, so I changed my script to output the value of each variable in turn (_superset_size, _superset_freq, _subset_freq and _subset_size) with:
"script_heuristic": {
"script": "_superset_size"
}
And what a surprise: the value of _superset_size is greater than the total number of documents in my index...
In addition, the value of bg_count is greater than the total count for each term, as described here: Bg_counts in nested significant_terms aggregation
There is a bug, I guess.
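
For anyone who wants to reproduce this, here is a minimal sketch in Python of the kind of request body I mean. It is only an illustration under assumptions: the field names `category` and `tags`, the aggregation names `by_group` and `significant`, and the helper `build_debug_query` are hypothetical, and the plain-string script form is the one from my snippet above (other Elasticsearch versions may expect a different script syntax). The script simply returns one variable, so the score of each bucket shows that variable's value.

```python
import json

# The four variables the script heuristic exposes, as listed in the post.
ALLOWED_VARIABLES = {
    "_subset_freq",
    "_subset_size",
    "_superset_freq",
    "_superset_size",
}


def build_debug_query(group_field, text_field, variable):
    """Build a terms aggregation with a nested significant_terms
    aggregation whose score is just the value of one script variable.

    group_field and text_field are hypothetical field names.
    """
    if variable not in ALLOWED_VARIABLES:
        raise ValueError("unknown script variable: " + variable)
    return {
        "size": 0,  # only the aggregation results are of interest
        "aggs": {
            "by_group": {
                "terms": {"field": group_field},
                "aggs": {
                    "significant": {
                        "significant_terms": {
                            "field": text_field,
                            # Same plain-string script form as in the post.
                            "script_heuristic": {"script": variable},
                        }
                    }
                },
            }
        },
    }


body = build_debug_query("category", "tags", "_superset_size")
print(json.dumps(body, indent=2))
```

Sending this body to the search endpoint once per variable and comparing the scores with the total document count of the index is how I noticed the mismatch.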