I have a log of API access events, each of the form [ timestamp, id, other_information ]. We would like to extract some information from it.
We know that we can learn something by looking at the number of requests made by a single id during a specified period of time. The problem is that a top-X query returns thousands of ids with normal access patterns, and an inverted top-X doesn't help either (thousands of ids with a single connection attempt). The id cardinality is in the tens of millions, so a full histogram can't be built.
On the other hand, we know that if we could specify N and M in a query like "show me 1000 ids that appear between N and M times in the period of observation", we would have the information we need.
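To make the intended query concrete, here is a minimal Python sketch of one possible way to approximate it without a full histogram. It uses a Count-Min Sketch, which is my own choice of technique and not something established above, and it assumes the log can be read twice. The names `CountMinSketch` and `ids_in_range` and the `width`/`depth` defaults are hypothetical. A Count-Min Sketch can only overestimate a count, so some returned ids may really appear fewer than N times, and some ids whose true count is in range may be missed.

```python
import hashlib


class CountMinSketch:
    """Fixed-size approximate counter; estimates are never below the true count."""

    def __init__(self, width=2**16, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # One independent hash per row, obtained by salting BLAKE2b with the row number.
        data = str(key).encode("utf-8")
        for row in range(self.depth):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, key):
        for row, col in self._cells(key):
            self.rows[row][col] += 1

    def estimate(self, key):
        return min(self.rows[row][col] for row, col in self._cells(key))


def ids_in_range(events, n, m, limit=1000, width=2**16, depth=4):
    """Return up to `limit` ids whose estimated count lies in [n, m].

    `events` must be iterable twice and yield (timestamp, id, ...) records.
    """
    sketch = CountMinSketch(width, depth)
    # Pass 1: count every id in fixed memory.
    for _ts, event_id, *_rest in events:
        sketch.add(event_id)

    # Pass 2: collect distinct ids whose estimate falls in the requested range.
    found = []
    seen = set()  # only holds matching ids, so it is bounded by `limit`
    for _ts, event_id, *_rest in events:
        if event_id in seen:
            continue
        if n <= sketch.estimate(event_id) <= m:
            seen.add(event_id)
            found.append(event_id)
            if len(found) >= limit:
                break
    return found
```

Memory use is fixed at `width * depth` counters regardless of how many distinct ids appear; a larger `width` reduces collisions and therefore the overestimation error.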