The value of each term (e.g. the IP address 126.96.36.199) is hashed, and we take that hash modulo the number of partitions you picked (e.g. 20). This spreads all terms evenly across the 20 partitions. We use the same technique to route documents evenly to shards based on their doc IDs.
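The assignment scheme can be sketched in a few lines of Python. This is only an illustration of the hash-then-modulo idea; the actual hash function Elasticsearch uses internally may differ, and `partition_for` is a hypothetical helper, not a real API:

```python
import hashlib

NUM_PARTITIONS = 20  # the partition count you picked

def partition_for(term: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hash a term and take the modulo to pick its partition.

    Because the hash is deterministic, the same term always lands
    in the same partition, and terms spread evenly across partitions.
    """
    digest = hashlib.md5(term.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions
```

The key property is determinism: every shard computes the same partition for a given term, so a request for "partition 7" selects the same subset of terms everywhere.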
For each aggregation request you make, you pick one of your partitions (e.g. partition 7 of 20), so every shard performs its analysis on the same subset of terms.
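Such a request can be sketched as a terms aggregation using the standard `include.partition` / `num_partitions` options. The index field name `ipaddress` and the agg name are assumptions for illustration:

```python
# Sketch of a terms-aggregation request body that analyses only
# partition 7 of 20. The field name "ipaddress" is assumed.
request_body = {
    "size": 0,  # we only want the aggregation, not hits
    "aggs": {
        "terms_in_partition": {
            "terms": {
                "field": "ipaddress",
                "include": {
                    "partition": 7,        # which partition this request covers
                    "num_partitions": 20,  # total number of partitions
                },
                "size": 1000,
            }
        }
    },
}
```

Repeating the request with `partition` set to 0 through 19 walks the full term set, one subset at a time.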
The terms within one request (e.g. partition 1 of 20) will be ordered by whatever criteria you pick.
There is no guarantee of order across partitions because partitions by design are a randomised division of the data.
You can't. "Some ordering" could be expanded to include something as tricky as userIds sorted by their last logged access date. Given very large numbers of user IDs and a distributed index, it is impossible to compute this total ordering without resorting to map/reduce-style streaming of masses of data across the network, creating temporary data files, and so on.
Using term partitioning, however, you can reduce the amount of data streamed, but you have to live with in-partition ordering only. That is adequate for the scenario of finding userIDs that have expired, but it will not solve every problem. For complex scenarios like "last access dates", another option is to opt for entity-centric indexes built around the user.
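For the expired-userIDs case, the usual pattern is to loop over every partition and process each subset independently, since no cross-partition ordering is needed. A minimal sketch, assuming a `userid` field and a hypothetical helper that builds one request body per partition:

```python
def partition_request_bodies(field: str, num_partitions: int):
    """Yield one terms-aggregation body per partition.

    Issuing all of them covers every term exactly once; within each
    response the terms are ordered by whatever criteria you picked.
    (Hypothetical helper for illustration, not a real client API.)
    """
    for partition in range(num_partitions):
        yield {
            "size": 0,
            "aggs": {
                "user_terms": {
                    "terms": {
                        "field": field,
                        "include": {
                            "partition": partition,
                            "num_partitions": num_partitions,
                        },
                    }
                }
            },
        }

bodies = list(partition_request_bodies("userid", 20))
```

Each response is small enough to inspect client-side (e.g. to flag expired users), which is exactly the trade-off described above: bounded per-request data in exchange for giving up a global ordering.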