I suggest using the shard_size parameter in your terms aggregation to ensure that it pulls back enough terms to get an accurate enough result for your use case. The correct number is really based on the cardinality of your field, and the way it is sharded, but it seems you have found a good number to start with.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.