Assuming the shard settings were applied correctly, here are a few thoughts.
The workload you've chosen (metricbeat) is quite small and very compressible. What was the duration of your benchmark? It was most likely very short.
For evaluating indexing throughput, you probably want to use something larger that more closely resembles the workload you are actually after. You can use, e.g., the http_logs track for something more logs-oriented, pmc if you are after large documents, or nyc_taxis for a rather large corpus. Most tracks have a README page describing the workload, e.g. rally-tracks/http_logs at master · elastic/rally-tracks · GitHub
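As a minimal sketch, running one of those tracks against your existing cluster on a recent Rally version might look like the following; the target host is an assumption about your setup, and the benchmark-only pipeline just drives load without provisioning Elasticsearch:

```
# Run the http_logs track against an external cluster.
# Adjust --target-hosts to point at your own nodes.
esrally race --track=http_logs \
  --target-hosts=127.0.0.1:9200 \
  --pipeline=benchmark-only
```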
You could also have a bottleneck elsewhere, e.g. on your load driver.