The --include-tasks argument means you'll only ever execute tasks of type index. Without seeing your esrally invocation or the results summary, it's difficult to reason about what might be happening.
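For reference, a filtered run restricted to indexing tasks might be assembled roughly as in the sketch below. It is illustrative only, not your actual invocation: the helper name build_rally_command is hypothetical, the filter value type:index is an assumption based on the description above, and the exact subcommand and flags can differ between Rally versions.

```python
import shlex


def build_rally_command(track, task_filter, target_hosts=None):
    """Assemble an esrally command line limited to the given task filter.

    Hypothetical helper for illustration; it only builds the argument
    list and does not run Rally.
    """
    cmd = [
        "esrally",
        "race",
        f"--track={track}",
        f"--include-tasks={task_filter}",
    ]
    if target_hosts:
        # Assumption: benchmarking a cluster that is already running,
        # so Rally should not provision Elasticsearch itself.
        cmd += ["--pipeline=benchmark-only", f"--target-hosts={target_hosts}"]
    return cmd


# Only tasks whose operation type is "index" would be executed.
command = build_rally_command("nyc_taxis", "type:index")
print(shlex.join(command))
```

Posting the exact command line you ran, together with the results summary, makes it much easier to see which tasks were actually included.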
That said, this seems a little like an XY problem. Unless your production dataset resembles the NYC Taxis dataset (mappings, fields, etc.), the indexing throughput numbers are likely to be unrealistic, and in some cases completely invalid.
If you're trying to ascertain which instance types provide the best cost performance for your cluster, then it's imperative that you spend the time to model something akin to your production workload to ensure that any benchmarks are at least somewhat representative of what your cluster may need to handle once in production.
I have seen similar differences on a bare-metal platform while running the test.
I am not looking for a production dataset that matches the nyc_taxis track.
I am just comparing performance on different platforms with nyc_taxis.
The whole execution takes a long time and I was only interested in indexing performance, so I opted for the --include-tasks flag, but I see a difference in the throughput.