I am running my Elasticsearch cluster on Kubernetes on AWS, using x2iedn (memory optimized) and i4i (storage optimized) instances.
I benchmarked the cluster on both instance types.
I have 3 data nodes and 1 master node.
Each has a heap size of 31 GB.
The container resources are requested as follows:
resources:
  requests:
    cpu: 8
    memory: "64Gi"
I am using the nyc_taxis track to benchmark the cluster.
The indexing throughput compares as follows across instance sizes:
32x instance < 16x instance < 8x instance > 4x instance
When it comes to indexing performance, the amount of memory available is less important than the performance of the storage. The instance type you are using seems better suited to highly concurrent query loads, as all data might be able to fit in the operating system page cache. What type of storage are you using with the instance?
You should also note that the standard tracks are not necessarily set up to load very powerful nodes by default. You may well need to tweak settings and concurrency in order to make Rally generate enough load to saturate the cluster, or even run multiple Rally instances.
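As a rough illustration of tweaking concurrency, here is a minimal Python sketch that builds one Rally command line per client count so the load can be stepped up between runs. It assumes the nyc_taxis track accepts the `bulk_indexing_clients` and `bulk_size` track parameters (check the track's README for the version you have), and the host name `es-data-0:9200` is purely hypothetical.

```python
import shlex


def rally_command(target_hosts, clients, bulk_size=10000):
    """Build an esrally invocation for nyc_taxis against an existing cluster.

    Assumption: the track exposes bulk_indexing_clients and bulk_size
    as track parameters.
    """
    track_params = f"bulk_indexing_clients:{clients},bulk_size:{bulk_size}"
    return [
        "esrally", "race",
        "--track=nyc_taxis",
        "--pipeline=benchmark-only",  # benchmark a cluster Rally did not provision
        f"--target-hosts={target_hosts}",
        f"--track-params={track_params}",
    ]


def client_sweep(target_hosts, client_counts):
    """Return one command per client count, in the order given."""
    return [rally_command(target_hosts, c) for c in client_counts]


if __name__ == "__main__":
    # Hypothetical host; replace with your own data node or service address.
    for cmd in client_sweep("es-data-0:9200", [8, 16, 32, 64]):
        print(shlex.join(cmd))
```

The script only prints the commands, so they can be reviewed and run one at a time. If throughput keeps rising as the client count goes up, the earlier runs were probably limited by the load generator rather than by the cluster.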
I am also curious what you are looking to get out of this exercise. Does the expected workload for the cluster resemble the track you are using at all? I always recommend creating a custom track that represents the data and load you expect in the cluster as realistically as possible, and then using it to get as accurate an estimate as possible.