I have multiple questions about optimizing a cluster with:
running on c5.xlarge instances (4 vCPU, 8 GB RAM)
a very high read req/s and a low write req/s
a small index, 20 GB to 30 GB
shard sizes I am experimenting with, between 1 GB and 10 GB
heavy use of auto scaling (increasing the number of instances and the number of replicas); during the day we can go up to 15 replicas or more, and at night down to 1
For example:
ur idx sh p st n ud sc sqc dc sto qce
es_x 0 p STARTED data-0 46 0 3539798 9.9gb 0
es_x 1 p STARTED data-1 42 0 3538036 9.8gb 0
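For context, the table above is _cat/shards output with explicit columns. The exact request is not shown here, but judging by the column abbreviations it was roughly this (host and column list are my assumption):

curl -s 'http://localhost:9200/_cat/shards?v&h=ur,idx,sh,p,st,n,ud,sc,sqc,dc,sto,qce'
# ur/ud = unassigned.reason/details (empty for STARTED shards), sc = segments.count,
# sqc = search.query_current, dc = docs, sto = store, qce = query_cache.evictions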
For now the number of shards does not change much (from 2 to 6), but the big difference in performance comes from the number of segments:
after a force merge, performance can be 10x to 20x better than before. I know I cannot force merge all the time because it is an expensive operation, and because this is not a read-only index (even if we write only a little data during the day).
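To be concrete, this is the kind of force merge call I mean, a minimal sketch (index name taken from the output above; merging down to a single segment is just the example I tested):

# merge the index down to a single segment (expensive; ties up merge threads while it runs)
curl -s -X POST 'http://localhost:9200/es_x/_forcemerge?max_num_segments=1'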
My questions are:
Why does the req/s increase a lot (at least 30%) when I restart ES after the initial indexing, even with only 1 segment?
Is it reasonable to force merge every hour if there are not many writes to the index? If not, what other settings can be tuned to optimize the segment count?
What other settings can I look at to optimize search performance? ES_JAVA_OPTS does not seem to change much.
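To be clear about what I mean by ES_JAVA_OPTS, I am only passing heap flags at startup, roughly like this (values illustrative):

ES_JAVA_OPTS="-Xms2048m -Xmx2048m" ./bin/elasticsearch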
If you have a lot of searches per second, try using a single primary shard. 30 GB is not that big, and as long as query latency is acceptable it might support higher query throughput.
C5 instances have a lot of CPU per unit of RAM. Are you sure you are limited by CPU and not disk I/O? Might a different instance type with more RAM per CPU core reduce disk I/O and give better performance? Have you made your heap as small as the use case allows so you have more space for the OS page cache?
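If you go the single-primary-shard route, a minimal sketch of the index settings would be something like this (index name and replica count are placeholders; replicas can still be changed at runtime via the index settings API as your auto scaler adds or removes nodes):

curl -s -X PUT 'http://localhost:9200/es_x' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  }
}'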
From my benchmarks, c5.xlarge and m5.xlarge give the same performance. However, it is odd, because I only see around 5 GB of file system cache when the shard is around 9.9 GB.
For 2 nodes, with 1 shard each:
ur idx sh p st n ud sc sqc dc sto qce
es 0 p STARTED es-0 1 0 3542189 9.9gb 0
es 1 p STARTED es-1 1 0 3540384 9.9gb 0
If I go to the node:
free -h
total used free shared buff/cache available
Mem: 15G 2.8G 8.3G 572K 4.3G 12G
Swap: 0B 0B 0B
I set -Xmx2048m -Xms2048m to use as little of the system RAM as possible.
If the shard is around 9.9 GB, shouldn't I expect buff/cache to use about the same amount?
Also, there is still a huge difference between 1 segment and 4 segments (more than 10x faster).
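For reference, this is how I check the per-shard segment breakdown before and after merging (index name as above; just how I observe it, not a fix):

curl -s 'http://localhost:9200/_cat/segments/es_x?v'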