For testing, I have one index with only 1 shard and 4 replicas.
The .percolator type of that index holds 1.5k queries to be percolated.
There are 5 modern machines in total, each with 48G of RAM, with 12G assigned
to Elasticsearch on each node.
What I'm seeing is that performance barely scales out.
With only 1 node, percolate throughput is about 15k/s.
With 5 nodes, it's about 18k/s.
I thought that if I have 1 shard and set the number of replicas to match the
number of machines we have, throughput would scale linearly as the number of
nodes increases.
There are several possibilities to increase performance. First, you can keep
your percolation queries in their own index, so it scales independently
of your data (there are use cases where people do not have increasing
data, but an ever-increasing number of percolators). Second, you can filter
during percolation, so that a document is not executed against every
registered query when you already know that some are not relevant.
See
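
As a rough illustration of the second suggestion, here is a minimal sketch of a percolate request body that carries a filter. It assumes the 1.x-style percolate API, which accepts "doc" and "filter" keys, and it assumes the registered queries were indexed with a metadata field. The field name "category", its value, and the helper name are hypothetical.

```python
import json


def build_percolate_request(doc, tag_field, tag_value):
    """Build a percolate body that only runs queries tagged with tag_value.

    Assumption: each registered percolator query was indexed with an extra
    metadata field (tag_field) next to its "query" element.
    """
    return {
        "doc": doc,
        # Only registered queries matching this filter are executed.
        "filter": {"term": {tag_field: tag_value}},
    }


# Hypothetical document and tag, for illustration only.
body = build_percolate_request({"message": "some text"}, "category", "news")
print(json.dumps(body, indent=2))
```

With 1.5k registered queries, a filter like this only helps if a meaningful share of them can be excluded up front.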
However, if you only have 1 shard, you should be able to scale out anyway.
Can you check your stats? Do you hit all those nodes evenly? Maybe you
created some 'hot node'?
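
To make the "hot node" check concrete, here is a small sketch that flags nodes receiving far more than an even share of percolate requests. It assumes you have already collected a per-node request count somehow (for example from node stats). The node names, the counts, and the 50% tolerance are made up for illustration.

```python
def find_hot_nodes(percolate_counts, tolerance=0.5):
    """Return nodes whose request count exceeds the even share by > tolerance.

    percolate_counts maps node name -> number of percolate requests handled.
    """
    total = sum(percolate_counts.values())
    if total == 0:
        return []
    even_share = total / len(percolate_counts)
    return sorted(
        node
        for node, count in percolate_counts.items()
        if count > even_share * (1 + tolerance)
    )


# Made-up counts: one node handles half of all requests.
counts = {"node1": 9000, "node2": 2200, "node3": 2300, "node4": 2200, "node5": 2300}
print(find_hot_nodes(counts))
```

If one node shows up here, the client is probably sending most requests to a single address instead of spreading them across all five nodes.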