I read previous posts about what "took" measures. Does it include the time to distribute the query and merge the results?
I ran some geo-point search benchmarks on a cluster (3 master nodes, 4 data nodes, 1 client node; 4-core, 16 GB RAM VMs). The cluster with replica shards seems to show better scalability than the cluster without replicas (just allocating the primary shards across 1/2/3/4 nodes) in terms of the average "took" value. Is that reasonable?
If you have replicas, you might distribute your search requests across different machines, so yes, it might scale better. The took time includes the distribution of requests and the merging of results.
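To make the "took" field concrete: it is reported in milliseconds in every search response, alongside the shard counts for that request. A minimal sketch of reading it from a response body (the JSON below is a hypothetical example response, not real benchmark output):

```python
import json

# Hypothetical body of a /my-index/_search response; the field names
# ("took", "_shards", "hits") are the standard Elasticsearch response fields,
# but the values here are made up for illustration.
response_body = json.dumps({
    "took": 37,  # ms spent on the coordinating node: fan-out, per-shard search, merge
    "timed_out": False,
    "_shards": {"total": 4, "successful": 4, "skipped": 0, "failed": 0},
    "hits": {"total": {"value": 1200, "relation": "eq"}, "hits": []},
})

resp = json.loads(response_body)
print(f"took: {resp['took']} ms across {resp['_shards']['total']} shards")
```

Note that "took" is measured on the coordinating node, so it does not include the network time between your client and the cluster; if you benchmark from the client side, wall-clock latency will typically be somewhat higher than "took".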
Thanks for your reply. I checked the default behavior of the "sort by distance" function and the geo-point range query. To my understanding, these queries can hardly benefit from additional data nodes. Am I right?
Could you tell me what kind of metrics or measurements need to be collected in order to see horizontal scalability?
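For scalability measurements, a common approach is to hold the query load fixed, vary the data-node count, and compare latency percentiles (mean alone hides tail behavior) and sustained throughput at each node count. A minimal sketch for summarizing collected "took" samples; the sample values below are placeholders, not real results:

```python
import statistics

def summarize(took_ms):
    """Summarize a list of per-request 'took' values (milliseconds)."""
    took_ms = sorted(took_ms)
    # Index of the 95th-percentile sample (nearest-rank style).
    p95_idx = max(0, int(0.95 * len(took_ms)) - 1)
    return {
        "mean": statistics.fmean(took_ms),
        "median": statistics.median(took_ms),
        "p95": took_ms[p95_idx],
    }

# Placeholder samples -- substitute the 'took' values collected from
# your own benchmark run at each node count (1, 2, 3, 4 data nodes).
samples = [12, 15, 11, 40, 13, 14, 90, 12, 13, 16]
print(summarize(samples))  # note how the mean is pulled up by the two outliers
```

Comparing the full distribution per node count (rather than one average) makes it much easier to tell whether extra nodes are actually helping or just shifting the tail.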