Hi,
Currently, I use 2 servers (2 shards + 1 replica) to query 0.46 billion
documents. The average query response time is 1.5s.
I was hoping I could simply add a new server to the cluster to improve
query performance (still 2 shards + 1 replica).
But after adding a new server to the cluster, I found that query
performance didn't improve much. The average query response time is still
1.5s.
Did I miss anything?
In addition, I'm wondering whether there is any difference between querying
only one server and querying all 3 servers round-robin.
What's your HW setup like? Is this data in RAM or on disk? What does
the CPU/disk usage etc. look like when you run queries?
What does your data look like? Do you have an example?
What does the mapping look like?
What do the queries that you've tried look like?
Have you tried increasing the number of shards? If you have 2 shards
in total with X replicas and 3 machines, each query is still only
distributed across 2 shards. Maybe that's your bottleneck, in which case
increasing replicas/boxes wouldn't help.
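To make the shard arithmetic above concrete, here is a rough sketch in Python. It is only a back-of-the-envelope model, not a measurement: it assumes documents are spread evenly across primary shards and that each query searches one copy of every shard. The function name and the 6-shard scenario are made up for illustration.

```python
def per_query_fanout(total_docs, primary_shards, replicas_per_shard, nodes):
    """Rough sizing arithmetic for one index (illustrative model only).

    Assumes documents are spread evenly across primary shards and that a
    query searches exactly one copy (primary or replica) of each shard.
    """
    shard_copies = primary_shards * (1 + replicas_per_shard)
    return {
        "nodes": nodes,
        "shard_copies": shard_copies,
        # Extra replicas or nodes do not change how many shards one query touches.
        "shards_searched_per_query": primary_shards,
        # Work done per shard for a single query scales with this number.
        "docs_per_shard": total_docs // primary_shards,
    }


TOTAL_DOCS = 460_000_000  # 0.46 billion, from the original post

before = per_query_fanout(TOTAL_DOCS, primary_shards=2, replicas_per_shard=1, nodes=2)
after = per_query_fanout(TOTAL_DOCS, primary_shards=2, replicas_per_shard=1, nodes=3)
# Hypothetical resharded layout, chosen only for comparison.
resharded = per_query_fanout(TOTAL_DOCS, primary_shards=6, replicas_per_shard=1, nodes=3)

for label, result in (("before", before), ("after", after), ("resharded", resharded)):
    print(label, result)
```

In this model the third server changes nothing for a single query: it still searches 2 shards of about 230 million documents each. An extra node can help throughput under concurrent load, but per-query latency would only drop if each shard got smaller.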
It doesn't help to add a new server if no shards/replicas are migrated to
it. Check manually via the API or look at SPM.
Also, by adding more servers you are adding more CPUs, so resharding may
make sense.
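For the "check manually" part, a small sketch of what to look for once you have the shard routing information in hand (for example from the cluster state API). The routing entries and node names below are invented sample data in a simplified shape, not real API output, so adapt the field names to whatever your cluster returns.

```python
from collections import Counter


def shards_per_node(routing, nodes):
    """Count started shard copies on each node, including nodes holding none."""
    counts = Counter({node: 0 for node in nodes})
    for entry in routing:
        if entry["state"] == "STARTED":
            counts[entry["node"]] += 1
    return dict(counts)


def idle_nodes(routing, nodes):
    """Return the nodes that hold no started shard copies."""
    counts = shards_per_node(routing, nodes)
    return sorted(node for node, count in counts.items() if count == 0)


# Hypothetical layout: 2 shards + 1 replica each, nothing moved to the new node.
nodes = ["node1", "node2", "node3"]
routing = [
    {"shard": 0, "primary": True, "state": "STARTED", "node": "node1"},
    {"shard": 0, "primary": False, "state": "STARTED", "node": "node2"},
    {"shard": 1, "primary": True, "state": "STARTED", "node": "node2"},
    {"shard": 1, "primary": False, "state": "STARTED", "node": "node1"},
]

print(shards_per_node(routing, nodes))
print(idle_nodes(routing, nodes))
```

If the new server shows up in the idle list, it is not serving any shard data and cannot be expected to speed up searches.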
On Friday, June 1, 2012 4:01:33 AM UTC-4, jackiedong wrote:
[quoted text identical to the original message above]