I have a nice, performant cluster of 5 nodes. They're all on separate
machines on the same switch. Life is good.
Now...
Do I tell the consumers of my Elasticsearch cluster to hit any of the five
nodes as suits their fancy? Or do I give them the name of ONE node? If so,
is that node configured any differently?
Or do I put all five behind a virtual IP and load balance them?
I can't find any documentation on best practices here.
In our environment the cluster runs inside EC2/VPC with an ELB in front of it, and we use DNS to assign a CNAME to the ELB for easier internal use. The cluster is currently at 15 nodes. Three of them are “master only, no data” nodes that register themselves with the ELB, and the ELB balances requests to/from those master nodes. The master nodes have slightly less memory but faster CPUs than the rest of the nodes, so they can serve requests quickly. The remaining nodes are “data only” nodes: they are not master eligible, and they just store data and serve it to/from the masters via the ELB.
This has worked really well for us, and also allows us to test rolling upgrades on a node that does not actually contain any data for faster confirmations.
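For anyone wanting to reproduce the split between the two node types, here is a rough sketch of how the roles might be expressed. It assumes the Elasticsearch 1.x-era settings `node.master` and `node.data` in `elasticsearch.yml`; the role labels `master_only` and `data_only` and the helper `render_settings` are made up for illustration and are not taken from our actual config.

```python
# Sketch only: maps the two node roles described above onto the
# Elasticsearch 1.x settings `node.master` and `node.data`.
# The role labels and the helper name are hypothetical.

ROLE_SETTINGS = {
    # Master eligible, holds no shards; these sit behind the ELB.
    "master_only": {"node.master": True, "node.data": False},
    # Holds shards, never eligible to be elected master.
    "data_only": {"node.master": False, "node.data": True},
}


def render_settings(role):
    """Return the elasticsearch.yml lines for the given role label."""
    settings = ROLE_SETTINGS[role]
    # YAML booleans are lowercase, so convert Python's True/False.
    return [f"{key}: {str(value).lower()}" for key, value in sorted(settings.items())]


if __name__ == "__main__":
    for role in ROLE_SETTINGS:
        print(f"# {role}")
        print("\n".join(render_settings(role)))
```

Only the nodes with the `master_only` settings would be registered with the load balancer in a setup like this; check the settings against the documentation for the version you run, since role configuration differs between releases.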
I'm curious why no data. Wouldn't having the data local mean faster lookups?
On Wednesday, December 3, 2014 1:14:10 PM UTC-8, Christian Hedegaard wrote:
In our environment our cluster is inside EC2/VPC. We have an ELB in
front of the cluster. We use DNS to assign a CNAME to the ELB for easier
internal use. The cluster is currently at 15 nodes, 3 of which are “master
only, no data” and associate themselves with the ELB. The ELB balances
requests to/from the master nodes. The master nodes are slightly smaller in
memory, but faster in CPU than the rest of the nodes so they can quickly
serve requests. The rest of the nodes are “data only” nodes. They are not
master eligible and they just store and serve data to/from the masters via
the ELB.