Hi,
One of the features I most wanted to get into elasticsearch is now
on master. ElasticSearch was built from the ground up to work well in cloud
environments (and of course, outside the cloud as well), and now this vision
is fully realized.
The first piece of support is the ability to use the cloud as gateway
storage. This means that Amazon S3, Rackspace CloudFiles, or Azure Blob can
be used to store both the cluster metadata and each index's information
(index files and transaction logs). This support really makes sense and, in
the end, saves you money ;). The savings come from the fact that the index is
stored locally (no need for EBS) and automatically mirrored to S3 / CloudFiles.
The second piece of support, and one of the reasons for the new Zen discovery
module (which is the default in master and future elasticsearch releases),
is the ability to use cloud information for auto discovery of nodes. In
most cloud environments, multicast is disabled. This means that a gossip
router needs to be defined with an elastic IP, and just for HA, more nodes
need to be defined. With the elasticsearch cloud discovery, all nodes are
created equal, no need for elastic IPs or special nodes!
Here is a simple configuration for Amazon (works on Rackspace as well...)
that stores the data on S3 and uses EC2 to discover nodes:
cloud:
    account:
    key:
    type: aws
discovery:
    type: cloud
gateway:
    type: cloud
    cloud:
        container: YourContainerNameHere
Just replace "aws" with "rackspace" to work with Rackspace (cross cloud
support is done using the excellent jclouds APIs).
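As a minimal sketch of that swap, the same configuration for Rackspace would presumably look like this. It assumes, per the note above, that only the cloud type changes and every other setting stays as in the Amazon example. The container name is a placeholder, and account and key are left blank for your own credentials.

```yaml
cloud:
    account:            # your cloud account identifier (left blank here)
    key:                # your cloud key (left blank here)
    type: rackspace     # the only value changed from the aws configuration
discovery:
    type: cloud         # discover nodes through the cloud provider
gateway:
    type: cloud         # mirror index data to the provider's blob storage
    cloud:
        container: YourContainerNameHere   # placeholder container name
```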
The feature issues are:
http://github.com/elasticsearch/elasticsearch/issues/closed#issue/163
http://github.com/elasticsearch/elasticsearch/issues/closed#issue/164
Enjoy!
shay.banon