Elasticsearch Cluster Inside MapR Cluster

Hi Elasticsearch Team,

We are building a document management system using FSCrawler, Elasticsearch and MapR-FS (the MapR file system).
We chose a big data platform because the use cases can later be extended to analytics.

Is it possible to set up an Elasticsearch cluster inside a 5-node MapR cluster?
Elasticsearch also uses Zookeeper. Will this work inside a MapR cluster?

If not, can we run a Dockerized Elasticsearch on each MapR node and form a cluster out of these Dockerized ES instances?

Elasticsearch does not use Zookeeper, so that is not a problem. Running Elasticsearch on a distributed file system like MapR-FS, HDFS or NFS is, however, not recommended or supported, as it can lead to very poor performance, index corruption and instability. I would therefore recommend running a separate Elasticsearch cluster outside the MapR cluster.

Hi,

Thanks for the response :slight_smile:

It states here that Zookeeper is also part of the architecture when deploying an Elasticsearch cluster. Am I on the right page? Please correct me if I'm wrong.

As for the distributed file system, we are not planning to put the Elasticsearch data files or shards on a distributed file system, though we are going to deploy the ES cluster on the MapR nodes.

A standalone Elasticsearch cluster does not use Zookeeper, and this is most likely what you would deploy.

Our Elastic Cloud Enterprise (ECE) product, which is used to simplify management of multiple clusters, does however use Zookeeper. I wonder if that is where the confusion comes from?

Good.

I find this part really confusing. How many servers make up an Elasticsearch cluster? Or does "cluster" mean a cluster of indices? From a big data perspective, a cluster is a collection of nodes with different roles. I had assumed that I could also deploy a clustered Elasticsearch to get more power and speed, but please enlighten me.

A single Elasticsearch cluster can contain many nodes spread across a number of servers, and these nodes can have different roles if needed, even though the default is that all nodes have all roles. The cluster collectively manages indices and data just as you expect.
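
To make the node and role layout concrete, here is a minimal sketch that renders an `elasticsearch.yml` for each node of a small cluster. It assumes Elasticsearch 7.9 or later (where the `node.roles` setting exists). The cluster name, the `mapr-node*` host names, the local data path and the helper function itself are hypothetical, purely for illustration. Note that `path.data` points at a local disk, not at a MapR-FS mount.

```python
def render_node_config(cluster_name, node_name, seed_hosts, master_nodes,
                       data_path, roles=None):
    """Render a minimal elasticsearch.yml for one node as text.

    Hypothetical helper; only the setting names are real Elasticsearch
    settings (7.9+). All values passed in are illustrative assumptions.
    """
    lines = [
        f"cluster.name: {cluster_name}",   # must be identical on every node
        f"node.name: {node_name}",
        f"path.data: {data_path}",         # local disk, not MapR-FS/HDFS/NFS
        f"network.host: {node_name}",
        # Other nodes to contact when joining the cluster.
        "discovery.seed_hosts: [" + ", ".join(seed_hosts) + "]",
        # Only used the very first time the cluster is bootstrapped.
        "cluster.initial_master_nodes: [" + ", ".join(master_nodes) + "]",
    ]
    if roles is not None:
        # When node.roles is omitted, the node takes on all roles.
        lines.append("node.roles: [" + ", ".join(roles) + "]")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    hosts = ["mapr-node1", "mapr-node2", "mapr-node3"]  # assumed host names
    for host in hosts:
        print(f"# --- {host} ---")
        print(render_node_config("dms-search", host, hosts, hosts,
                                 "/var/lib/elasticsearch"))
```

Leaving `roles` unset keeps the default, where every node has all roles. Passing something like `roles=["data", "ingest"]` would give a node that holds data but is not master-eligible.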

And it doesn't use Zookeeper? Only ECE does?

Yes, Elasticsearch does not use Zookeeper. Only ECE does.
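
Elasticsearch handles master election itself, among its own master-eligible nodes, and a majority of them has to agree. In versions before 7.0 that majority had to be set by hand through `discovery.zen.minimum_master_nodes`; from 7.0 onwards the cluster manages it automatically. A minimal sketch of the arithmetic, assuming every node in a 5-node cluster is master-eligible (the function names are illustrative and not part of any Elasticsearch API):

```python
def master_quorum(master_eligible_nodes: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible_nodes - master_quorum(master_eligible_nodes)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} master-eligible: quorum={master_quorum(n)}, "
              f"can lose {tolerated_failures(n)}")
```

With five master-eligible nodes the quorum is three, so the cluster can lose two of them and still elect a master. With only two, losing either one blocks elections, which is why an odd number of at least three is the usual choice.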

I see. I can now safely assume that I can deploy Elasticsearch on the MapR cluster, running it on every MapR node or perhaps only on a few of them. I really thought it used a coordinator like Zookeeper to monitor everything and establish a quorum.

We are going to develop a search web application that lets you search any document inside HDFS (MapR-FS in MapR's case), using Elasticsearch as the search solution. We were confused about how to deploy it (we are new to Elastic), since we saw that it also has a cluster option. Our idea was to deploy it on every MapR node, hopefully making it faster and keeping replicas of all the data.

I understand now. Sorry for the confusing question. Thank you so much!
