To cluster or not to cluster?

nandrik · May 19, 2019, 9:03pm

Hi,

I have the following setup with two apps running on two servers, both on the same subnet:

app_1 > filebeat_1 > logstash_1 > elasticsearch_1 (es_1) > kibana_1
app_2 > filebeat_2 > logstash_2 > elasticsearch_2 (es_2)

I have little need for replica shards, as each Elasticsearch instance only needs to store its own app's data locally.

However, I'd like to view all indexes from my single Kibana instance and create visualizations with data from both es instances.

Despite heavy reading on the forum, I'm quite unsure as to which of the following might be the best solution for me:

Cluster the two es instances together (making sure I have 0 replica shards in my index settings)
Set up cross-cluster search
Set up a new tribe node and connect Kibana to it

Any recommendations?

gbrown · May 20, 2019, 8:57pm

Replica shards are more useful for resiliency than local querying - if you have two data nodes, and all your data is replicated, if one of your servers dies, you still have a copy of the data accessible.
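For reference, the replica count is an index setting (`index.number_of_replicas`), not part of the mapping. Below is a minimal sketch of the request body that would go to `PUT /<index>/_settings`; the helper name `replica_settings` is made up for illustration, and nothing here talks to a real cluster.

```python
import json


def replica_settings(replicas):
    """Build the body for PUT /<index>/_settings that sets the replica count.

    `replica_settings` is a hypothetical helper, not part of any client library.
    """
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    return {"index": {"number_of_replicas": replicas}}


# 0 replicas: each shard exists only once, so losing a node loses its data.
no_replicas = replica_settings(0)

# 1 replica on a two-node cluster: every shard has a copy on the other node.
one_replica = replica_settings(1)

print(json.dumps(no_replicas))
print(json.dumps(one_replica))
```

With two data nodes, 1 replica is what gives you the "one server dies, data still accessible" behavior described above; 0 replicas trades that away for less disk usage.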

Without knowing more about your use case, it's difficult to give a concrete recommendation between clustering two nodes and using cross-cluster search, but I can tell you that you absolutely should not set up a tribe node - cross-cluster search is better in just about every way, and tribe node functionality has been entirely removed in 7.0 and up, so staying away from it now will make future upgrades easier.

nandrik · May 21, 2019, 1:24am

Thanks @gbrown, it's clear that the tribe node is the obsolete approach.

Right now I have no need for resiliency, i.e. I can afford to lose data. What I want is to scale out: more apps, each creating its own local data lake, with one Kibana able to view and analyze/visualize everything.

I think this is the cross-cluster search use case. While configuring it over the weekend, I realized that cross-cluster search seems like a "read only" version of clustering, without the privilege to write/replicate indexes across remote nodes.

Is that a fair assessment or am I missing some other capability by doing cross-cluster search and not clustering?

gbrown · May 21, 2019, 3:55pm

That's correct, you can use cross-cluster search to query (and therefore analyze/visualize/all that Kibana goodness) across multiple clusters. You're right that the remote clusters will be read-only from the local one, and data won't be replicated across clusters.

Given what you describe wanting, cross-cluster search does sound like it fits your use case - a read-only connection between your two one-node clusters that allows for search and aggregation through a single Kibana instance. As long as you're okay not being able to write to the "remote" cluster from your single Kibana instance (which it sounds like you are), cross-cluster search should work just fine, and it sounds like you have a good understanding of what it's for.
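To make the setup concrete, here is a rough sketch of the two pieces involved, assuming a 6.5+ or 7.x cluster where remote clusters are registered under `cluster.remote.<alias>.seeds`. The alias `es_2`, the host `es-2.example.internal:9300`, and the index pattern `logstash-*` are placeholders I made up, and the helper functions are hypothetical; the code only builds the request body and index expression, it does not contact Elasticsearch.

```python
import json


def remote_cluster_settings(alias, seeds):
    """Body for PUT /_cluster/settings on the cluster Kibana talks to.

    `seeds` are transport addresses (host:port) of nodes in the remote cluster.
    """
    return {"persistent": {"cluster": {"remote": {alias: {"seeds": list(seeds)}}}}}


def cross_cluster_index(pattern, remote_aliases, include_local=True):
    """Build an index expression covering the local and remote clusters.

    Remote indexes are addressed as <alias>:<index pattern>.
    """
    parts = [pattern] if include_local else []
    parts += [f"{alias}:{pattern}" for alias in remote_aliases]
    return ",".join(parts)


# Placeholder alias and address; substitute your own.
settings_body = remote_cluster_settings("es_2", ["es-2.example.internal:9300"])
index_expr = cross_cluster_index("logstash-*", ["es_2"])

print(json.dumps(settings_body, indent=2))
print(index_expr)
```

The resulting expression can be used as the target of a search request on es_1, or as a Kibana index pattern, so that one Kibana instance reads from both clusters while each cluster keeps ingesting its own data locally.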

system · June 18, 2019, 3:55pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
