Elasticsearch: one big cluster vs. a tribe node?

Problem description:

  • Multiple machines producing logs.
  • On each machine we have Logstash, which filters the log files and sends them to a local Elasticsearch instance
  • We would like to keep the machines as separate as possible and avoid intercommunication
  • But we would also like to be able to visualize all of these logs with a single Kibana instance

Approaches:

  1. Make each machine a single-node ES cluster, and make one of the machines a tribe node with Kibana installed on it (taking care to avoid index name conflicts between the clusters)

  2. Make all machines (nodes) part of a single cluster, with each node writing to its own single-shard index and each shard statically mapped to its node, and run one Kibana instance for the whole cluster
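
To make the two approaches more concrete, here is a minimal Python sketch of the settings each would roughly involve. It only builds the settings as dictionaries and does not contact any cluster. The helper names (pinned_index_body, tribe_settings) and the node and cluster names are hypothetical. The setting keys are the shard allocation filtering and tribe settings as I understand them, so check them against the documentation for your Elasticsearch version. Note that zero replicas means an index is unavailable whenever its node is down.

```python
import json


def pinned_index_body(node_name):
    """Build a create-index request body that keeps an index on one node.

    One primary shard, no replicas, and an allocation filter that requires
    the shard to live on the node with the given name (approach 2).
    """
    return {
        "settings": {
            "index.number_of_shards": 1,
            "index.number_of_replicas": 0,
            "index.routing.allocation.require._name": node_name,
        }
    }


def tribe_settings(clusters, preferred=None):
    """Build elasticsearch.yml-style settings for a tribe node (approach 1).

    `clusters` maps an arbitrary tribe name to the name of the cluster it
    joins. `preferred` optionally names the tribe that wins when two
    clusters contain an index with the same name.
    """
    settings = {}
    for tribe_name, cluster_name in clusters.items():
        settings[f"tribe.{tribe_name}.cluster.name"] = cluster_name
    if preferred is not None:
        settings["tribe.on_conflict"] = f"prefer_{preferred}"
    return settings


if __name__ == "__main__":
    # Hypothetical node and cluster names, for illustration only.
    print(json.dumps(pinned_index_body("machine-a"), indent=2))
    print(json.dumps(
        tribe_settings({"t1": "cluster-a", "t2": "cluster-b"}, preferred="t1"),
        indent=2,
    ))
```

Giving each machine its own index name (for example, a per-machine prefix) avoids the conflict case entirely, which is probably simpler than relying on a conflict policy.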

Question:

Which approach is more appropriate for the described scenario in terms of limiting inter-machine communication, cluster management, and any other aspects I haven't thought about?

That seems like a lot of work. Why not keep all the logs in the same cluster and just use different indices to separate them?

I need to make sure each index lives only on the node that created it. As I mentioned, I don't want the machines to communicate unless someone launches a query from Kibana.

I am not sure what the practical limit for a tribe node is, but I suspect the approach you described will not scale very well. If you want to limit network traffic on the application nodes, I would instead suggest setting up a central Elasticsearch cluster and having Filebeat send logs from the application servers to one or more central Logstash instances that index into Elasticsearch. Filebeat can compress the data it sends to Logstash, which minimises network traffic.
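
As a rough illustration of why compression helps here, the sketch below compresses a batch of made-up log lines with Python's zlib and compares the sizes. This is not Filebeat's wire protocol, and the sample lines and compression level are assumptions chosen for the example. It only shows that repetitive log text tends to shrink considerably.

```python
import zlib


def compressed_sizes(lines, level=3):
    """Return (raw_bytes, compressed_bytes) for a batch of log lines."""
    raw = "\n".join(lines).encode("utf-8")
    compressed = zlib.compress(raw, level)
    return len(raw), len(compressed)


# Made-up log lines; real logs will compress more or less well than this.
sample = [
    f"2024-01-01T00:00:{i % 60:02d}Z INFO request handled "
    f"path=/api/items/{i} status=200"
    for i in range(1000)
]

raw_size, compressed_size = compressed_sizes(sample)

if __name__ == "__main__":
    print(f"raw: {raw_size} bytes, compressed: {compressed_size} bytes")
```

In Filebeat itself, the compression level is, as far as I know, controlled by the compression_level option of the Logstash output, so no custom code is needed.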

This is the standard way to deploy and will most likely be a lot easier to manage as well as being more robust.