Recommended ELK architecture for production?

Is there a recommended "minimal" ELK architecture for production? I know, I know, it depends.

In my case, I want to set up a central ELK stack that receives our application logs (via rsyslog). Here is my production environment:

  • 6-8 customer sites
  • Each site will have up to 6 application servers
  • Each application server can have "up to" 150MB of "rolling" log data
  • We have to retain the data for up to 3 months

I'm thinking, 4 servers?

  • 3 Elasticsearch servers to form the cluster
  • 1 server for both Kibana and Logstash

Feedback? Is there a guideline available?

That sounds ok.

The biggest thing you need to watch for is oversharding.

If you don't want to lose logs, I recommend adding a message queue, such as Kafka or RabbitMQ.
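
To illustrate the queue idea, here is a rough sketch of Logstash reading from Kafka instead of receiving the logs directly. The broker addresses, topic name, and group ID are made up for illustration, and the option names assume a reasonably recent version of the kafka input plugin (older releases used different settings), so treat it as a starting point and not a tested config:

input {
  kafka {
    # Hypothetical broker addresses; replace with your own.
    bootstrap_servers => "kafka1:9092,kafka2:9092"
    # Hypothetical topic that the application logs are published to.
    topics => ["app-logs"]
    # Hypothetical consumer group name for the Logstash instance(s).
    group_id => "logstash"
  }
}

With a queue in between, events can wait in Kafka while Logstash or Elasticsearch is briefly unavailable, as long as the outage is shorter than the queue's retention.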

In an ES cluster, which hosts do I put in the elasticsearch section of the Logstash output configuration? In other words, do I do this?

output {
  elasticsearch {
    hosts => ["es_host1", "es_host2", "es_host3"]
  }
}

Yep!

Thanks!

On the 1 server that houses Kibana and Logstash, do I need to install Elasticsearch on it too, but set it up as a client node?

You don't have to, but it's not a bad idea.

The following line

"Essentially, client nodes behave as smart load balancers."

from Elasticsearch nodes convinced me to install ES on the server with Kibana and Logstash on it, and run it as a client node.

So in this configuration, where I have a client node, should the Logstash output plugin point to the client node?

output {
  elasticsearch {
    hosts => ["es_client_node_ip"]
  }
}

Yep!

Having a single client node will, however, introduce a single point of failure. You can provide a list of hosts to Logstash, and recent versions also support sniffing, meaning they can discover new nodes as they are added to the cluster, so a client node is usually not required for Logstash.
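
As a sketch of the multi-host setup with sniffing described above (the host names are the placeholders used earlier in this thread, and you should check that your version of the elasticsearch output plugin supports the sniffing option):

output {
  elasticsearch {
    # List several cluster nodes so that no single node is a point of failure.
    hosts => ["es_host1", "es_host2", "es_host3"]
    # Ask the plugin to discover the other nodes in the cluster.
    sniffing => true
  }
}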

@Christian_Dahlqvist, thanks for the advice. I'm evaluating all these options now.

For Logstash, I may end up not using the Elasticsearch client.

With Kibana, how do I set up kibana.yml if I don't use a client node? That is, what would be the value of "elasticsearch.url" for a 3-node cluster? Do I just point it to one of the nodes in the cluster? With a client node, I point Kibana to the client node, right?

Kibana, unlike Logstash, does benefit from a client node or a load balancer.

You are indeed correct. From Using Kibana in a Production Environment:

If you have multiple nodes in your Elasticsearch cluster, the easiest way to distribute Kibana requests across the nodes is to run an Elasticsearch client node on the same machine as Kibana. Elasticsearch client nodes are essentially smart load balancers that are part of the cluster. They process incoming HTTP requests, redirect operations to the other nodes in the cluster as needed, and gather and return the results.
