ELK Cluster Design Best Approach


(Logeswaran) #1

Hi All,

We are working on designing an ELK cluster for our client and came up with the design approach below.

  1. Filebeat will read the log files, add tags, and publish the events to Kafka (acting as the Kafka producer).
  2. Logstash will consume the events from the Kafka topics (acting as the Kafka consumer).
  3. Logstash will parse the logs based on the tags and index them into the Elasticsearch cluster.
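The Filebeat side of the pipeline above could be sketched roughly like this (host names, paths, and the topic name are hypothetical placeholders, not values from the thread):

```yaml
# filebeat.yml -- read log files, attach tags, publish to Kafka
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/app/*.log
    tags: ["client_a", "app"]

output.kafka:
  # Filebeat itself acts as the Kafka producer
  hosts: ["kafka1:9092", "kafka2:9092"]
  topic: "logs-client_a"
```

Logstash then consumes from the same topic with its kafka input, so no separate consumer process is needed between Kafka and Logstash.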

We have a few questions about the best design approach:

  1. We have different clients. Is it better to have an individual Filebeat configuration file for each client, or can we keep everything in one Filebeat configuration file?
  2. Similarly, is it better to have an individual Logstash configuration file for each client?
  3. To load-balance Logstash's write throughput into the Elasticsearch cluster, do we need a load balancer (like Kafka) between them?

Thanks,
Loki


(Christian Dahlqvist) #2

Given that all your questions are related to Beats or Logstash rather than Elasticsearch, I would suggest moving the post to the appropriate category.


(Magnus Bäck) #3

We have different clients. Is it better to have an individual Filebeat configuration file for each client, or can we keep everything in one Filebeat configuration file?

Filebeat doesn't care. Use whatever fits your needs.

Similarly, is it better to have an individual Logstash configuration file for each client?

Same thing here. However, you may want to consider running multiple Logstash pipelines if you want to isolate events from different clients. It depends on how many clients you have, though; running a very large number of pipelines is probably not advisable.
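Multiple pipelines are declared in Logstash's pipelines.yml; a minimal sketch, assuming hypothetical client names and config paths:

```yaml
# pipelines.yml -- one isolated pipeline per client
# (pipeline ids and paths below are made-up examples)
- pipeline.id: client_a
  path.config: "/etc/logstash/pipelines/client_a/*.conf"
- pipeline.id: client_b
  path.config: "/etc/logstash/pipelines/client_b/*.conf"
```

Each pipeline gets its own queue and workers, so a slow or failing client's flow doesn't block the others.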

To load-balance Logstash's write throughput into the Elasticsearch cluster, do we need a load balancer (like Kafka) between them?

I don't see how it's even possible to put Kafka between Logstash and Elasticsearch, but if you're talking about an HTTP load balancer, I don't think it's necessary. Logstash's elasticsearch output distributes requests between the ES nodes on its own, and most of the work during indexing is done by the node(s) storing the shards being written to, which you can't control by distributing the requests.
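The built-in distribution mentioned above just means listing several nodes in the output; a sketch with hypothetical node names and index pattern:

```
output {
  elasticsearch {
    # Logstash round-robins bulk requests across the listed nodes,
    # so no external HTTP load balancer is needed for this.
    hosts => ["es-node1:9200", "es-node2:9200", "es-node3:9200"]
    index => "logs-%{+YYYY.MM.dd}"
  }
}
```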


(Logeswaran) #4

@magnusbaeck Thanks for the reply.

Filebeat doesn't care. Use whatever fits your needs

I am new to Filebeat. As far as I know, under prospectors in the filebeat.yml file we would have 3 log files per client, and we have approximately 25 clients, so we would have 75 tags defined. But we can have only one output per .yml file. We have an individual Kafka topic for each client; in that case, do I need individual Filebeat configurations?

Similarly is it good to have individual logstash file configuration for each clients

We have individual Kafka consumers for each client. Is it possible to configure Logstash to listen to all the Kafka topics and then parse/filter the data based on the tags, or would we need 25 Logstash pipelines?

I don't see how it's even possible to put Kafka between Logstash and Elasticsearch, but if you're talking about an HTTP load balancer, I don't think it's necessary. Logstash's elasticsearch output distributes requests between the ES nodes on its own, and most of the work during indexing is done by the node(s) storing the shards being written to, which you can't control by distributing the requests.

Sorry for the confusion, I meant the HTTP load balancer. We understood that we can't control the request distribution.

Thanks,
Loki


(Magnus Bäck) #5

I am new to Filebeat. As far as I know, under prospectors in the filebeat.yml file we would have 3 log files per client, and we have approximately 25 clients, so we would have 75 tags defined. But we can have only one output per .yml file. We have an individual Kafka topic for each client; in that case, do I need individual Filebeat configurations?

You shouldn't need one Filebeat instance per client. The kafka output's topic option accepts a format string, so a single configuration can route each prospector's events to a per-client topic based on a field you set per prospector. Check the output.kafka documentation for the details.
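A sketch of that per-client routing, assuming a hypothetical custom field named `client` and a `logs-` topic naming convention (neither comes from this thread):

```yaml
# filebeat.yml -- one instance, one kafka output, per-client topics
filebeat.prospectors:
  - input_type: log
    paths: ["/var/log/client_a/*.log"]
    fields:
      client: client_a        # hypothetical custom field
  - input_type: log
    paths: ["/var/log/client_b/*.log"]
    fields:
      client: client_b

output.kafka:
  hosts: ["kafka1:9092"]
  # topic accepts a format string, so each prospector's events are
  # published to the topic named after its client field
  topic: "logs-%{[fields.client]}"
```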

We have individual Kafka consumers for each client. Is it possible to configure Logstash to listen to all the Kafka topics and then parse/filter the data based on the tags

Sure.

or would we need 25 Logstash pipelines?

No, not for that reason anyway.
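A single pipeline subscribed to all the client topics could look roughly like this (topic pattern, broker host, and grok pattern are hypothetical):

```
input {
  kafka {
    bootstrap_servers => "kafka1:9092"
    # one consumer subscribed to every per-client topic
    topics_pattern => "logs-.*"
    # Filebeat ships JSON-encoded events, so decode them here
    codec => "json"
  }
}

filter {
  # branch on the tags Filebeat attached to each event
  if "client_a" in [tags] {
    grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  }
}
```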


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.