BELK for distributed device log aggregation

Hello Everyone,

I'm trying to use the Elasticsearch community edition for the following scenario.

I have 20,000+ devices operating in thousands of geographically distributed offices. Every office has its own private network. I want to collect the logs from all devices into a central Elasticsearch cluster deployed at the head office. To do this, I evaluated several options.

  1. Deploy Filebeat on each device and configure it to send its logs to a central Logstash + Elasticsearch deployment at the head office.
  2. Deploy Logstash at every office; each Logstash instance writes logs to the central Elasticsearch cluster at the head office.
  3. Develop a custom aggregator beat (using the Beats library) that is deployed at each office. Filebeat on the devices reports logs to the office-level aggregator, which forwards them to Logstash + Elasticsearch at the head office.

Other requirements

  • Guaranteed delivery of logs to the central system
  • High acceptance (write) throughput at the central repository
  • Control the data flow (throttling/compression/bulk push) from edges to the central system.
  • The offices' local networks must not be saturated by Filebeat retries when the central Logstash is inaccessible.
  • High Availability
  • Scalability
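
To make the flow-control and retry requirements concrete, here is a minimal Python sketch of what I have in mind at the edge: capped exponential backoff with jitter (so retries do not flood the office network while the head office is unreachable), plus batching and gzip compression before a bulk push. The function names and default values are hypothetical and only illustrate the behaviour; this is not how Filebeat or Logstash implement it.

```python
import gzip
import json
import random


def backoff_delays(base=1.0, cap=60.0, attempts=8, rng=random.random):
    """Yield retry delays in seconds: exponential growth, capped, with full jitter.

    base, cap and attempts are illustrative defaults, not recommended values.
    """
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter spreads retries from many devices over the interval.
        yield rng() * ceiling


def make_batches(events, max_events=500):
    """Split a list of events into bulk-sized chunks."""
    for start in range(0, len(events), max_events):
        yield events[start:start + max_events]


def compress_batch(batch):
    """Serialize a batch as newline-delimited JSON and gzip it."""
    payload = "\n".join(json.dumps(event) for event in batch).encode("utf-8")
    return gzip.compress(payload)
```

With the cap in place, a device that cannot reach the head office retries at most about once per `cap` seconds, which bounds the retry traffic per device.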

Clarifications

Which option would be recommended for the above scenario?
Are there any other better options?

Thank you in advance!

Do you mean the open source software? We don't have community editions.

You can also write to a broker layer (Kafka/Redis) at each office and forward from there to the central Elasticsearch cluster.
Or even have a cluster per office and then use cross cluster search.
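
As a rough illustration of the broker idea, the Python sketch below shows the store-and-forward behaviour an office-level layer would provide: events are queued locally and only removed once the upstream send succeeds, which gives at-least-once delivery. The `OfficeBuffer` class and the `send` callable are hypothetical, and the in-memory queue is only a stand-in; a real broker such as Kafka or Redis would persist the data.

```python
from collections import deque


class OfficeBuffer:
    """Hypothetical in-memory stand-in for an office-level broker."""

    def __init__(self, max_events=10000):
        self._queue = deque()
        self._max_events = max_events

    def __len__(self):
        return len(self._queue)

    def offer(self, event):
        """Accept an event from a device; return False when full so the sender backs off."""
        if len(self._queue) >= self._max_events:
            return False
        self._queue.append(event)
        return True

    def flush(self, send, batch_size=100):
        """Forward queued events upstream in batches.

        send(batch) is an assumed callable returning True on success.
        A batch is removed from the queue only after send() succeeds,
        so a failed push leaves the events in place for the next attempt.
        """
        delivered = 0
        while self._queue:
            count = min(batch_size, len(self._queue))
            batch = [self._queue[i] for i in range(count)]
            if not send(batch):
                break
            for _ in range(count):
                self._queue.popleft()
            delivered += count
        return delivered
```

If the head office is unreachable, `flush` returns early and the events stay queued, so the devices never need to retry across the WAN themselves.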

It's all up to you, your business requirements and limitations, and risk appetite.
