I am tasked with designing a log collection architecture that collects application and infrastructure logs from a number of Linux and Windows servers and databases. These are clustered under virtualization, load balanced, and separated from the tooling environment by firewalls.
I need to understand whether I need a local log collector in the remote environments,
and how this would work with NSX, ESX, and JBoss.
It was suggested to me to use the following configuration:
Logstash > RabbitMQ > |firewall| > Logstash > Elasticsearch
Alternatively, in the cluster of 9 RHEL servers, could/should I deploy a syslog server?
Also, should I be considering Beats instead of Logstash?
And what is the addition of RabbitMQ to this architecture giving us that Logstash cannot do on its own?
If you place a RabbitMQ broker outside the firewall it becomes a common service that hosts both inside and outside the firewall can connect to. I'd prefer that over having hosts outside the firewall connect to Logstash inside the firewall or have Logstash outside the firewall and connecting to Elasticsearch on the inside.
It also makes it trivial to run multiple Logstash instances that feed from the same queue, increasing fault tolerance and load balancing.
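As a sketch of that setup (host names, exchange, and queue names below are placeholders, not taken from the thread), the shipper-side Logstash publishes to RabbitMQ and any number of identical indexer-side Logstash instances consume from the same queue:

```
# Shipper (inside the remote environment)
output {
  rabbitmq {
    host          => "rabbitmq.example.com"
    exchange      => "logs"
    exchange_type => "direct"
    key           => "logstash"
  }
}

# Indexer (inside the tooling environment); run several identical
# instances against the same queue for load balancing and fault tolerance
input {
  rabbitmq {
    host    => "rabbitmq.example.com"
    queue   => "logstash"
    durable => true
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch.example.com:9200"]
  }
}
```

Because RabbitMQ distributes messages from one queue across its consumers, adding indexer instances is just a matter of starting more Logstash processes with the same input configuration.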
Do I need to cluster and/or load balance Logstash for resilience?
Logstash can't be clustered per se but you can run multiple instances that process events.
Also, the same question for Elasticsearch and Kibana.
Kibana can't be clustered. Whether you need to cluster ES depends on how much data you need to process and what level of fault tolerance you need.
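If clustering Elasticsearch does turn out to be warranted, the node-side configuration is small. A minimal sketch for a three-node cluster, assuming a pre-7.x Elasticsearch where the `discovery.zen` settings apply (node names and addresses are placeholders):

```
# elasticsearch.yml on each node
cluster.name: logging-prod
node.name: es-node-1
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["es-node-1", "es-node-2", "es-node-3"]
# (n/2)+1 with three master-eligible nodes, to avoid split brain
discovery.zen.minimum_master_nodes: 2
```

Fault tolerance for the data itself then comes from index replica settings rather than anything in this file.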
Hi Magnus,
The log collection system will be collecting application and OS logs from WIN2012 and RHEL VMs as well as several network devices.
I am advised to use Filebeat for the RHEL OS and application logs, and I am wondering
if this is a good idea,
or should I have two agents, one for OS and one for application, or would this double the overhead for no benefit?
Also, is there a Filebeat template for standard RHEL log file input?
Also, there is an Exadata appliance to be monitored; this will send Linux OS and DB audit logs. Exadata suggests using syslog to send the OS files to the collector. Is this OK?
or should I have two agents, one for OS and one for application, or would this double the overhead for no benefit?
I don't think there are any runtime benefits of running two Filebeat instances, but deployment-wise you might want to deploy OS-specific configurations as part of the machine provisioning while configuration of application log collection is owned by the application deployment scripts.
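One way to realize that ownership split is a single Filebeat instance with a config directory: the base file is laid down by provisioning, and application deployments drop their own snippets alongside it. The paths and the exact option names are assumptions here (prospector/input syntax and the config-directory mechanism vary by Filebeat version, so check the docs for the version in use):

```
# /etc/filebeat/filebeat.yml -- deployed by machine provisioning
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/messages
      - /var/log/secure

# Pull in application-owned snippets
filebeat.config_dir: /etc/filebeat/conf.d

# /etc/filebeat/conf.d/myapp.yml -- owned by the application deployment
filebeat:
  prospectors:
    - input_type: log
      paths:
        - /opt/myapp/logs/*.log
```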
also is there is a Filebeats template for standard RHEL log file input
What do you mean by template?
Also, there is an Exadata appliance to be monitored; this will send Linux OS and DB audit logs. Exadata suggests using syslog to send the OS files to the collector. Is this OK?
And by collector you mean Logstash? Sure, that's fine.
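On the Logstash side, receiving syslog from an appliance like that can be as simple as the syslog input (the port number is a placeholder; use an unprivileged port unless Logstash runs as root):

```
input {
  syslog {
    port => 5514
  }
}
```

The syslog input parses standard RFC 3164 messages into fields as they arrive, so no separate grok filter is needed for well-formed messages.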
It's quite simple: Filebeat can't send to RabbitMQ on its own. If you choose Kafka or Redis instead of RabbitMQ you don't need that Logstash instance, since Filebeat has native output support for both.
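For example, with Redis as the broker, Filebeat can ship directly and skip the local Logstash entirely (host name and key are placeholders):

```
# filebeat.yml -- ship straight to a Redis list, no local Logstash needed
output.redis:
  hosts: ["redis.example.com:6379"]
  key: "filebeat"
```

The indexer-side Logstash would then use its redis input with the same key to pick the events up.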
Is there any particular advantage to doing the file filtering in the local Logstash instance before transit to the remote Logstash/Elasticsearch platform?
Winlogbeat and Filebeat behave the same in this regard. Syslog daemons typically only send over the syslog protocol (but you could configure Filebeat to read the logs from disk). Logstash can capture syslog messages sent over the network.
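Reading the syslog daemon's files from disk with Filebeat looks like this (paths are the usual RHEL/Debian defaults; the prospector syntax is version-dependent, so treat this as a sketch):

```
# filebeat.yml -- tail the files the local syslog daemon already writes
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/messages
      - /var/log/secure
```

This avoids opening a network listener on the host and gives you Filebeat's delivery guarantees for transport instead of UDP syslog's fire-and-forget.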
I want to send syslog log files securely from my VMware servers through my management zone firewall to Logstash.
I need to understand how this could work:
Do I need to use syslog-ng, or can Logstash integrate securely with syslog?
Can Logstash take an input from syslog-ng?
Will I need to deploy syslog (or syslog-ng) servers?
Will I be able to use mutual certificate authentication?
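On the mutual-authentication point: Logstash's tcp input can terminate TLS and require a client certificate, and syslog-ng can send over TLS to it. A sketch of the Logstash side (certificate paths are placeholders, and the exact SSL option names vary between versions of the tcp input plugin, so verify against its documentation):

```
input {
  tcp {
    port       => 6514
    ssl_enable => true
    ssl_cert   => "/etc/logstash/tls/server.crt"
    ssl_key    => "/etc/logstash/tls/server.key"
    # require and verify a client certificate, i.e. mutual authentication
    ssl_verify => true
  }
}
```

The syslog-ng side would use a destination with `transport("tls")` pointing at this port, configured with its own key pair and the CA that signed the Logstash certificate.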
These are really RabbitMQ questions but I'll answer them quickly.
What protections will be put in place to ensure that only authorized processes access queues, and that they ONLY access their OWN queues?
What controls will be put in place to prevent unauthorized message disclosure, deletion, duplication, or relay?
All AMQP clients need to authenticate. RabbitMQ supports fairly fine-grained access control over what you're allowed to set up queues against (i.e. what kind of information you can access) and which queues you're allowed to consume from. You could e.g. have a policy where anyone can set up a queue to subscribe to any messages but you can only consume from your own queues.
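Concretely, RabbitMQ enforces this through per-vhost configure/write/read permission regexes. A sketch with rabbitmqctl (user, vhost, exchange, and queue names are all placeholders):

```
# A shipper account that may only publish to the "logs" exchange
rabbitmqctl add_vhost /logging
rabbitmqctl add_user shipper 's3cret'
rabbitmqctl set_permissions -p /logging shipper '^$' '^logs$' '^$'

# An indexer account that may only declare and consume its own queue
rabbitmqctl add_user indexer 's3cret'
rabbitmqctl set_permissions -p /logging indexer '^logstash$' '^logstash$' '^logstash$'
```

The three regexes match the resource names each user may configure, write to, and read from, respectively; `'^$'` matches nothing and so denies that class of operation entirely.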