I'm new to the ELK stack and am trying to figure out how to configure Filebeat with Apache 2. I have the system doing some basic work, such as syslog events going from Filebeat -> Logstash -> ES, but I'm finding the documentation for setting up the apache2 module very sparse.
Basically, I have an Apache 2.2 server writing logs in a custom format. I want to use Filebeat and the apache2 module to read those logs and push the data through Logstash to ES. The documentation on the Elastic site is not very detailed about what needs to be done.
How do I get Filebeat + the apache2 module to read the custom format? I have some basic configuration for the module (i.e., file paths), but where and how do I set up the custom log format I'm using? My Apache 2 custom log format:
LogFormat "%{X-ClientIP}i %{True-Client-IP}i %h %D %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i" %X "%{u_locale}C" "%{u_locale}i" "%{u_locale}o"" combined
What do I need to do on the Logstash server?
I have not configured any prospectors (I'm assuming I need to, based on the log output when I try to set this up):
./filebeat -e --modules apache2 -setup
beat.go:346: CRIT Exiting: Error getting config for fielset apache2/access: Error interpreting the template of the prospector: template: text:3:22: executing "text" at <.paths>: range can't iterate over /apache/logs/access.log
Exiting: Error getting config for fielset apache2/access: Error interpreting the template of the prospector: template: text:3:22: executing "text" at <.paths>: range can't iterate over /apache/logs/access.log
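From the error, it looks like the template is trying to iterate ("range") over the paths value, so I suspect var.paths needs to be a YAML list rather than a single string. What I currently have in filebeat.yml is roughly this (key names are my best reading of the docs, so treat it as a sketch):

filebeat.modules:
- module: apache2
  access:
    # a list, not a bare string -- my guess at the cause of the range error
    var.paths: ["/apache/logs/access.log"]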
The modules in Filebeat currently have a limitation: they can only be used when the data is sent directly to Elasticsearch, because they rely on the Elasticsearch Ingest Node feature.
Since you are already using Logstash and have a custom format, I recommend adding a grok filter to your LS config to parse the data. A grok debugger is helpful for developing a grok pattern for your custom logs.
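As a starting point, something along these lines could work for your LogFormat (a rough sketch only: the field names like lb_client_ip and true_client_ip are placeholders I made up, the %r part is simplified, and you should verify the whole pattern against real log lines in a grok debugger):

filter {
  grok {
    match => {
      # fields in the same order as your LogFormat directives
      "message" => '(?:%{IP:lb_client_ip}|-) (?:%{IP:true_client_ip}|-) %{IPORHOST:remote_host} %{NUMBER:time_us:int} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?" %{NUMBER:response:int} (?:%{NUMBER:bytes:int}|-) "%{DATA:referrer}" "%{DATA:agent}" %{NOTSPACE:connection_status} "%{DATA:u_locale_cookie}" "%{DATA:u_locale_req}" "%{DATA:u_locale_resp}"'
    }
  }
  # parse the %t timestamp into @timestamp
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}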
In Filebeat, add a new prospector to pick up your Apache logs.
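A minimal sketch of that part of filebeat.yml (the path and host are placeholders; adjust to your setup):

filebeat.prospectors:
- input_type: log
  paths:
    - /apache/logs/access.log

output.logstash:
  hosts: ["localhost:5044"]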
Thank you, Andrew. I was able to get things working. I'm still having a bit of trouble, though, and I think it has to do with the IP-address part of the grok expression I'm using.
Right now we collect three possible IP addresses: one passed in by our load balancer (the IP of the CDN node), one from our CDN (the IP of the actual user), and one if something goes directly to our web server. If the request didn't go through the load balancer or CDN, we basically get a dash in the first and second IP-address positions of the log entry.
When setting up the fields, I was only able to give them the text field type, not the IP address type, because of the dashes. Is there anything I can do to use the IP address type so we can do IP address searches (e.g., a subnet search)?
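One idea I'm experimenting with (not sure it's the right approach): only capture the field when the value is a real IP, e.g. with an alternation like (?:%{IP:lb_client_ip}|-) in the grok, so a dash never becomes a field value, and additionally drop any stray dash placeholders in Logstash so only real addresses reach Elasticsearch and the fields can be mapped as type ip. Field names here match the sketch above and are placeholders:

filter {
  # remove the dash placeholders so the ip-typed fields are simply absent
  if [lb_client_ip] == "-" {
    mutate { remove_field => ["lb_client_ip"] }
  }
  if [true_client_ip] == "-" {
    mutate { remove_field => ["true_client_ip"] }
  }
}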