I would like to parse my IIS logs into more fields. Both the URL and the query string contain data I would like to extract into fields of their own.
Sending the logs to Logstash rather than directly to Elasticsearch seems like the way to go, but when I changed filebeat.yml to point at Logstash, I lost the GeoIP data, and the timestamp is now incorrect.
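As far as I can tell, the IIS module's GeoIP and timestamp handling happen in an Elasticsearch ingest pipeline, which gets bypassed once Filebeat ships to Logstash instead, so I would apparently need to re-create them as filters. Something like this sketch, assuming an earlier grok has already split the raw W3C line into fields (the field names here are my guesses, not the module's actual ones):

filter {
  # Sketch: re-create what the ingest pipeline did. "clientip" is an
  # assumed field name holding the client address.
  geoip {
    source => "clientip"
  }
  # IIS W3C logs record times in UTC as "yyyy-MM-dd HH:mm:ss"; without
  # a date filter, @timestamp is just the time Logstash saw the event,
  # which would explain the incorrect timestamps.
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
    timezone => "UTC"
  }
}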
In Logstash, I haven't set up any filters yet; I just push the messages straight through to Elasticsearch.
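The whole pipeline is currently just a passthrough, along these lines (host and port match the Filebeat config below):

input {
  beats {
    port => 5044
  }
}
# No filter block yet: events reach Elasticsearch unchanged.
output {
  elasticsearch {
    hosts => ["myserver.net:9200"]
  }
}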
There are two reasons I'm considering Logstash, but I can easily be persuaded to go back to my old configuration; I would actually prefer to skip Logstash, since that means less processing on the ELK server.
The first reason: I couldn't figure out how to set up the grok in default.json, which seems to be where I should parse out the extra field(s) I want. There is a customerId in the query string that I want extracted into its own field.
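If the parsing ends up in Logstash rather than default.json, a grok along these lines should pull customerId out of the raw line (a sketch only; the [^&\s]+ pattern just grabs everything up to the next & or whitespace):

filter {
  # Sketch: extract customerId from the query string portion of the
  # raw IIS log line into its own field.
  grok {
    match => { "message" => "customerId=(?<customerId>[^&\s]+)" }
    tag_on_failure => ["_no_customerid"]
  }
}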
The second reason: I would like to enable other teams in our organization to update the ETL logic, and a Logstash pipeline is a single place to do that. Otherwise each server would need to be updated, which admittedly is not a big deal.
I have been unable to get the filter to work selectively. I assume it's not making a match, so the filter never triggers; it didn't seem to have any effect at all, and data still hits Elasticsearch exactly as it did before the filter.
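Roughly what I'm after is this (a sketch that keys the condition off the "ecomm" tag set in the Filebeat config below, since a tag is guaranteed to be on every event from these servers, whereas a field test may never match):

filter {
  # Only process events tagged "ecomm" by Filebeat (see tags: ["ecomm"]
  # in the config below).
  if "ecomm" in [tags] {
    grok {
      match => { "message" => "customerId=(?<customerId>[^&\s]+)" }
    }
  }
}
# Note: when a grok pattern fails to match, Logstash does not log an
# error; it just adds a "_grokparsefailure" tag to the event, so the
# data still lands in Elasticsearch looking unchanged apart from that
# tag. Searching for that tag in Kibana shows whether the filter fired.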
For troubleshooting, I briefly set Filebeat to output to a file (that's the commented-out file output section in the config below).
Here is my Filebeat config. Nothing fancy in there; I removed some comments to fit the character limit.
###################### Filebeat Configuration Example #########################
#=========================== Filebeat inputs =============================
#filebeat.inputs:
#- type: log
  # Change to true to enable this input configuration.
  #enabled: true
  # Paths that should be crawled and fetched. Glob based paths.
  #paths:
    #- /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*
    #- e:\W3SVC1\*.log
  #document_type: iis
  #exclude_lines: ['^DBG']
  #include_lines: ['^ERR', '^WARN']
  #exclude_files: ['.gz$']
  #fields:
  #  level: debug
  #  review: 1
  ### Multiline options
  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation
  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[
  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false
  # Match can be set to "after" or "before". It is used to define if lines should be appended to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to next in Logstash
  #multiline.match: after
#============================= Filebeat modules ===============================
filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
  # Period on which files under path should be checked for changes
  #reload.period: 10s
#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 3
  #index.codec: best_compression
  #_source.enabled: false
#================================ General =====================================
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:
# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]
tags: ["ecomm"]
# Optional fields that you can specify to add additional information to the
# output.
#fields:
# env: staging
#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here, or by using the `-setup` CLI flag or the `setup` command.
#setup.dashboards.enabled: false
# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:
#============================== Kibana =====================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  host: "myserver.net:5601"
#============================= Elastic Cloud ==================================
# These settings simplify using filebeat with the Elastic Cloud (https://cloud.elastic.co/).
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:
# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:
#================================ Outputs =====================================
# Configure what output to use when sending the data collected by the beat.
#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["myserver.net:9200"]
  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"
#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["myserver.net:5044"]
  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"
  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"
#------------------------ File output for debugging -------------------
#output.file:
  # Boolean flag to enable or disable the output module.
  #enabled: true
  # Path to the directory where to save the generated files. The option is
  # mandatory.
  #path: "e:\\DevOpsLogs"
  # Name of the generated files. The default is `filebeat` and it generates
  # files: `filebeat`, `filebeat.1`, `filebeat.2`, etc.
  #filename: filebeat_iis
  # Maximum size in kilobytes of each file. When this size is reached, and on
  # every filebeat restart, the files are rotated. The default value is 10240
  # kB.
  #rotate_every_kb: 10000
  # Maximum number of files under path. When this number of files is reached,
  # the oldest file is deleted and the rest are shifted from last to first. The
  # default is 7 files.
  #number_of_files: 7
#================================ Logging =====================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]
#============================== Xpack Monitoring ==============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch. The
# reporting is disabled by default.
# Set to true to enable the monitoring reporter.
#xpack.monitoring.enabled: false
# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well. Any setting that is not set is
# automatically inherited from the Elasticsearch output configuration, so if you
# have the Elasticsearch output configured, you can simply uncomment the
# following line.
#xpack.monitoring.elasticsearch:
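From what I can tell from the Filebeat docs, there is also a middle ground that might fix the GeoIP/timestamp loss without writing any filters: have Logstash hand each event back to the module's ingest pipeline in Elasticsearch. A sketch (my understanding is that %{[@metadata][pipeline]} is only populated by Filebeat 6.5 or newer, so this is an assumption about the version):

output {
  elasticsearch {
    hosts => ["myserver.net:9200"]
    manage_template => false
    # Keep the Beat's index naming so the index template still applies.
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    # Run the module's ingest pipeline (geoip, timestamp parsing) in
    # Elasticsearch even though the event traveled through Logstash.
    pipeline => "%{[@metadata][pipeline]}"
  }
}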
No, I didn't see any grok errors. And I'm pretty certain the geoip ingest is working: when I check syslog, I see warnings about it being unable to locate local IP addresses, which of course makes sense.