Quite simply, nothing about this seems to work consistently across all the pieces to satisfy all the requirements.
The only part that's been simple enough is syslog, and even then only by guessing at some "suitable" data_stream values that hopefully work with any default Filebeat dashboards (I've not imported them yet).
An input like this:
# Handle RFC3164 syslog messages on both TCP and UDP (these are handled by
# default with the input parser)
syslog {
  id => "syslog-rfc3164"
  type => "syslog"
  port => 1514
  timezone => "UTC"
  ecs_compatibility => "v1"
  # NB: you cannot use comments inside the add_field section; they will cause a parse error.
  # The data stream fields, part of ECS, identify the application (dataset),
  # namespace (environment), and type (logs or metrics);
  # the syslog input adds the format to the syslog-specific section.
  add_field => {
    "[labels][environment]" => "staging"
    "[data_stream][namespace]" => "staging"
    "[data_stream][type]" => "logs"
    "[data_stream][dataset]" => "syslog"
  }
  tags => ["staging", "syslog", "rfc3164"]
}
combined with an output like this:
elasticsearch {
  id => "output-to-cloud-elastic"
  cloud_auth => "<redacted>"
  cloud_id => "<redacted>"
  data_stream => "true"
  data_stream_auto_routing => "true"
  data_stream_type => "logs"
  data_stream_dataset => "syslog"
  data_stream_namespace => "staging"
  ecs_compatibility => "v1"
}
Data appears to be arriving in the Elastic Cloud instance and going to the logs-syslog-staging data stream. So far, so good.
Next I'd like to get the output from Metricbeat into this; we start with a single simple box for now:
metricbeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false

# ------------------------------ Logstash Output -------------------------------
output.logstash:
  enabled: true
  # The Logstash hosts
  hosts: ["logstash:15044"]

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
For the Logstash input we have:
input {
  beats {
    port => 15044
    id => "metrics"
    type => "beats"
    ecs_compatibility => "v1"
    add_field => {
      "[labels][environment]" => "staging"
    }
    tags => ["staging", "beats", "metrics"]
  }
}
then some filter enrichment (GeoIP and the like). Then, when we want to send this data to Elasticsearch, we run into the following problems:
- The data_stream approach
  - There is no data_stream information coming out of Metricbeat itself, so we'd have to add it ourselves based on event.dataset.
  - However, looking at console output from Metricbeat, not everything is in fact a metric; there's the occasional log message in there from the beat.state module. So we'd need some way of distinguishing between them.
  - We then need to get the data into an index that matches the pattern used by the Metricbeat-created dashboards, which expect things in the metricbeat-* index pattern.
  - And we finally want to ensure some form of ILM is used so that data/logs are not kept for longer than X days or Y size.
  - Ideally the generated data streams would match those produced by other tools, current or future, such as the Elastic Agent.
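To sketch the first two points, adding the data_stream fields ourselves and separating the occasional log event from the metrics, a Logstash filter along these lines might work. This is a guess, not a tested configuration: the beat.state condition in particular is an assumption based only on the observation above.

```
filter {
  if [@metadata][beat] == "metricbeat" {
    # Assume everything arriving from Metricbeat is a metric by default
    mutate {
      add_field => {
        "[data_stream][type]"      => "metrics"
        "[data_stream][namespace]" => "staging"
      }
    }
    # Derive the dataset from what the beat reports, with a generic fallback
    if [event][dataset] {
      mutate { copy => { "[event][dataset]" => "[data_stream][dataset]" } }
    } else {
      mutate { add_field => { "[data_stream][dataset]" => "generic" } }
    }
    # Assumption: reclassify the occasional log-style event (e.g. beat.state)
    if [event][dataset] == "beat.state" {
      mutate { replace => { "[data_stream][type]" => "logs" } }
    }
  }
}
```

With data_stream_auto_routing enabled in the elasticsearch output, events shaped like this should then route themselves into the matching logs-* or metrics-* data streams.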
- Default ILM-based index approach
  - In Metricbeat, if you configure an output direct to Elasticsearch with ILM enabled, we get: "When index lifecycle management (ILM) is enabled, the default index is metricbeat-%{[agent.version]}-%{+yyyy.MM.dd}-%{index_num}", for example "metricbeat-7.15.0-2021-10-07-000001". Custom index settings are ignored when ILM is enabled.
  - When we use ILM in Logstash, the index created seems to be (due to the use of ecs_compatibility) in the form "ecs-logstash-%{+yyyy.MM.dd}". This is not the index format expected and used by the dashboards, so we'd somehow need to change it to an appropriate index with ILM settings that match those expected by the dashboards and match the Metricbeat form "metricbeat-%{[agent.version]}-%{+yyyy.MM.dd}-%{index_num}".
  - For performance and resource usage, and to minimise the number of connections to Logstash, this would need to handle not only Metricbeat but also Auditbeat and Filebeat (at a minimum), each with its own index format requirements.
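For what it's worth, the Logstash elasticsearch output does have ILM options that can point at a Beats-style rollover alias, though with limits. A sketch of what this might look like, where the alias and policy names are assumptions:

```
output {
  if "metrics" in [tags] {
    elasticsearch {
      id => "output-metricbeat-ilm"
      cloud_auth => "<redacted>"
      cloud_id => "<redacted>"
      ecs_compatibility => "v1"
      ilm_enabled => true
      # NB: ilm_rollover_alias does not support %{} substitution, so the
      # agent.version part of the Beats default index name cannot be
      # reproduced here; one static alias per beat is the practical limit.
      ilm_rollover_alias => "metricbeat"
      ilm_pattern => "{now/d}-000001"
      ilm_policy => "metricbeat"
    }
  }
}
```

A plain "metricbeat" alias would at least still be matched by the metricbeat-* index pattern the dashboards use, even without the version component, but each additional beat (Auditbeat, Filebeat) would need its own conditional block, alias, and policy.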
- The Elastic Agent approach
  - Doesn't work, since there are Kubernetes clusters involved that need scraping, and nothing goes on the clusters that doesn't have a Helm chart for its deployment.
  - Doesn't work because the majority of endpoints do not have direct Internet access and as such cannot communicate with the Fleet Server on the cloud Elastic deployment directly.
  - The standalone configuration could be an option, sending data to Logstash via an elastic_agent input. However, other than the basics of installing a standalone instance, there is no documentation on configuring the standalone Elastic Agent to monitor specific log files, metrics, or processes, or to scrape metrics from Prometheus exporters running on the same host.
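On the standalone route, the agent policy file is just YAML, so something like the following might be a starting point for shipping host metrics and a specific log file to Logstash. The input types, stream keys, and dataset names here are assumptions about the standalone policy layout, not a verified configuration:

```yaml
outputs:
  default:
    type: logstash
    hosts: ["logstash:15044"]

inputs:
  # Host metrics, analogous to Metricbeat's system module
  - type: system/metrics
    id: system-metrics
    use_output: default
    data_stream.namespace: staging
    streams:
      - metricset: cpu
        data_stream.dataset: system.cpu
        period: 10s
      - metricset: memory
        data_stream.dataset: system.memory
        period: 10s

  # Tail a specific log file
  - type: logfile
    id: app-logs
    use_output: default
    data_stream.namespace: staging
    streams:
      - data_stream.dataset: app.log
        paths:
          - /var/log/app/app.log
```

Whether the Prometheus-exporter scraping case has an equivalent standalone input is exactly the kind of thing the documentation doesn't make clear.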
I'd be happy if any of these options somehow worked or could be made to work with minimal effort.
There's probably something I'm missing here with regard to the lifecycle material, but there are a couple of simple rules we go by when evaluating new products (it's a small team and there just isn't enough time to spend a month on something):
- Can a Proof of Concept be done in <5 days
- Can a production ready secure deployment be done in <3 days starting from nothing using the majority of defaults and recommended settings (a few extra days available if it can be done completely via Infrastructure as Code).