There appear to be a couple of issues with the Metricbeat 8.2.1 template for Elasticsearch monitoring. I am running a newly built 8.2.1 cluster (in Docker) and attempting to use Metricbeat to gather Stack Monitoring data. I am encountering two issues:
- I can't access the Stack Monitoring page, no matter what access I have. I have tried the built-in elastic superuser and created a new user with the requisite roles, to no avail.
- Metricbeat events are failing to be indexed for the dataset elasticsearch.node.stats, due to a bad mapping value.
The field 'elasticsearch.node.stats.os.cgroup.memory.limit.bytes' has a data type of 'long', yet when I query my node stats, I get back a value of 'max', which fails to index. As a result I have no Elasticsearch metrics to monitor (potentially the permissions error is a red herring and it is just that the data doesn't exist in ES?).
Has anyone come across these and resolved them before? I know 8.2.1 is brand new, but I had the same issue on 8.2.0 (I upgraded to try to resolve it). Currently it looks like I would either introduce an ingest pipeline to transform the value of elasticsearch.node.stats.os.cgroup.memory.limit.bytes, replacing 'max' with 0 or something similar, or change the template to map the field as a keyword (but I haven't checked what this may or may not break).
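The ingest pipeline idea could be sketched roughly as follows. This is only an illustration, not a tested fix: it builds a pipeline body that uses the `remove` processor with a Painless `if` condition, and the helper names `build_pipeline` and `simulate` are hypothetical. It removes the field rather than writing 0, on the assumption that 0 could be misread as a real limit of zero bytes. How the pipeline gets attached to the Metricbeat output is not covered here.

```python
import json

# Field named in the mapping error; the pipeline layout below is an assumption.
FIELD = "elasticsearch.node.stats.os.cgroup.memory.limit.bytes"


def build_pipeline(field: str = FIELD) -> dict:
    """Return an ingest pipeline body that drops `field` when it holds 'max'."""
    # Null-safe Painless path: ctx.elasticsearch?.node?.stats?. ... ?.bytes
    path = "ctx." + "?.".join(field.split("."))
    return {
        "description": "Drop non-numeric cgroup memory limit (hypothetical workaround)",
        "processors": [
            {"remove": {"field": field, "if": f"{path} == 'max'"}}
        ],
    }


def simulate(event: dict, field: str = FIELD) -> dict:
    """Local stand-in for the pipeline: delete the leaf if its value is 'max'."""
    parts = field.split(".")
    node = event
    for key in parts[:-1]:
        node = node.get(key)
        if not isinstance(node, dict):
            return event  # path absent, nothing to do
    if node.get(parts[-1]) == "max":
        del node[parts[-1]]
    return event


if __name__ == "__main__":
    # Body that could be sent with PUT _ingest/pipeline/<some-id>
    print(json.dumps(build_pipeline(), indent=2))
```

The `simulate` function only mimics the intended effect on a plain dict so the logic can be checked locally before touching the cluster.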
The cluster is a brand new 8.2.1 build running on Docker and contains two nodes (on two separate hosts) as well as Kibana, an Elastic Agent and Metricbeat (used for cluster monitoring instead of the built-in collection, which is deprecated).
Any help would be much appreciated. Happy to raise an issue on GitHub if it is warranted.
I see that field in monitoring-es-mb.json on the master branch of elastic/elasticsearch on GitHub, and I can view it in a test cluster of my own.
So as far as I know it should work. If you're able to provide a docker-compose example, I'd be happy to run it.
Metricbeat configuration and index mappings would also be helpful to see.
I wonder if maybe it's writing to metricbeat-* and you haven't run metricbeat setup yet.
@matschaffer, thanks for your input.
The issue is not that the mapping doesn't exist, but that it is of type 'long', yet Metricbeat is sending a value of 'max'. The same value can be seen in the node stats API, so to me Metricbeat is doing the right thing. I don't have any cgroup-based restrictions enforced, so I assume that is why the value is 'max'. I have tried building an ingest pipeline to clean this up, but it hasn't helped (there is probably further to go down that route).
The exact error I am seeing is below:
{\"type\":\"mapper_parsing_exception\",\"reason\":\"failed to parse field [elasticsearch.node.stats.os.cgroup.memory.limit.bytes] of type [long] in document with id 'XXXXXXXXX'. Preview of field's value: 'max'\",\"caused_by\":{\"type\":\"illegal_argument_exception\",\"reason\":\"For input string: \\\"max\\\"\"}}, dropping event!","service.name":"metricbeat","ecs.version":"1.6.0"}
My distro is Debian 11, so I'm not sure if that makes any difference either.
I'm deploying via Ansible, so I don't have a straightforward Docker Compose to provide you, but a subset of my metricbeat config is as below:
metricbeat.config:
  modules:
    path: ${path.config}/modules.d/*.yml # Default metricsets we want on everything, system and docker
    # Reload module configs as they change:
    reload.enabled: false
# This will use hints to find things based on label, as well as what I have below regarding Elastic Stack modules.
metricbeat.autodiscover:
  providers:
    - type: docker
      hints.enabled: true
      templates:
        # Replaces deprecated internal Elasticsearch monitoring - https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-module-elasticsearch.html
        - condition:
            contains:
              docker.container.image: elasticsearch
          config:
            - module: elasticsearch
              period: 10s
              hosts: ["redacted"]
              xpack.enabled: true
              enabled: true
              scope: node
output.elasticsearch:
  # output config goes here
Huh, that's interesting. What does the ES API return? According to
https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html it might be a string. It could be that Metricbeat isn't accounting for that.
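One way to check what the API returns is to scan a node stats response for cgroup memory limits that are not plain integers. The sketch below assumes the response has been fetched and parsed into a dict already (for example from `GET _nodes/stats/os`) and that the limit sits under `os.cgroup.memory.limit_in_bytes`, as the linked reference describes. The sample data and the function name `non_numeric_limits` are made up for illustration.

```python
def non_numeric_limits(nodes_stats: dict) -> dict:
    """Map node id -> cgroup memory limit for values that do not parse as integers."""
    bad = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        memory = node.get("os", {}).get("cgroup", {}).get("memory", {})
        limit = memory.get("limit_in_bytes")
        if limit is None:
            continue  # node reports no cgroup memory limit at all
        try:
            int(limit)
        except (TypeError, ValueError):
            # Same class of failure as indexing the value into a 'long' field
            bad[node_id] = limit
    return bad


# Illustrative sample only; node ids and numbers are invented.
sample = {
    "nodes": {
        "node-a": {"os": {"cgroup": {"memory": {
            "limit_in_bytes": "max", "usage_in_bytes": "1073741824"}}}},
        "node-b": {"os": {"cgroup": {"memory": {
            "limit_in_bytes": "8589934592", "usage_in_bytes": "2147483648"}}}},
    }
}

print(non_numeric_limits(sample))  # {'node-a': 'max'}
```

Any node listed in the result would produce a value that a `long` mapping cannot parse, which matches the `mapper_parsing_exception` shown earlier in the thread.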
Cheers @matschaffer. Always nice when I find an issue that actually is an issue and not me just being a spud.
I will in the meantime modify the template to be a keyword and see how I go.