Hey there. I'm running a Docker Swarm stack in production and was having issues with too many sockets being created, sockets hanging in TIME_WAIT, and so on.
I thought that socket_summary included sockets in the network ("net") namespaces that Docker uses for its containers.
But at least with my setup, it doesn't.
Is it possible to somehow make socket_summary count all sockets on the system, not just those in the root namespace?
Metricbeat is installed as a systemd service and runs as root; it is not running inside the swarm.
I SSH'ed into the hosts to find the issue and ran this little script, which prints a rough socket count (lines of netstat output) for each container's network namespace:
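#!/bin/bash
# For each running container, count netstat lines in its network namespace.
for ID in $(docker ps -q)
do
    PID=$(docker inspect -f '{{.State.Pid}}' "$ID")
    # Print the container name followed by a tab, without a newline.
    printf '%s \t' "$(docker inspect -f '{{.Name}}' "$ID")"
    nsenter -t "$PID" -n netstat -na | wc -l
done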
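Since the thing I actually care about is TIME_WAIT, a variant that breaks the count down by TCP state per container could look roughly like the sketch below. The function names count_tcp_states and per_container_states are made up for illustration, and it assumes netstat, nsenter and awk are available on the host, that it runs as root, and that the state is the last column of `netstat -tan` output.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of Docker or Metricbeat.

# Read `netstat -tan` output on stdin and print "STATE COUNT" per TCP state.
# Header lines are skipped because their first field does not start with "tcp".
count_tcp_states() {
    awk '$1 ~ /^tcp/ { states[$NF]++ } END { for (s in states) print s, states[s] }' | sort
}

# Print per-state TCP socket counts for every running container by entering
# each container's network namespace with nsenter.
per_container_states() {
    local id pid name
    for id in $(docker ps -q); do
        pid=$(docker inspect -f '{{.State.Pid}}' "$id")
        name=$(docker inspect -f '{{.Name}}' "$id")
        echo "== $name"
        nsenter -t "$pid" -n netstat -tan | count_tcp_states
    done
}

# Run as root on the Docker host:
# per_container_states
```

Keeping the parsing in its own function means it can also be fed the root namespace's `netstat -tan` output for comparison.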
Here is my metricbeat.yml:
########################## Metricbeat Configuration ###########################
# Configuration reference:
# https://www.elastic.co/guide/en/beats/metricbeat/index.html
#========================== Modules configuration ============================
## Ansible managed
metricbeat.modules:
#------------------------------- System Module -------------------------------
- module: system
  metricsets:
    - "cpu"
    - "filesystem"
    - "load"
    - "memory"
    - "diskio"
    - "process_summary"
    - "socket_summary"
  enabled: true
  period: 1m
  # Configure the metric types that are included by these metricsets.
  cpu.metrics: ["normalized_percentages"]
  # Collect container metrics in cgroups.
  process.cgroups.enabled: true
- module: docker
  metricsets:
    - "container"
    - "cpu"
    - "diskio"
    - "event"
    - "healthcheck"
    - "info"
    - "memory"
  hosts: ["unix:///var/run/docker.sock"]
  period: 5m
  enabled: true
  cpu.cores: false
#================================ General ======================================
# set beatname
name: somehostname
#================================ Outputs ======================================
# Configure what output to use when sending the data collected by the beat.
#-------------------------- Elasticsearch output -------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  # Scheme and port can be left out and will be set to the default (http and 9200)
  # In case you specify an additional path, the scheme is required: http://localhost:9200/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
  hosts: [
    https://elastic.something.com:443/
  ]
  username: "XXXXX"
  password: "XXXXX"
#============================== Template =====================================
setup.template.enabled: true
setup.template.name: "metricbeat"
# Elasticsearch template settings
setup.template.settings:
  index:
    number_of_shards: 1
    number_of_replicas: 0
#============================== Setup ILM =====================================
# Configure index lifecycle management (ILM).
# Set the prefix used in the index lifecycle write alias name. The default alias
# name is 'metricbeat-%{[agent.version]}'.
setup.ilm.rollover_alias: "metricbeat"
setup.ilm.policy_name: "metricbeat-policy"
#================================ Logging ======================================
# There are three options for the log output: syslog, file, stderr.
logging.to_files: true
logging.level: warning