Separate indexes for each Kubernetes namespace

Hi Team

Setup:

The ELK cluster is set up using docker-compose on one beefy bare-metal server.
On the Kubernetes side, I am running Filebeat as a DaemonSet to ship container logs to Logstash.

Our Kubernetes cluster will have around 50 namespaces, each running around 10 pods.

Question:

I want to have a separate index created for each Kubernetes namespace. Also, whenever a new namespace is added, an index should be auto-created for it. How do I achieve this?

Configurations below.

Filebeat configuration

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.config:
      inputs:
        # Mounted `filebeat-inputs` configmap:
        path: ${path.config}/inputs.d/*.yml
        # Reload inputs configs as they change:
        reload.enabled: false
      modules:
        path: ${path.config}/modules.d/*.yml
        # Reload module configs as they change:
        reload.enabled: false

    # To enable hints based autodiscover, remove `filebeat.config.inputs` configuration and uncomment this:
    #filebeat.autodiscover:
    #  providers:
    #    - type: kubernetes
    #      hints.enabled: true

    processors:
      - add_cloud_metadata:
    output.logstash:
      hosts: ['${LOGSTASH_HOST}:${LOGSTASH_PORT}']
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-inputs
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  kubernetes.yml: |-
    - type: docker
      containers.ids:
      - "*"
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
spec:
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      containers:
      - name: filebeat
        image: private_registry/filebeat:6.6.1
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: LOGSTASH_HOST
          value: "logstash_hostname"
        - name: LOGSTASH_PORT
          value: "5044"
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: inputs
          mountPath: /usr/share/filebeat/inputs.d
          readOnly: true
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /data/docker/containers
      - name: inputs
        configMap:
          defaultMode: 0600
          name: filebeat-inputs
      # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    k8s-app: filebeat
---

LogStash pipeline configuration

cat pipeline/logstash.conf
## Add your filters / logstash plugins configuration here
input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

Thanks

I am a newbie to ELK stack. Any help will be appreciated.

If the events have a field on them that contains the namespace name, then you can use a reference to that field in the index option of the elasticsearch output:

index => "%{[fieldContainingNamespaceName]}"

If you have 50 namespaces then you are going to end up with a large number of shards, which can impact performance in elasticsearch. You may want to reduce the number of shards per index.
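One way to do that (a sketch, assuming Elasticsearch 6.x and that the indexes share a common prefix; the `k8s-namespaces` template name and `k8s-*` pattern are examples, not something from your config) is an index template that lowers the shard count from the 6.x default of five:

```
PUT _template/k8s-namespaces
{
  "index_patterns": ["k8s-*"],
  "settings": {
    "index.number_of_shards": 1
  }
}
```

With one primary shard per namespace index, 50 namespaces means 50 primary shards instead of 250.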


Hi @Badger, thanks for the hint and the recommendations. I was able to get the namespaces as index names.

In case someone is looking for the config:

Here's what the fields look like when queried from Kibana.

{
  "_index": "test-1",
  "_type": "doc",
  "_id": "upAUFGkB4pTTTbuh8MFh",
  "_version": 1,
  "_score": 0,
  "_source": {
    "host": {
      "name": "filebeat-kztmv"
    },
    "source": "/var/lib/docker/containers/41ce86fd65c4de705ef65331f79cc946033140a2379ead7ab3443e587f73bb7a/41ce86fd65c4de705ef65331f79cc946033140a2379ead7ab3443e587f73bb7a-json.log",
    "message": "Running from: /usr/share/jenkins/ref/warfile/jenkins.war",
    "prospector": {
      "type": "docker"
    },
    "input": {
      "type": "docker"
    },
    "offset": 0,
    "@version": "1",
    "kubernetes": {
      "pod": {
        "uid": "71f01698-3672-11e9-8bb8-246e96538240",
        "name": "test-1-5f87d986d9-jpt9s"
      },
      "labels": {
        "app": "test-1",
        "pod-template-hash": "1943854285"
      },
      "namespace": "test-1",
      "replicaset": {
        "name": "test-1-5f87d986d9"
      },
      "container": {
        "name": "test-1"
      },
      "node": {
        "name": "nodename"
      }
    },
    "tags": [
      "beats_input_codec_plain_applied"
    ],
    "beat": {
      "name": "filebeat-kztmv",
      "hostname": "filebeat-kztmv",
      "version": "6.6.1"
    },
    "log": {
      "file": {
        "path": "/var/lib/docker/containers/41ce86fd65c4de705ef65331f79cc946033140a2379ead7ab3443e587f73bb7a/41ce86fd65c4de705ef65331f79cc946033140a2379ead7ab3443e587f73bb7a-json.log"
      }
    },
    "stream": "stdout",
    "@timestamp": "2019-02-22T07:21:48.929Z",
    "docker": {
      "container": {
        "name": "k8s_test-1_test-1-5f87d986d9-jpt9s_test-1_71f01698-3672-11e9-8bb8-246e96538240_0",
        "labels": {
          "annotation": {
            "io": {
              "kubernetes": {
                "pod": {
                  "terminationGracePeriod": "10"
                },
                "container": {
                  "hash": "eb4c7704",
                  "restartCount": "0",
                  "ports": "[{\"containerPort\":8080,\"protocol\":\"TCP\"},{\"containerPort\":50000,\"protocol\":\"TCP\"}]",
                  "terminationMessagePath": "/dev/termination-log",
                  "terminationMessagePolicy": "File"
                }
              }
            }
          },
          "io": {
            "kubernetes": {
              "container": {
                "name": "test-1",
                "logpath": "/var/log/pods/71f01698-3672-11e9-8bb8-246e96538240/test-1/0.log"
              },
              "pod": {
                "name": "test-1-5f87d986d9-jpt9s",
                "uid": "71f01698-3672-11e9-8bb8-246e96538240",
                "namespace": "test-1"
              },
              "sandbox": {
                "id": "eb6c9dcee2c4274e6360009c338b021e8198b45481c86c1c3b1bd5647b35a936"
              },
              "docker": {
                "type": "container"
              }
            }
          }
        },
        "id": "41ce86fd65c4de705ef65331f79cc946033140a2379ead7ab3443e587f73bb7a",
        "image": "private_registrybasemaster@sha256:ee6be361d63ea3a68372662a4f3743a73aabd69593fd73b79eebc12217e3225f"
      }
    }
  },
  "fields": {
    "@timestamp": [
      "2019-02-22T07:21:48.929Z"
    ]
  }
}

logstash.conf for the Logstash pipeline was modified to this:

## Add your filters / logstash plugins configuration here
input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    manage_template => false
    index => "%{[kubernetes][namespace]}"
    document_type => "%{[@metadata][type]}"
  }
}

@Badger / All

Now that I have the indexes recognized by Elasticsearch, how do I have them created automatically in Elasticsearch?

In other words, when there is a new namespace, I want the index with that namespace to be created automatically in elasticsearch. Could you please guide me on that ?

Thanks

That will happen automatically. If there is a new value in `index => "%{[kubernetes][namespace]}"`, then a new index will be created.
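This relies on Elasticsearch's automatic index creation, which is enabled by default. If `action.auto_create_index` has been restricted on your cluster, the namespace indexes need to be allowed in elasticsearch.yml (the `k8s-*` pattern below assumes a common index prefix and is only an example):

```yaml
# elasticsearch.yml -- only needed if automatic index creation
# has been restricted; "+" allows the pattern, "-*" denies everything else
action.auto_create_index: "+k8s-*,-*"
```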

Thanks @Badger, you were right. The indexes are automatically created in Elasticsearch.

Is there a way to create index patterns in Kibana automatically for all these Elasticsearch indexes? Currently Kibana lists the indexes available in Elasticsearch, and we need to manually create an index pattern in Kibana before we can search for specific data.

Thanks

If using * as an index pattern does not work, then add a constant prefix to the index name...

index => "something-%{[kubernetes][namespace]}"

Since our Kubernetes namespaces are not unique, I added k8s- as the prefix in the Logstash output, so that my ES indexes look like the ones below.

k8s-ingress-nginx
k8s-kube-system
k8s-test-1

However, in Kibana, when I provide the index pattern as k8s-*, it creates a single index pattern, obviously matching all three namespaces.

I am looking for a separate index pattern for each namespace. How do we achieve that? Could you please help?

I don't have Kibana installed, but if I recall correctly it saves things like index patterns in an index called '.kibana'. Adding an index pattern would involve either adding a document or adding content to a document. You could go to Dev Tools / Console and see what is in that index.

Directly manipulating configuration documents is almost certainly unsupported, and may break upgrades, but I have been known to do it.
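If manual edits to .kibana feel too risky, a less invasive sketch (assuming Kibana 6.5 or later, where the saved objects HTTP API exists; the kibana:5601 host and the namespace list below are placeholders, not values from this thread) is to script index-pattern creation through that API:

```python
# Sketch: create one Kibana index pattern per k8s- index via the
# saved objects API. Host and namespace names are examples -- adjust
# them for your cluster.
import json
import urllib.request

KIBANA = "http://kibana:5601"

def index_pattern_request(namespace):
    """Build the saved-objects POST request for one namespace."""
    pattern_id = f"k8s-{namespace}"
    url = f"{KIBANA}/api/saved_objects/index-pattern/{pattern_id}"
    body = json.dumps({
        "attributes": {"title": pattern_id, "timeFieldName": "@timestamp"}
    }).encode()
    headers = {
        "kbn-xsrf": "true",          # required by Kibana's HTTP API
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers,
                                  method="POST")

if __name__ == "__main__":
    for ns in ["ingress-nginx", "kube-system", "test-1"]:
        with urllib.request.urlopen(index_pattern_request(ns)) as resp:
            print(ns, resp.status)
```

Running it once after adding a namespace (or from a cron job that lists namespaces with kubectl) would keep the patterns in sync without touching .kibana directly.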

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.