Elastic Agent, single custom log to multiple indices via ingest pipeline

I am using Elastic Agent to collect a custom log from a Pi-hole server, and I would like to split the DNS logs and the DHCP logs, both of which are found in pihole.log, into different indices.

Collection itself works well and was a straightforward process, but when I use my ingest pipeline to conditionally route some of these documents to a separate index, nothing is written to that other index.

My simple ingest pipeline:

[
  {
    "grok": {
      "field": "message",
      "patterns": [
        "%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGPROG}"
      ]
    }
  },
  {
    "date": {
      "field": "timestamp",
      "formats": [
        "MMM dd HH:mm:ss",
        "MMM d HH:mm:ss"
      ],
      "timezone": "CST6CDT"
    }
  },
  {
    "set": {
      "field": "_index",
      "value": "dhcp",
      "if": "ctx?.program == 'dnsmasq-dhcp'"
    }
  }
]

When I test my pipeline with a document with program == dnsmasq-dhcp, the test shows the output correctly.
Input:

[
{
  "_index": ".ds-logs-pihole-default-2021.11.04-000001",
  "_type": "_doc",
  "_id": "ziJAQ30BlfGFPPvm8jRA",
  "_version": 1,
  "_score": 1,
  "_source": {
    "agent": {
      "hostname": "pihole",
      "name": "pihole",
      "id": "082ac260-4c92-437c-a1be-7ac20cad3c90",
      "ephemeral_id": "5bfbf7ce-503b-4e56-8a43-72a1b17c46d8",
      "type": "filebeat",
      "version": "7.15.1"
    },
    "log": {
      "file": {
        "path": "/var/log/pihole.log"
      },
      "offset": 13012150
    },
    "elastic_agent": {
      "id": "082ac260-4c92-437c-a1be-7ac20cad3c90",
      "version": "7.15.1",
      "snapshot": false
    },
    "pid": "1028",
    "program": "dnsmasq-dhcp",
    "message": "Nov 21 10:07:48 dnsmasq-dhcp[1028]: DHCPREQUEST(enp1s0) 192.168.3.5 10:08:b1:6d:40:4e ",
    "input": {
      "type": "log"
    },
    "@timestamp": "2021-11-21T10:07:48.000-06:00",
    "ecs": {
      "version": "1.11.0"
    },
    "data_stream": {
      "namespace": "default",
      "type": "logs",
      "dataset": "pihole"
    },
    "host": {
      "hostname": "pihole",
      "os": {
        "kernel": "5.11.0-40-generic",
        "codename": "focal",
        "name": "Ubuntu",
        "type": "linux",
        "family": "debian",
        "version": "20.04.3 LTS (Focal Fossa)",
        "platform": "ubuntu"
      },
      "containerized": false,
      "ip": [
        "192.168.2.2",
        "fe80::84b1:1fc2:bc0c:8ce1",
        "10.8.0.1",
        "fe80::c65b:fb20:25cf:e899"
      ],
      "name": "pihole",
      "id": "f6876d5930b84506882073106cc0096e",
      "mac": [
        "52:54:00:87:dc:2d"
      ],
      "architecture": "x86_64"
    },
    "event": {
      "dataset": "pihole"
    },
    "dhcp": [
      "boners"
    ],
    "timestamp": "Nov 21 10:07:48"
  },
  "fields": {
    "elastic_agent.version": [
      "7.15.1"
    ],
    "pid": [
      "1028"
    ],
    "host.hostname": [
      "pihole"
    ],
    "program": [
      "dnsmasq-dhcp"
    ],
    "host.mac": [
      "52:54:00:87:dc:2d"
    ],
    "host.ip": [
      "192.168.2.2",
      "fe80::84b1:1fc2:bc0c:8ce1",
      "10.8.0.1",
      "fe80::c65b:fb20:25cf:e899"
    ],
    "agent.type": [
      "filebeat"
    ],
    "host.os.version": [
      "20.04.3 LTS (Focal Fossa)"
    ],
    "host.os.kernel": [
      "5.11.0-40-generic"
    ],
    "host.os.name": [
      "Ubuntu"
    ],
    "agent.name": [
      "pihole"
    ],
    "host.name": [
      "pihole"
    ],
    "elastic_agent.snapshot": [
      false
    ],
    "host.id": [
      "f6876d5930b84506882073106cc0096e"
    ],
    "dhcp": [
      "boners"
    ],
    "timestamp": [
      "Nov 21 10:07:48"
    ],
    "host.os.type": [
      "linux"
    ],
    "elastic_agent.id": [
      "082ac260-4c92-437c-a1be-7ac20cad3c90"
    ],
    "data_stream.namespace": [
      "default"
    ],
    "host.os.codename": [
      "focal"
    ],
    "input.type": [
      "log"
    ],
    "log.offset": [
      13012150
    ],
    "agent.hostname": [
      "pihole"
    ],
    "message": [
      "Nov 21 10:07:48 dnsmasq-dhcp[1028]: DHCPREQUEST(enp1s0) 192.168.3.5 10:08:b1:6d:40:4e "
    ],
    "data_stream.type": [
      "logs"
    ],
    "host.architecture": [
      "x86_64"
    ],
    "@timestamp": [
      "2021-11-21T16:07:48.000Z"
    ],
    "agent.id": [
      "082ac260-4c92-437c-a1be-7ac20cad3c90"
    ],
    "ecs.version": [
      "1.11.0"
    ],
    "host.containerized": [
      false
    ],
    "host.os.platform": [
      "ubuntu"
    ],
    "data_stream.dataset": [
      "pihole"
    ],
    "log.file.path": [
      "/var/log/pihole.log"
    ],
    "agent.ephemeral_id": [
      "5bfbf7ce-503b-4e56-8a43-72a1b17c46d8"
    ],
    "agent.version": [
      "7.15.1"
    ],
    "host.os.family": [
      "debian"
    ],
    "event.dataset": [
      "pihole"
    ]
  }
}
]

Output of the test document:

{
  "docs": [
    {
      "doc": {
        "_index": "dhcp",
        "_type": "_doc",
        "_id": "ziJAQ30BlfGFPPvm8jRA",
        "_version": "1",
        "_source": {
          "agent": {
            "name": "pihole",
            "hostname": "pihole",
            "id": "082ac260-4c92-437c-a1be-7ac20cad3c90",
            "ephemeral_id": "5bfbf7ce-503b-4e56-8a43-72a1b17c46d8",
            "type": "filebeat",
            "version": "7.15.1"
          },
          "log": {
            "offset": 13012150,
            "file": {
              "path": "/var/log/pihole.log"
            }
          },
          "elastic_agent": {
            "version": "7.15.1",
            "snapshot": false,
            "id": "082ac260-4c92-437c-a1be-7ac20cad3c90"
          },
          "pid": "1028",
          "program": "dnsmasq-dhcp",
          "message": "Nov 21 10:07:48 dnsmasq-dhcp[1028]: DHCPREQUEST(enp1s0) 192.168.3.5 10:08:b1:6d:40:4e ",
          "input": {
            "type": "log"
          },
          "@timestamp": "2021-11-21T10:07:48.000-06:00",
          "ecs": {
            "version": "1.11.0"
          },
          "data_stream": {
            "namespace": "default",
            "type": "logs",
            "dataset": "pihole"
          },
          "host": {
            "hostname": "pihole",
            "os": {
              "kernel": "5.11.0-40-generic",
              "codename": "focal",
              "name": "Ubuntu",
              "type": "linux",
              "family": "debian",
              "version": "20.04.3 LTS (Focal Fossa)",
              "platform": "ubuntu"
            },
            "containerized": false,
            "ip": [
              "192.168.2.2",
              "fe80::84b1:1fc2:bc0c:8ce1",
              "10.8.0.1",
              "fe80::c65b:fb20:25cf:e899"
            ],
            "name": "pihole",
            "id": "f6876d5930b84506882073106cc0096e",
            "mac": [
              "52:54:00:87:dc:2d"
            ],
            "architecture": "x86_64"
          },
          "event": {
            "dataset": "pihole"
          },
          "dhcp": [
            "boners"
          ],
          "timestamp": "Nov 21 10:07:48"
        },
        "_ingest": {
          "timestamp": "2021-11-24T01:45:23.516787543Z"
        }
      }
    }
  ]
}

So it appears that my ingest pipeline is correctly changing the value of _index to my new index, "dhcp".

However no documents are ever added to that index.

Thoughts on how to troubleshoot where this is failing?

Welcome to our community! :smiley: Thanks also for formatting your code and samples, it makes it heaps easier to read!

Have you considered splitting this up into 3 pipelines:

  1. determine whether this is a DHCP or a DNS request, then use the pipeline processor (see "Pipeline processor" in the Elasticsearch Guide [7.15])
  2. send the events to either a specific DNS or a specific DHCP pipeline, so that each type ends up in its own index
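
As a rough sketch of that layout, the Python snippet below builds the body of a routing pipeline: it reuses the grok step from the question so that `program` is populated, then adds one conditional `pipeline` processor per program. The sub-pipeline names `pihole-dhcp` and `pihole-dns` are hypothetical, and `dnsmasq` as the program value for DNS lines is an assumption, not something shown in this thread.

```python
import json

# Grok step copied from the pipeline in the question; it populates `program`.
GROK_STEP = {
    "grok": {
        "field": "message",
        "patterns": ["%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGPROG}"],
    }
}


def build_router_pipeline(routes):
    """Build a pipeline body that hands each document to a sub-pipeline by program."""
    processors = [GROK_STEP]
    for program, sub_pipeline in routes.items():
        processors.append({
            "pipeline": {
                "name": sub_pipeline,
                # Same Painless condition style as the `set` processor in the question.
                "if": f"ctx?.program == '{program}'",
            }
        })
    return {
        "description": "Route Pi-hole log lines by program",
        "processors": processors,
    }


# Hypothetical sub-pipeline names; 'dnsmasq' for DNS lines is an assumption.
ROUTES = {"dnsmasq-dhcp": "pihole-dhcp", "dnsmasq": "pihole-dns"}

router = build_router_pipeline(ROUTES)
print(json.dumps(router, indent=2))
```

The printed JSON is the kind of body that could be stored with PUT _ingest/pipeline/<name>; each sub-pipeline would then hold its own `set` processor for `_index`.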

Also are you sending everything to a single index, or are you using ILM?

Thanks for the quick reply. Yes, I had set it up as two pipelines, where the first one identifies the documents that are DHCP logs and runs a second pipeline that changes the index. In both scenarios, the result is the same. When I test a document using the Kibana interface, the output shows the index change as expected, but the documents that should be routed to that second index are never written to it.

They are also not written to the first index - they are dropped somewhere.

I have not set up ILM. I did manually create the other index, "dhcp", and verified that I could write data to it by manually adding a document via the API.

Can you share both pipelines?

Not using ILM is not ideal for time-based data, as your index will get progressively larger and harder to manage. Take a look at ILM: you can keep your index naming, but it will be more efficient under the hood.

Re: ILM - one step at a time. I'll take a look at ILM once I get the data in.

Re: the other pipeline - I deleted it in an attempt to simplify and troubleshoot things, and merged it into a single pipeline that tries to accomplish just one task: splitting the DHCP logs into the dhcp index.

Note: I have also tried adding on_failure handlers as well as setting ignore_failure, with no help. My pipeline appears to be functioning correctly, as shown when I test it with the document quoted above. The documents instead seem to be dropped after the pipeline runs and before they are indexed. I'm just not sure how to troubleshoot where things are failing with these missing documents.

I think this has to do with what is set up by the Elastic Agent custom log integration, not with the pipeline itself. When I use the API to manually post data to that pipeline, the documents are processed and routed correctly.
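
For reference, a manual post through the pipeline can be sketched roughly as follows. This minimal Python sketch only builds the index request (it does not send it) using the `pipeline` query parameter of the index API. The base URL `http://localhost:9200`, the pipeline name `pihole-split`, and the target `logs-pihole-default` are assumptions for illustration, not values confirmed in this thread.

```python
import json
import urllib.request


def build_index_request(base_url, target, pipeline, doc):
    """Build (but do not send) a POST that indexes `doc` through an ingest pipeline."""
    url = f"{base_url}/{target}/_doc?pipeline={pipeline}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sample message taken from the test document in the question.
doc = {
    "message": "Nov 21 10:07:48 dnsmasq-dhcp[1028]: DHCPREQUEST(enp1s0) 192.168.3.5 10:08:b1:6d:40:4e "
}

# Hypothetical host, target, and pipeline name.
req = build_index_request("http://localhost:9200", "logs-pihole-default", "pihole-split", doc)
print(req.get_method(), req.full_url)

# To actually send it (a secured cluster would also need credentials):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

The response body of a real request reports the index the document landed in, or the error if the write was rejected, which makes it a useful place to compare against what the agent-shipped documents do.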

Perhaps something about how the Elastic Agent custom log integration creates a data stream doesn't allow me to simply split off some data and change the index?