Elasticsearch 8.5.2 ingest pipeline issue

Hi all,

I recently posted about a problem with ingest pipelines that work in testing mode but not when actually ingesting. I have the same issue and I'm still stuck, even though I've tried many tricks (removing special characters from the syslog string, etc.).

So, I'm currently working on a bind9 log line, which is "<133>Jan 27 14:35:01 ns1 bind-logs 27-Jan-2023 14:35:01.390 queries: info: client @0x7f6ca42929b8 10.1.14.40#55168 (test.test): query: test.test IN AAAA +E(0) (10.1.17.20)"

My pipeline test result is

{
  "docs": [
    {
      "doc": {
        "_index": ".ds-logs-udp.dns-default-2023.01.03-000001",
        "_id": "zOFw84UB-z0M9c7cU43w",
        "_version": "-3",
        "_source": {
          "input": {
            "type": "udp"
          },
          "agent": {
            "name": "rsyslog",
            "id": "79647a28-6e08-4fcb-9aad-78c8e67b3311",
            "type": "filebeat",
            "ephemeral_id": "aa0bdaa4-4759-4571-80ea-a12493ec14df",
            "version": "8.5.2"
          },
          "@timestamp": "2023-01-27T13:35:01.394Z",
          "ecs": {
            "version": "8.5.0"
          },
          "log": {
            "source": {
              "address": "127.0.0.1:45044"
            }
          },
          "data_stream": {
            "namespace": "default",
            "type": "logs",
            "dataset": "udp.dns"
          },
          "syslog5424_pri": "133",
          "elastic_agent": {
            "id": "79647a28-6e08-4fcb-9aad-78c8e67b3311",
            "version": "8.5.2",
            "snapshot": false
          },
          "syslog5424_sd": "27-Jan-2023 14:35:01.390 queries: info: client @0x7f6ca42929b8 10.1.14.40#55168 (test.test): query: test.test IN AAAA +E(0) (10.1.17.20)",
          "event": {
            "original": "<133>Jan 27 14:35:01 ns1 bind-logs 27-Jan-2023 14:35:01.390 queries: info: client @0x7f6ca42929b8 10.1.14.40#55168 (test.test): query: test.test IN AAAA +E(0) (10.1.17.20)",
            "destination": "10.1.17.20",
            "client_ip": "10.1.14.40",
            "dataset": "udp.dns",
            "query_value": "test.test",
            "record_type": "AAAA"
          },
          "tags": [
            "syslog",
            "forwarded",
            "dns"
          ]
        },
        "_ingest": {
          "timestamp": "2023-01-27T14:34:49.949673256Z"
        }
      }
    }
  ]
}

You can see that I do get the event.destination, event.client_ip, etc. values. But when I look at the indexed documents, those fields are not present.
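For reference, this is roughly how the indexed documents can be pulled back from the Dev Tools console to compare with the simulation output (a sketch, assuming the data stream name logs-udp.dns-default from the test result above):

GET logs-udp.dns-default/_search
{
  "size": 1,
  "sort": [ { "@timestamp": "desc" } ]
}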

My pipeline code is:


[
  {
    "set": {
      "field": "ecs.version",
      "value": "8.5.0"
    }
  },
  {
    "rename": {
      "field": "message",
      "target_field": "event.original",
      "ignore_missing": true
    }
  },
  {
    "grok": {
      "field": "event.original",
      "patterns": [
        "%{SYSLOG5424PRI}%{GREEDYDATA:syslog5424_sd}$"
      ]
    }
  },
  {
    "gsub": {
      "field": "syslog5424_sd",
      "pattern": ".*\\b-logs ",
      "replacement": "",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "grok": {
      "field": "syslog5424_sd",
      "patterns": [
        "%{MONTHDAY}[-]%{MONTH}[-]%{YEAR}\\s*%{TIME}\\s*%{WORD}[:]\\s*%{WORD}[:]\\s*%{WORD}\\s*%{DATA}\\s*%{IP:event.client_ip}[#]%{NUMBER}\\s*\\(%{HOSTNAME}\\)[:]\\s*query:\\s*%{HOSTNAME:event.query_value}\\s*%{WORD}\\s*%{WORD:event.record_type}\\s*%{NOTSPACE}\\s*\\(%{IP:event.destination}\\)"
      ],
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.client_data",
      "type": "string",
      "target_field": "event.original : *bind* and not event.original : *audit*",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.destination",
      "type": "ip",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.record_type",
      "type": "string",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.client_port",
      "type": "integer",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.queries",
      "type": "string",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.client",
      "type": "string",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.client_ip",
      "type": "ip",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.day",
      "type": "integer",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.info",
      "type": "string",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.query_value",
      "type": "string",
      "ignore_missing": true,
      "ignore_failure": true
    }
  },
  {
    "convert": {
      "field": "event.misc",
      "type": "string",
      "ignore_missing": true,
      "ignore_failure": true
    }
  }
]

Do you please have any clue that could help me find a way out, or tell me how to look at the logs directly on the Elastic host? I can't find the right log file.

Thanks for your help

Pierre

Hi @Pierre_LANCASTRE

Couple more questions.

What version of components?

What are you using for ingest... filebeat?

What index or data stream are you writing to?

How are you testing? What exactly do you mean by testing mode? Kibana Dev Tools?
The Kibana ingest pipeline editor?

Hi Stephen,
I'm using Kibana 8.5.2.
I'm retrieving the logs from my rsyslog server with "Custom UDP Logs" agent integrations. I have one per kind of log (dns, clamav, etc.), each pushing logs into a dedicated dataset, pipeline, etc.
For these logs I've got:
dataset name: udp.dns
ingest pipeline: custom-pipe-logs-udp.dns
all other parameters of the agent are the defaults

For testing, I edit the pipeline, import a test document, and check the output. All ticks are green on the different processors and I can see in the output that I get the expected keys/values, but when I look in Analytics --> Discover, the event.xxx fields are not there.

NB: I also used the Grok Debugger from the Dev Tools console and my grok pattern works well.
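For reference, the same test can also be run from the Dev Tools console with the simulate API (a rough sketch using my pipeline name and the raw syslog line as the message field):

POST _ingest/pipeline/custom-pipe-logs-udp.dns/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "<133>Jan 27 14:35:01 ns1 bind-logs 27-Jan-2023 14:35:01.390 queries: info: client @0x7f6ca42929b8 10.1.14.40#55168 (test.test): query: test.test IN AAAA +E(0) (10.1.17.20)"
      }
    }
  ]
}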

And you set the ingest pipeline? And did you apply syslog parsing or not?

As for logs, you should be able to look at the agent logs for that agent?

Let me take a look and check back a bit later...

I don't use the syslog parser; the few times I tried it I got issues. I will check the agent logs.

OK, in the logs I've got:

[elastic_agent.filebeat][warn] Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Date(2023, time.January, 27, 15, 33, 51, 903324874, time.Local), Meta:{"input_id":"udp-udp-d5f9c051-1202-4568-acb7-631928755d7a","pipeline":"custom-pipe-logs-udp.dns","raw_index":"logs-udp.dns-default","stream_id":"udp-udp.generic-d5f9c051-1202-4568-acb7-631928755d7a","truncated":false}, Fields:{"agent":{"ephemeral_id":"aa0bdaa4-4759-4571-80ea-a12493ec14df","id":"79647a28-6e08-4fcb-9aad-78c8e67b3311","name":"rsyslog","type":"filebeat","version":"8.5.2"},"data_stream":{"dataset":"udp.dns","namespace":"default","type":"logs"},"ecs":{"version":"8.0.0"},"elastic_agent":{"id":"79647a28-6e08-4fcb-9aad-78c8e67b3311","snapshot":false,"version":"8.5.2"},"event":{"dataset":"udp.dns"},"input":{"type":"udp"},"log":{"source":{"address":"127.0.0.1:45044"}},"message":"\u003c133\u003eJan 27 16:33:51 ns1 bind-logs 27-Jan-2023 16:33:51.899 queries: info: client @0x7f6ca42adad8 192.168.200.4#62380 (discuss.elastic.co): query: discuss.elastic.co IN A + (10.1.17.20)","tags":["syslog","forwarded","dns"]}, Private:interface {}(nil), TimeSeries:false}, Flags:0x1, Cache:publisher.EventCache{m:mapstr.M(nil)}} (status=400): {"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"illegal_argument_exception","reason":"Limit of total fields [1000] has been exceeded while adding new fields [1]"}}, dropping event!

I think I'm missing something, since as far as I can tell I don't have 1000 fields.

I'll try to increase the limit.
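Something like this should raise the limit on the backing index from the test output above (a sketch I haven't validated here, and it only treats the symptom rather than the cause):

PUT .ds-logs-udp.dns-default-2023.01.03-000001/_settings
{
  "index.mapping.total_fields.limit": 2000
}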

I figured out the problem. Going into Data Streams, I looked at the auto-generated index (.ds-logs-udp.dns-default-2023.01.03-000001) linked to the data stream "logs-udp.dns-default".

When I go to the mapping section, there are a lot of undesired fields, perhaps a side effect of the dynamic mapping of the default UDP index or of bad pipeline rules, I don't know.
For example

  "Jan 24 17:44:21 ns1 telegraf[13876]: 2023-01-24T16:44:21Z E! [outputs": {
            "properties": {
              "influxdb_v2] When writing to [http://127": {
                "properties": {
                  "0": {
                    "properties": {
                      "0": {
                        "properties": {
                          "1:8086]: Post \"http://127": {
                            "properties": {
                              "0": {
                                "properties": {
                                  "0": {
                                    "properties": {
                                      "1:8086/api/v2/write?bucket": {
                                        "type": "keyword",
                                        "ignore_above": 1024
                                      }

Is there any good practice to avoid that? Thanks for your help.

Yup something is wrong... looks like values are becoming field names...

Will probably need to clean up and fix... (You may need to clean the underlying index or data stream as the mapping may be corrupt / mapping explosion.)
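(A sketch of what that could look like in Dev Tools, using the names from your posts above; note that deleting the data stream also deletes the documents stored in it:)

GET .ds-logs-udp.dns-default-2023.01.03-000001/_mapping

DELETE _data_stream/logs-udp.dns-default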

The first thing I would do is remove the pipeline and see what is arriving in Elasticsearch.

Then look at / share what the document looks like

Then build the pipeline

Thanks for your reply. I have deleted the data stream (which removed the index) and now I only have the wanted fields. I will proceed like that while I work on the different log formats. Thanks a lot for your help; we can close the thread, I think.


Good to hear,

if you paste a sample document that you have ingested... perhaps we can take a look...

Threads close on their own...

Hmm, I ran what you had with the sample you provided and it worked fine.
