Are ingest pipelines created by Elastic Agent expected to differ from the same type of pipeline created by Beats?

jerrac · October 18, 2023, 4:33pm

So, I've been assuming that the ingest pipelines created by Elastic Agent and those created by Filebeat would be pretty much identical. I expected some differences, but nothing that would make them output different fields.

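Today I found that there are at least some major differences in the Apache ingest pipelines. Specifically, the grok patterns in the Elastic Agent pipeline cover more than the ones in the Filebeat pipeline: the Agent version also parses an optional X-Forwarded-For field into apache.access.remote_addresses.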

The end result is that Filebeat doesn't support the same log formats that the Agent integration supports.

Filebeat's grok processor:

  {
    "grok": {
      "field": "event.original",
      "patterns": [
        "%{IPORHOST:destination.domain} %{IPORHOST:source.ip} - %{DATA:user.name} \\[%{HTTPDATE:apache.access.time}\\] \"(?:%{WORD:http.request.method} %{DATA:_tmp.url_orig} HTTP/%{NUMBER:http.version}|-)?\" %{NUMBER:http.response.status_code:long} (?:%{NUMBER:http.response.body.bytes:long}|-)( \"%{DATA:http.request.referrer}\")?( \"%{DATA:user_agent.original}\")?",
        "%{IPORHOST:source.address} - %{DATA:user.name} \\[%{HTTPDATE:apache.access.time}\\] \"(?:%{WORD:http.request.method} %{DATA:_tmp.url_orig} HTTP/%{NUMBER:http.version}|-)?\" %{NUMBER:http.response.status_code:long} (?:%{NUMBER:http.response.body.bytes:long}|-)( \"%{DATA:http.request.referrer}\")?( \"%{DATA:user_agent.original}\")?",
        "%{IPORHOST:source.address} - %{DATA:user.name} \\[%{HTTPDATE:apache.access.time}\\] \"-\" %{NUMBER:http.response.status_code:long} -",
        "\\[%{HTTPDATE:apache.access.time}\\] %{IPORHOST:source.address} %{DATA:apache.access.ssl.protocol} %{DATA:apache.access.ssl.cipher} \"%{WORD:http.request.method} %{DATA:_tmp.url_orig} HTTP/%{NUMBER:http.version}\" (-|%{NUMBER:http.response.body.bytes:long})"
      ],
      "ignore_missing": true
    }
  }

Agent's grok processor:

  {
    "grok": {
      "field": "event.original",
      "patterns": [
        "(%{IPORHOST:destination.domain} )?%{IPORHOST:source.address} - %{DATA:user.name} \\[%{HTTPDATE:apache.access.time}\\] \"(?:%{WORD:http.request.method} %{DATA:_tmp.url_orig} HTTP/%{NUMBER:http.version}|-)?\" %{NUMBER:http.response.status_code:long} (?:%{NUMBER:http.response.body.bytes:long}|-)( \"%{DATA:http.request.referrer}\")?( \"%{DATA:user_agent.original}\")?( X-Forwarded-For=\"%{ADDRESS_LIST:apache.access.remote_addresses}\")?",
        "%{IPORHOST:source.address} - %{DATA:user.name} \\[%{HTTPDATE:apache.access.time}\\] \"-\" %{NUMBER:http.response.status_code:long} -",
        "\\[%{HTTPDATE:apache.access.time}\\] %{IPORHOST:source.address} %{DATA:apache.access.ssl.protocol} %{DATA:apache.access.ssl.cipher} \"%{WORD:http.request.method} %{DATA:_tmp.url_orig} HTTP/%{NUMBER:http.version}\" (-|%{NUMBER:http.response.body.bytes:long})"
      ],
      "ignore_missing": true,
      "pattern_definitions": {
        "ADDRESS_LIST": "(%{IP})(\"?,?\\s*(%{IP}))*"
      }
    }
  }
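
To make the extra piece concrete, here is a rough Python approximation of the Agent pipeline's custom ADDRESS_LIST definition and the optional X-Forwarded-For suffix. It is only a sketch under some assumptions: grok's %{IP} also covers IPv6, while the IPV4 regex below is a simplified stand-in, Python's re module is not the regex engine Elasticsearch uses, and the function name forwarded_addresses is made up for illustration.

```python
import re

# Simplified stand-in for grok's %{IP} (IPv4 only; the real pattern also covers IPv6).
IPV4 = r"(?:\d{1,3}\.){3}\d{1,3}"

# Python translation of the pipeline's ADDRESS_LIST definition:
#   (%{IP})("?,?\s*(%{IP}))*
ADDRESS_LIST = rf'(?:{IPV4})(?:"?,?\s*(?:{IPV4}))*'

# The optional suffix that appears only in the Agent pattern.
XFF_SUFFIX = re.compile(rf' X-Forwarded-For="(?P<remote_addresses>{ADDRESS_LIST})"')


def forwarded_addresses(line: str) -> list[str]:
    """Return the IPs in an X-Forwarded-For="..." suffix, or [] if there is none."""
    match = XFF_SUFFIX.search(line)
    if match is None:
        return []
    return re.findall(IPV4, match.group("remote_addresses"))
```

For a log line that ends in X-Forwarded-For="10.0.0.1, 10.0.0.2", this returns the two addresses. For a line without the suffix it returns an empty list, which roughly mirrors the field simply not being populated.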

So far I've only investigated the Apache access pipelines, but I'm guessing there could be differences in other pipelines as well.
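
A rough way to check other pipelines would be to diff their grok patterns. The sketch below assumes each argument is a single pipeline body (the object containing the "processors" array) that has already been loaded into a Python dict. The helper names grok_patterns and diff_patterns are hypothetical, the comparison is an exact string match, and grok processors nested inside other processors or in on_failure blocks are not inspected.

```python
def grok_patterns(pipeline: dict) -> list[str]:
    """Collect every grok pattern from a pipeline body's top-level processors."""
    patterns = []
    for processor in pipeline.get("processors", []):
        grok = processor.get("grok")
        if grok:
            patterns.extend(grok.get("patterns", []))
    return patterns


def diff_patterns(filebeat: dict, agent: dict) -> dict:
    """Report patterns that appear in only one of the two pipelines."""
    fb = grok_patterns(filebeat)
    ag = grok_patterns(agent)
    return {
        "only_in_filebeat": [p for p in fb if p not in ag],
        "only_in_agent": [p for p in ag if p not in fb],
    }
```

Because the comparison is textual, two patterns that differ only by an optional group, like the first patterns in the two processors above, show up as different entries on both sides.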

Is this expected? If so, why?

leandrojmp · October 18, 2023, 4:39pm

Assuming that Elastic's plan is to replace the Beats family with Elastic Agent and the Filebeat modules with Elastic Agent integrations, I don't think the Filebeat modules are getting the same attention as the integrations. As a result, some changes to the integrations' ingest pipelines are not reflected in the Filebeat modules.

jerrac · October 18, 2023, 4:43pm

I'd believe that, but the fact that Agent doesn't support hints-based auto-discovery while Filebeat does kinda makes me wonder.
