Elastic Agent with ingest pipeline

Hi Team,

We are using Fleet-managed Elastic Agents to send log events to Elasticsearch. As part of this setup, we added the Custom Logs integration, and the log events pass through an ingest pipeline before being ingested into Elasticsearch. The pipeline uses a grok processor with the %{COMBINEDAPACHELOG} pattern for the Apache access logs, but the events are not getting parsed. All other processors work just fine; only the grok one fails. I am attaching a sample log event and a screenshot of the ingest pipeline below. Can anyone help me with this issue?

Hi @Ankita_Pachauri, could you provide more information on what your configuration looks like?

Also, I guess you are trying to monitor an Apache web server?

Hi Ashish,
Can you please tell me which configuration you need?

On the Elastic Agent side? Or did you configure it from Fleet?

Also, if you are monitoring Apache, it is worth checking out the Apache integration.

Hi Ashish,
Thanks for your response. However, I need to use the Custom Logs integration, as I have a few custom logs in addition to the Apache access logs.

Could you share the grok processor request that you created in your ingest pipeline?

Hi @Ankita_Pachauri

Please share 3-5 samples of the raw log lines

And your ingest pipeline.

Hi Ashish,
Please find the information below:
Sample Log:

122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] "GET / HTTP/1.1" 503 299 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"

Request:

PUT _ingest/pipeline/apache
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{COMBINEDAPACHELOG}"
        ]
      }
    }
  ]
}

Hi @Ankita_Pachauri

BTW, that log message parses with the built-in Apache integration, as @ashishtiwari1993 suggested... I would perhaps try starting with that.

POST _ingest/pipeline/logs-apache.access-1.20.0/_simulate
{
  "docs": [
    {
      "_source" : {
        "@timestamp" : "2024-07-04T17:42:06.606248379Z",
        "message" : "122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] \"GET / HTTP/1.1\" 503 299 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36\""
      }
    },
    {
      "_source" : {
        "@timestamp" : "2024-07-04T17:42:06.606248379Z",
        "message" : """122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] "GET / HTTP/1.1" 503 299 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" """
      }
    },
    {
      "_source" : {
        "@timestamp" : "2024-07-04T17:42:06.606248379Z",
        "message" : """127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)" """
      }
    }
  ]
}

It also parses with the pipeline you supplied, but you will need to do additional work to properly set the timestamp, etc... perhaps try the Apache integration first.

POST _ingest/pipeline/apache/_simulate
{
  "docs": [
    {
      "_source" : {
        "message" : "122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] \"GET / HTTP/1.1\" 503 299 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36\""
      }
    },
    {
      "_source" : {
        "message" : """122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] "GET / HTTP/1.1" 503 299 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" """
      }
    },
    {
      "_source" : {
        "message" : """127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)" """
      }
    }
  ]
}

Hi Ashish,
I have custom logs to integrate with this process; I used Apache logs just for testing purposes.

Could you share the custom logs then? You can check the complete log pattern behind COMBINEDAPACHELOG.
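
For reference, a quick way to see exactly which fields %{COMBINEDAPACHELOG} extracts is to simulate an inline pipeline against one of the sample lines from earlier in the thread (note that the captured field names differ between legacy and ECS-compatibility grok modes):

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{COMBINEDAPACHELOG}"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] \"GET / HTTP/1.1\" 503 299 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36\""
      }
    }
  ]
}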


Hi,
I am now using /var/log/messages for the custom logs integration.
Steps followed:

  1. Created an ingest pipeline:

PUT _ingest/pipeline/customlogs
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:host} %{WORD:process}: %{GREEDYDATA:message1}"
        ],
        "tag": "failprocessor",
        "ignore_failure": true
      }
    }
  ],
  "on_failure": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{SYSLOGTIMESTAMP:timestamp} %{GREEDYDATA:message1}"
        ]
      }
    },
    {
      "set": {
        "field": "message",
        "value": "failed!!!!!!!!!!!!"
      }
    }
  ]
}
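
Note that with "ignore_failure": true on the grok processor, failures are silently swallowed, so the pipeline-level on_failure handlers never run for that processor; dropping ignore_failure would let grok misses reach on_failure. As a quick sanity check, the pipeline can be exercised with _simulate against a made-up /var/log/messages line (a sketch; also note that %{WORD:process} will not match process names carrying a [pid] suffix):

POST _ingest/pipeline/customlogs/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "Jul  2 08:48:17 myhost sshd: Accepted password for admin from 122.161.52.27"
      }
    }
  ]
}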

  2. Created an index template:

PUT _index_template/customlogs
{
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "customlogs"
      }
    },
    "mappings": {
      "_routing": {
        "required": false
      },
      "numeric_detection": false,
      "dynamic_date_formats": [
        "strict_date_optional_time",
        "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"
      ],
      "dynamic": true,
      "_source": {
        "excludes": [],
        "includes": [],
        "enabled": true
      },
      "date_detection": true
    }
  },
  "index_patterns": [
    "logs-custom*"
  ],
  "data_stream": {
    "hidden": false,
    "allow_custom_routing": false
  },
  "composed_of": []
}
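
To double-check that this template actually wins for the target data stream and carries the default pipeline, the simulate-index API can help (the index name logs-custom-default below is an assumption based on the logs-custom* pattern and the default namespace):

# Returns the resolved settings/mappings, including index.default_pipeline
POST _index_template/_simulate_index/logs-custom-default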

  3. Added a Custom Logs integration with the path /var/log/messages:

PUT kbn:/api/fleet/package_policies/a417e116-044a-4b80-ba7e-1c98df3882aa
{
  "package": {
    "name": "log",
    "version": "2.3.1"
  },
  "name": "customlog",
  "namespace": "",
  "description": "",
  "policy_id": "36dc1784-0bd3-43be-9ac0-9ef8a98c13c5",
  "vars": {},
  "inputs": {
    "logs-logfile": {
      "enabled": true,
      "streams": {
        "log.logs": {
          "enabled": true,
          "vars": {
            "paths": [
              "/var/log/messages"
            ],
            "exclude_files": [],
            "ignore_older": "72h",
            "data_stream.dataset": "custom",
            "tags": [],
            "custom": ""
          }
        }
      }
    }
  }
}

But when we add the integration, there is no option to attach an ingest pipeline. Once the integration is set up, it creates a managed pipeline, and editing that shows a warning that changes can break Kibana. I tried adding the same pipeline later, but the processors are not being applied.
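
For reference, one quick way to check whether any pipeline touched the events at all is to look at the latest ingested document (the data stream name logs-custom-default is an assumption, derived from the dataset custom plus the default namespace):

GET logs-custom-default/_search
{
  "size": 1,
  "sort": [
    { "@timestamp": "desc" }
  ]
}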

Please help.

Hi @Ankita_Pachauri

Here is some background

What I would suggest is the following... clean up the templates, etc.

Go to Integrations, add a Custom Logs integration, set the dataset name to custom, and save it.

BTW, setting the dataset name to custom can be a bit confusing; something like customapp might be better.

This will create the templates / ingest pipelines, etc., and you will inherit all the good logs mappings.

It also creates the ingest pipeline framework... which is ready to automatically call your custom pipeline if you simply name it correctly.

So then just name your pipeline:
PUT _ingest/pipeline/logs-custom@custom

And it will be automatically called... this is the best way to do this...
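
For illustration, a minimal logs-custom@custom body might simply reuse the syslog grok from the earlier customlogs pipeline (a sketch, not the only valid shape):

PUT _ingest/pipeline/logs-custom@custom
{
  "description": "Custom processors called from the managed logs-custom pipeline",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:host} %{WORD:process}: %{GREEDYDATA:message1}"
        ]
      }
    }
  ]
}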

OR... you can also create all of this with one call:

POST kbn:/api/fleet/epm/custom_integrations
{
  "integrationName": "customapp",
  "datasets": [
    {
      "name": "customapp",
      "type": "logs"
    }
  ]
}

GET kbn:/api/fleet/epm/packages/customapp

# if you want to clean up
DELETE kbn:/api/fleet/epm/packages/customapp/1.0.0

And then name your ingest pipeline accordingly, in this case:

PUT _ingest/pipeline/logs-customapp@custom

Hope this helps

Thanks @stephenb, it worked!!
