Elastic Agent with ingest pipeline

Hi Team,

We are using Fleet-managed Elastic Agents to send log events to Elasticsearch. The custom log integration has been set up as part of this process, and the log events are passed through an ingest pipeline before they are ingested into Elasticsearch. The pipeline uses a Grok processor with the %{COMBINEDAPACHELOG} pattern for the Apache access logs; however, the events are not getting parsed. Other processors work just fine, but the Grok one does not. I am attaching a sample log event and a screenshot of the ingest pipeline below. Can anyone help me with this issue?

Hi @Ankita_Pachauri, could you provide more information about what your configuration looks like?

Also, I guess you are trying to monitor an Apache web server?

Hi Ashish,
Can you please tell me which configuration you need?

On the Elastic Agent side? Or did you configure it from Fleet?

Also, in the case of Apache monitoring, it's worth checking the Apache integration.

Hi Ashish,
Thanks for your response. However, I need to use the custom log integration, as I have a few custom logs in addition to the Apache access logs.

Could you share the grok processor request you created in your ingest pipeline?

Hi @Ankita_Pachauri

Please share 3-5 samples of the raw log lines

And your ingest pipeline.

Hi Ashish,
Please find the information below:
Sample Log:

122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] "GET / HTTP/1.1" 503 299 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"

Request:

PUT _ingest/pipeline/apache
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{COMBINEDAPACHELOG}"
        ]
      }
    }
  ]
}

Hi @Ankita_Pachauri

BTW that log message parses with the built-in Apache integration as @ashishtiwari1993 suggested ... I would perhaps try starting with that.

POST _ingest/pipeline/logs-apache.access-1.20.0/_simulate
{
  "docs": [
    {
      "_source" : {
        "@timestamp" : "2024-07-04T17:42:06.606248379Z",
        "message" : "122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] \"GET / HTTP/1.1\" 503 299 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36\""
      }
    },
    {
      "_source" : {
        "@timestamp" : "2024-07-04T17:42:06.606248379Z",
        "message" : """122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] "GET / HTTP/1.1" 503 299 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" """
      }
    },
    {
      "_source" : {
        "@timestamp" : "2024-07-04T17:42:06.606248379Z",
        "message" : """127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)" """
      }
    }
    ]
}

It also parses with the pipeline you supplied, but you will need to do additional work to properly set the timestamp etc... perhaps try the Apache integration first.

POST _ingest/pipeline/apache/_simulate
{
  "docs": [
    {
      "_source" : {
        "message" : "122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] \"GET / HTTP/1.1\" 503 299 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36\""
      }
    },
    {
      "_source" : {
        "message" : """122.161.52.27 - - [02/Jul/2024:08:48:17 +0000] "GET / HTTP/1.1" 503 299 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36" """
      }
    },
    {
      "_source" : {
        "message" : """127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)" """
      }
    }
  ]
}
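
If you stick with your own pipeline, the extra timestamp work mentioned above usually means a date processor after the grok. A minimal sketch, assuming the grok capture lands in a field called timestamp (check the _simulate output for the actual field name in your version):

PUT _ingest/pipeline/apache
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{COMBINEDAPACHELOG}"
        ]
      }
    },
    {
      "date": {
        "field": "timestamp",
        "formats": [
          "dd/MMM/yyyy:HH:mm:ss Z"
        ],
        "target_field": "@timestamp"
      }
    },
    {
      "remove": {
        "field": "timestamp",
        "ignore_missing": true
      }
    }
  ]
}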

Hi Ashish,
I have custom logs to integrate with this process. I have taken Apache logs just for testing purposes.

Could you share the custom logs then? You can check the complete log pattern behind COMBINEDAPACHELOG.
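
If it helps, the full expansion of COMBINEDAPACHELOG (and every other built-in pattern) can be pulled straight from Elasticsearch; this is a standard API:

GET _ingest/processor/grok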

Hi,
I am using /var/log/messages now for the custom logs integration.
Steps followed:

  1. Created an ingest pipeline:

PUT _ingest/pipeline/customlogs
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:host} %{WORD:process}: %{GREEDYDATA:message1}"
        ],
        "tag": "failprocessor",
        "ignore_failure": true
      }
    }
  ],
  "on_failure": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{SYSLOGTIMESTAMP:timestamp} %{GREEDYDATA:message1}"
        ]
      }
    },
    {
      "set": {
        "field": "message",
        "value": "failed!!!!!!!!!!!!"
      }
    }
  ]
}

  2. Created an index template:

PUT _index_template/customlogs
{
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "customlogs"
      }
    },
    "mappings": {
      "_routing": {
        "required": false
      },
      "numeric_detection": false,
      "dynamic_date_formats": [
        "strict_date_optional_time",
        "yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"
      ],
      "dynamic": true,
      "_source": {
        "excludes": [],
        "includes": [],
        "enabled": true
      },
      "date_detection": true
    }
  },
  "index_patterns": [
    "logs-custom*"
  ],
  "data_stream": {
    "hidden": false,
    "allow_custom_routing": false
  },
  "composed_of": []
}

  3. Added a custom log integration with the path /var/log/messages:

PUT kbn:/api/fleet/package_policies/a417e116-044a-4b80-ba7e-1c98df3882aa
{
  "package": {
    "name": "log",
    "version": "2.3.1"
  },
  "name": "customlog",
  "namespace": "",
  "description": "",
  "policy_id": "36dc1784-0bd3-43be-9ac0-9ef8a98c13c5",
  "vars": {},
  "inputs": {
    "logs-logfile": {
      "enabled": true,
      "streams": {
        "log.logs": {
          "enabled": true,
          "vars": {
            "paths": [
              "/var/log/messages"
            ],
            "exclude_files": [],
            "ignore_older": "72h",
            "data_stream.dataset": "custom",
            "tags": [],
            "custom": ""
          }
        }
      }
    }
  }
}
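
For reference, both the pipeline and the template can be sanity-checked from Dev Tools before shipping any data. These are standard APIs; the sample syslog line and the index name below are only illustrative placeholders:

POST _ingest/pipeline/customlogs/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "Jul  4 10:15:32 myhost systemd: Started Session 42 of user root."
      }
    }
  ]
}

# Shows which template (and therefore which default_pipeline) a matching index would pick up
POST _index_template/_simulate_index/logs-custom-default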

But when we add the integration, we don't have any option to specify a pipeline. Once the integration is completed, it creates a managed pipeline, and editing it gives a warning that doing so can break Kibana. I tried adding the same pipeline afterwards; however, the processors are not working.

Please help.

Hi @Ankita_Pachauri

Here is some background

What I would suggest is to do the following ... clean up the templates etc...

Go to Integrations, add a Custom Logs integration, set the dataset name to custom, and save it.

BTW, setting the name to custom can be a bit confusing; something like customapp might be better.

This will create the templates / ingest pipelines etc and you will inherit all the good logs mappings etc.

It also creates the Ingest pipelines framework... which will be ready to automatically call your custom pipeline if you simply name it correctly

So then just name your pipeline
PUT _ingest/pipeline/logs-custom@custom

And it will be automatically called... this is the best way to do this...
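
As a rough sketch (the grok pattern is adapted from earlier in this thread purely for illustration, with host renamed to hostname so it doesn't clash with the ECS host object):

PUT _ingest/pipeline/logs-custom@custom
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} %{WORD:process}: %{GREEDYDATA:message1}"
        ]
      }
    }
  ]
}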

OR...... you can also create all this with 1 call....

POST kbn:/api/fleet/epm/custom_integrations
{
  "integrationName": "customapp",
  "datasets": [
    {
      "name": "customapp",
      "type": "logs"
    }
  ]
}

GET kbn:/api/fleet/epm/packages/customapp

# if you want to clean up
DELETE kbn:/api/fleet/epm/packages/customapp/1.0.0

And then name your ingest pipeline in this case

PUT _ingest/pipeline/logs-customapp@custom

Hope this helps

Thanks @stephenb it worked!!

@stephenb I set up the Custom Logs integration with a custom pipeline the same way.
When I add the first processor, everything works:

[
  {
    "set": {
      "field": "beforejson",
      "value": "true"
    }
  }
]

When I add the second processor, the log lines get lost:

[
  {
    "set": {
      "field": "beforejson",
      "value": "true"
    }
  },
  {
    "json": {
      "field": "message",
      "target_field": "json",
      "ignore_failure": true
    }
  }
]

Why is this happening, and how can I debug pipelines?

I also set a failure processor to debug, but the log lines do not arrive even with these settings:

[
  {
    "set": {
      "field": "pipelinefailed",
      "value": "true"
    }
  }
]

Please help.
Thanks.
Regards,
Zsolt

Hi @osztrovszky, welcome to the community.

A couple of "housekeeping" items:
First, it is generally not a good idea to add to a solved topic, as people tend not to look at them. You should open a new topic and refer back to this one.

Also, please try not to @ people directly with your questions... it is a community forum; open your topic and see if it gets answered.

All that said / all good....

Most likely this is failing, and since you set "ignore_failure": true it will just fail silently... the failure processor will not get called. For proper failure handling, see here.
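
For reference, a minimal failure-handler sketch that records why a processor failed, using the standard _ingest on-failure metadata (added at the pipeline level, instead of ignore_failure):

"on_failure": [
  {
    "set": {
      "field": "error.message",
      "value": "{{ _ingest.on_failure_message }}"
    }
  }
]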

I would recommend trying the _simulate API with your sample documents to see what is failing.
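
For example (the pipeline name and the sample message below are placeholders; use your real pipeline name and an actual line copied from your log file):

POST _ingest/pipeline/logs-customapp@custom/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "{\"level\":\"info\",\"msg\":\"hello\"}"
      }
    }
  ]
}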

Hello,

Thanks for the info, I'll open a new topic next time.

Regarding this issue, I removed the ignore_failure flag, but it is still not working.

[
  {
    "json": {
      "field": "message",
      "target_field": "messagejson"
    }
  }
]

If I test it with a document from the index, parsing happens successfully.
However, after applying this pipeline, no more lines arrive from the log files. If I remove the json processor, the lines arrive again.

Any idea how I can debug it? Where could I find some log lines about this failing pipeline?

Thanks.

If you share a full document, the full pipeline, and your mapping etc., we can take a look, but with just snippets it's hard to debug.

Sure thing, here are the configs: elastic-sample-logline.json · GitHub

Tell me if there is any other info needed.

Thanks a lot.