Ingest pipeline

Hi there,

I'm new to ELK and trying to get a custom log that isn't covered by any Filebeat module into Kibana. So I tried to create an ingest node pipeline between Filebeat and Elasticsearch.

So in filebeat.yml I: 1) added the pipeline name in the elasticsearch output section, and 2) added the path of the custom log in the Filebeat input (see the sketch below).
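Roughly like this (the log path, hosts and the pipeline name vsftpd-pipeline are just placeholders for illustration, not my exact config):

filebeat.inputs:
  - type: filestream
    id: vsftpd                     # placeholder input id
    paths:
      - /var/log/vsftpd.log        # placeholder path to the custom log

output.elasticsearch:
  hosts: ["localhost:9200"]        # placeholder host
  pipeline: "vsftpd-pipeline"      # must match the ingest pipeline name in Elasticsearch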

I created the ingest pipeline using grok processors to parse vsftpd.log, and it works in Dev Tools (where I had to convert the query from the Grok Debugger because it needs to be in JSON format).
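The pipeline was registered in Dev Tools with a request along these lines (the name vsftpd-pipeline is just an example, and the patterns are trimmed to a single placeholder here; the real ones are the grok patterns shown in the simulate request below):

PUT _ingest/pipeline/vsftpd-pipeline
{
  "description": "Parse vsftpd.log entries",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{GREEDYDATA:log_message}"]
      }
    }
  ]
}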

I want to test it with the simulate API, but the "message" section isn't in JSON format. I don't understand why it should be, since the message is redirected straight from vsftpd.log.

I'm completely lost on how this should work, even after reading the documentation. On the Filebeat side, I see a log line saying the pipeline output is connected to Elasticsearch, but no logs are redirected to the default Filebeat index.

Here is the simulate request; any help would be appreciated!

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{SYSLOGTIMESTAMP:timestamp}%{SPACE}+%{YEAR:year}%{SPACE}+.pid%{SPACE}+%{GREEDYDATA:pid}.%{SPACE}+.%{USERNAME:username}.%{SPACE}+%{GREEDYDATA:log_message}%{SPACE}","::ffff:%{IP:sourceIP}","%{SPACE}\"%{PATH:filepath}\",%{SPACE}%{GREEDYDATA:Packet_Size_and_Speed}"]
        }
      }
    ]
  },
  "docs":[
    {
      "_source": {
        "message": "Thu Nov 17 17:31:54 2022 [pid 7512] [rami] OK DOWNLOAD: Client "::ffff:192.168.1.98", "/home/rami/test2", 5 bytes, 8.91Kbyte/sec"
      }
    }
  ]
}

Thank you

Hi,

As your log message contains quotes, you have to escape them correctly. The following request runs the simulation without errors:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{SYSLOGTIMESTAMP:timestamp}%{SPACE}+%{YEAR:year}%{SPACE}+\\[pid%{SPACE}+%{NUMBER:pid}\\]%{SPACE}+\\[%{USERNAME:username}\\]%{SPACE}+%{DATA:log_message}%{SPACE}+\"::ffff:%{IP:sourceIP}\",%{SPACE}+\"%{PATH:filepath}\",%{SPACE}%{GREEDYDATA:Packet_Size_and_Speed}"]
        }
      }
    ]
  },
  "docs":[
    {
      "_source": {
        "message": "Thu Nov 17 17:31:54 2022 [pid 7512] [rami] OK DOWNLOAD: Client \"::ffff:192.168.1.98\", \"/home/rami/test2\", 5 bytes, 8.91Kbyte/sec"
      }
    }
  ]
}

It returns:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "sourceIP" : "192.168.1.98",
          "filepath" : "/home/rami/test2",
          "year" : "2022",
          "Packet_Size_and_Speed" : "5 bytes, 8.91Kbyte/sec",
          "pid" : "7512",
          "log_message" : "OK DOWNLOAD: Client",
          "message" : """Thu Nov 17 17:31:54 2022 [pid 7512] [rami] OK DOWNLOAD: Client "::ffff:192.168.1.98", "/home/rami/test2", 5 bytes, 8.91Kbyte/sec""",
          "timestamp" : "Nov 17 17:31:54",
          "username" : "rami"
        },
        "_ingest" : {
          "timestamp" : "2022-11-28T12:33:59.692194616Z"
        }
      }
    }
  ]
}

I hope this helps...

Best regards
Wolfram


Wolfram,

Thank you for the reply and for your help. Indeed, I misunderstood something about escaping. I will retry it on another log set.

Regards,
