Ingest pipeline

Hi,

Because your log message contains double quotes, you have to escape them as \" inside the JSON request body. The following request runs the simulation without errors:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{SYSLOGTIMESTAMP:timestamp}%{SPACE}+%{YEAR:year}%{SPACE}+\\[pid%{SPACE}+%{NUMBER:pid}\\]%{SPACE}+\\[%{USERNAME:username}\\]%{SPACE}+%{DATA:log_message}%{SPACE}+\"::ffff:%{IP:sourceIP}\",%{SPACE}+\"%{PATH:filepath}\",%{SPACE}%{GREEDYDATA:Packet_Size_and_Speed}"]
        }
      }
    ]
  },
  "docs":[
    {
      "_source": {
        "message": "Thu Nov 17 17:31:54 2022 [pid 7512] [rami] OK DOWNLOAD: Client \"::ffff:192.168.1.98\", \"/home/rami/test2\", 5 bytes, 8.91Kbyte/sec"
      }
    }
  ]
}

It returns:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "sourceIP" : "192.168.1.98",
          "filepath" : "/home/rami/test2",
          "year" : "2022",
          "Packet_Size_and_Speed" : "5 bytes, 8.91Kbyte/sec",
          "pid" : "7512",
          "log_message" : "OK DOWNLOAD: Client",
          "message" : """Thu Nov 17 17:31:54 2022 [pid 7512] [rami] OK DOWNLOAD: Client "::ffff:192.168.1.98", "/home/rami/test2", 5 bytes, 8.91Kbyte/sec""",
          "timestamp" : "Nov 17 17:31:54",
          "username" : "rami"
        },
        "_ingest" : {
          "timestamp" : "2022-11-28T12:33:59.692194616Z"
        }
      }
    }
  ]
}
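
If you build the request body in a script instead of typing it into the console, a JSON serializer can do the escaping for you. Below is a minimal sketch in Python using only the standard `json` module. The names `raw_message`, `raw_pattern` and `build_simulate_body` are my own, made up for illustration; the pattern and log line are the ones from the request above, written without JSON escaping.

```python
import json

# The log line exactly as it appears in the log file (quotes NOT escaped).
raw_message = (
    'Thu Nov 17 17:31:54 2022 [pid 7512] [rami] OK DOWNLOAD: Client '
    '"::ffff:192.168.1.98", "/home/rami/test2", 5 bytes, 8.91Kbyte/sec'
)

# The grok pattern as a plain regex: single backslashes, bare double quotes.
raw_pattern = (
    r'%{SYSLOGTIMESTAMP:timestamp}%{SPACE}+%{YEAR:year}%{SPACE}+'
    r'\[pid%{SPACE}+%{NUMBER:pid}\]%{SPACE}+\[%{USERNAME:username}\]%{SPACE}+'
    r'%{DATA:log_message}%{SPACE}+"::ffff:%{IP:sourceIP}",%{SPACE}+'
    r'"%{PATH:filepath}",%{SPACE}%{GREEDYDATA:Packet_Size_and_Speed}'
)


def build_simulate_body(pattern, message):
    """Hypothetical helper: assemble the body for POST _ingest/pipeline/_simulate."""
    return {
        "pipeline": {
            "processors": [
                {"grok": {"field": "message", "patterns": [pattern]}}
            ]
        },
        "docs": [{"_source": {"message": message}}],
    }


# json.dumps turns every " into \" and every \ into \\ inside string values.
body = json.dumps(build_simulate_body(raw_pattern, raw_message), indent=2)
print(body)
```

The printed body should contain the same `\"` and `\\[` sequences as the request above, so you can compare the two if the simulation complains about a parse error.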

I hope this helps...

Best regards
Wolfram
