Hi,
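As your log message contains double quotes, you have to escape them correctly: inside a JSON string each double quote is written as \" and each backslash in the grok pattern as \\. The following request runs the simulation without errors: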
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{SYSLOGTIMESTAMP:timestamp}%{SPACE}+%{YEAR:year}%{SPACE}+\\[pid%{SPACE}+%{NUMBER:pid}\\]%{SPACE}+\\[%{USERNAME:username}\\]%{SPACE}+%{DATA:log_message}%{SPACE}+\"::ffff:%{IP:sourceIP}\",%{SPACE}+\"%{PATH:filepath}\",%{SPACE}%{GREEDYDATA:Packet_Size_and_Speed}"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Thu Nov 17 17:31:54 2022 [pid 7512] [rami] OK DOWNLOAD: Client \"::ffff:192.168.1.98\", \"/home/rami/test2\", 5 bytes, 8.91Kbyte/sec"
      }
    }
  ]
}
It returns:
{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "sourceIP" : "192.168.1.98",
          "filepath" : "/home/rami/test2",
          "year" : "2022",
          "Packet_Size_and_Speed" : "5 bytes, 8.91Kbyte/sec",
          "pid" : "7512",
          "log_message" : "OK DOWNLOAD: Client",
          "message" : """Thu Nov 17 17:31:54 2022 [pid 7512] [rami] OK DOWNLOAD: Client "::ffff:192.168.1.98", "/home/rami/test2", 5 bytes, 8.91Kbyte/sec""",
          "timestamp" : "Nov 17 17:31:54",
          "username" : "rami"
        },
        "_ingest" : {
          "timestamp" : "2022-11-28T12:33:59.692194616Z"
        }
      }
    }
  ]
}
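If you would rather not escape the quotes and backslashes by hand, you can let a JSON serializer do it. Here is a minimal sketch in Python using only the standard json module. The variable names (pattern, message, body, request_body) are my own choices for illustration; the pattern and the log line are the ones from the request above, written as they should reach Elasticsearch, with a single backslash and unescaped quotes.

```python
import json

# Grok pattern in its "real" form: one backslash before each literal
# bracket, plain double quotes around the IP and the path.
pattern = (
    r'%{SYSLOGTIMESTAMP:timestamp}%{SPACE}+%{YEAR:year}%{SPACE}+'
    r'\[pid%{SPACE}+%{NUMBER:pid}\]%{SPACE}+\[%{USERNAME:username}\]%{SPACE}+'
    r'%{DATA:log_message}%{SPACE}+"::ffff:%{IP:sourceIP}",%{SPACE}+'
    r'"%{PATH:filepath}",%{SPACE}%{GREEDYDATA:Packet_Size_and_Speed}'
)

# The raw log line, exactly as it appears in the log file.
message = (
    'Thu Nov 17 17:31:54 2022 [pid 7512] [rami] OK DOWNLOAD: Client '
    '"::ffff:192.168.1.98", "/home/rami/test2", 5 bytes, 8.91Kbyte/sec'
)

# Same structure as the body of the _simulate request.
body = {
    "pipeline": {
        "processors": [
            {"grok": {"field": "message", "patterns": [pattern]}}
        ]
    },
    "docs": [{"_source": {"message": message}}],
}

# json.dumps writes each " as \" and each \ as \\ inside string values.
request_body = json.dumps(body, indent=2)
print(request_body)
```

The printed text should be usable as the body of the POST _ingest/pipeline/_simulate request; the sketch only builds the JSON and does not send it anywhere.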
I hope this helps...
Best regards
Wolfram