@Badger
Apologies, I've been off for 8 days, minor surgery, all ok now.
I don't think this is a bug.
The original post does describe a real problem though: repeated delimiters do not work as @amitavmohanty01 expects.
The dissect filter is quite simplistic (hence the speed) and very strictly left to right processing.
It is best to think of the dissection (the value part of the mapping key/value pair) as a stack of operations. I'll try to ASCII-art what I mean. For this dissection,
'%{year}/%{month}/%{day}T%{hour}:%{minute}:%{second}'
<find a '/'>
<find a '/'>
<find a 'T'>
<find a ':'>
<find a ':'>
For this string 1999/12/31T23:59:59
...
Dissect starts at index 0 of the byte array (multi-byte UTF-8 characters are found as a group of bytes, i.e. one character might be 2, 3 or 4 bytes).
It pops the first operation off the stack, executes it and moves the index to the position of the byte following the / (5). It updates the operation data structure with the start (0) and offset (4), the offset here being the length of the field.
It pops the second operation off the stack and executes it, but from the index point (5). It updates the operation data structure with the start (5) and offset (2).
It pops the third operation off the stack and executes it, but from the index point (8). It updates the operation data structure with the start (8) and offset (2).
And so on until it reaches the end of the byte array or the stack is empty.
It never goes back.
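The walk described above can be sketched in a few lines of Python. This is only a simplified model of the left-to-right scan, not the actual dissect implementation; the function name dissect_spans and its arguments are made up for illustration.

```python
def dissect_spans(text, delimiters):
    """Return one (start, length) pair per field, scanning left to right only."""
    data = text.encode("utf-8")   # work on bytes, as described above
    index = 0                     # current position; it only ever moves forward
    spans = []
    for delim in delimiters:
        d = delim.encode("utf-8")
        pos = data.find(d, index)  # search from the current index, never before it
        if pos == -1:
            break                  # delimiter not found: stop scanning
        spans.append((index, pos - index))
        index = pos + len(d)       # continue at the byte following the delimiter
    spans.append((index, len(data) - index))  # the last field takes the remainder
    return spans


spans = dissect_spans("1999/12/31T23:59:59", ["/", "/", "T", ":", ":"])
print(spans)  # [(0, 4), (5, 2), (8, 2), (11, 2), (14, 2), (17, 2)]
```

Each pair is the start and the length ("offset" in the wording above) of one field, and no step ever looks at bytes before the current index.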
In the OP case, "%{msg} src=%{host->}:%{port} %{abc}", once the : delimiter is found, that operation is not retried, hence the port value of 500:500. The %{host->}: operation is looking for one or more consecutive : characters, not one or more <data>: sequences.
Currently, one will have to use a second dissect on the port field's value, but in a conditional and without the -> consecutive directive.
filter {
  dissect {
    mapping => {
      "message" => "%{msg} src=%{host}:%{port} %{abc}"
    }
  }
  if [port] =~ /\d+\:\d+/ {
    dissect {
      mapping => {
        "port" => "%{}:%{port}"
      }
    }
  }
}
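To make the -> behaviour concrete, here is a rough Python model of it, assuming the directive only skips immediate repeats of its own delimiter. The dissect function, the PATTERN tuples and the sample lines are hypothetical and only illustrate the idea; they are not the plugin's code or the OP's data.

```python
def dissect(text, fields):
    """fields: list of (name, delimiter, greedy); the last delimiter is None."""
    result = {}
    index = 0
    for name, delim, greedy in fields:
        pos = -1 if delim is None else text.find(delim, index)
        if pos == -1:
            result[name] = text[index:]   # last field, or delimiter not found
            break
        result[name] = text[index:pos]
        index = pos + len(delim)
        if greedy:
            # '->' model: swallow further copies of the SAME delimiter only
            while text.startswith(delim, index):
                index += len(delim)
    return result


# Rough equivalent of "%{msg} src=%{host->}:%{port} %{abc}"
PATTERN = [("msg", " src=", False), ("host", ":", True),
           ("port", " ", False), ("abc", None, False)]

first = dissect("hello src=example:500:500 tail", PATTERN)
print(first["port"])  # 500:500

# Second pass on the port value, like the conditional dissect "%{}:%{port}"
second = dissect(first["port"], [("", ":", False), ("port", None, False)])
print(second["port"])  # 500
```

In this model the : after the host is followed by data rather than another :, so the greedy skip has nothing to consume and the port field keeps 500:500 until the second pass splits it.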
BTW, this config on LS 7.3.0 does not fail.
input {
  generator {
    lines => [
      "LinePrinter O isUserPresentAndValid: false",
      "SystemOut O isUserPresentAndValid: true"
    ]
    count => 1
  }
}
filter {
  dissect {
    mapping => {
      message => '%{channel->} %{zero} %{msg}'
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
Gives
{
"@version" => "1",
"sequence" => 0,
"@timestamp" => 2019-08-28T10:52:05.496Z,
"message" => "SystemOut O isUserPresentAndValid: true",
"host" => "Elastics-MacBook-Pro.local",
"msg" => "isUserPresentAndValid: true",
"channel" => "SystemOut",
"zero" => "O"
}
[2019-08-28T11:52:05,928][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
{
"@version" => "1",
"sequence" => 0,
"@timestamp" => 2019-08-28T10:52:05.480Z,
"message" => "LinePrinter O isUserPresentAndValid: false",
"host" => "Elastics-MacBook-Pro.local",
"msg" => "isUserPresentAndValid: false",
"channel" => "LinePrinter",
"zero" => "O"
}
HOWEVER...
If I change the delimiter from a space to, say, a hyphen, it fails. Hmmm. That said, most padded logs use a space for better visual alignment. However, there are certainly some cases where a non-space delimiter will need to be consumed greedily.
I'll create a bug issue in the repo.