Hi @yquirion
I am a little confused why, after the very detailed answer I gave, you did not confirm whether you could reproduce the same results.
At this point, I am going to assume you could.
In general, if you are going to try to recreate a module with plain inputs, you will need to look at the ingest pipeline and whatever else the module is doing. You can see this under the module directory, in this case:
cd ./module/system/auth/
So first, the timezone.
This is simple: the module adds the add_locale processor, which adds the timezone, and then the pipeline uses that to adjust the syslog timestamp... a plain filestream input does not know anything about that.
You can replicate that yourself; the simple fix is below.
- type: filestream
  # Change to true to enable this input configuration.
  enabled: true
  pipeline: filebeat-8.6.2-system-auth-pipeline

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /Users/sbrown/workspace/sample-data/discuss/syslog-pipeline/fsci-secure.log
  index: 'filebeat-auth-8.6.2-sys-linux'
  tags: "preserve_original_event"
  processors:
    - add_locale: ~
How did I know this? I went and looked at that module in detail.
From the Filebeat directory (this is where the module definitions live; if you want a plain input to work like a module, you need to look in these directories and understand what the module is doing):
cd ./module/system/auth/
cat config/auth.yml
And if you are going to use the ingest pipeline "outside" the module, you really need to look at it and understand it as well.
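If the pipeline has already been loaded into Elasticsearch (for example by running filebeat setup), you can also pull it up and read it in Kibana Dev Tools; the name below is the pipeline referenced in the config above:

GET _ingest/pipeline/filebeat-8.6.2-system-auth-pipeline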
You can see here that the date processor uses the timezone:
{
  "date": {
    "target_field": "@timestamp",
    "formats": [
      "MMM d HH:mm:ss",
      "MMM dd HH:mm:ss",
      "ISO8601"
    ],
    "timezone": "{{{ event.timezone }}}",
    "if": "ctx.event?.timezone != null",
    "field": "system.auth.timestamp",
    "on_failure": [
      {
        "append": {
          "value": "{{{ _ingest.on_failure_message }}}",
          "field": "error.message"
        }
      }
    ]
  }
}
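To make that concrete, here is a rough worked example (the year and the +02:00 offset are assumptions for illustration; a syslog timestamp carries neither):

# parsed from the log line by the earlier grok:
system.auth.timestamp: "Sep 15 11:56:18"
# added by add_locale on the Filebeat side:
event.timezone: "+02:00"
# The date processor interprets the timestamp in that offset, so
# @timestamp ends up representing 2023-09-15 11:56:18 +02:00
# (09:56:18 UTC) instead of the time being misread as UTC.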
For your next question about the message field: again, you just need to look at the ingest pipeline and follow the logic, and/or run _simulate?verbose=true to see what is happening...
When this message is sent through the pipeline it is "fully" parsed, and thus there is no message left: all the pertinent data ends up in fields.
Sep 15 11:56:18 dinf-miro sshd[164528]: Accepted password for myuser from 10.3.1.2 port 39100 ssh2
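If you want to watch that happen step by step, here is a minimal sketch of a verbose simulate request you could run in Kibana Dev Tools (the pipeline name is the one from the config above; the timezone value is only an example):

POST _ingest/pipeline/filebeat-8.6.2-system-auth-pipeline/_simulate?verbose=true
{
  "docs": [
    {
      "_source": {
        "message": "Sep 15 11:56:18 dinf-miro sshd[164528]: Accepted password for myuser from 10.3.1.2 port 39100 ssh2",
        "event": {
          "timezone": "+02:00"
        }
      }
    }
  ]
}

The verbose output shows the document after every processor, so you can see exactly where message disappears and where each field gets created.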
Go look at the code and you should see...
1st) The message gets renamed to event.original
2nd) There is an initial syslog parse; the leftover content ends up in _temp.message
3rd) Then there are subsequent groks...
The first one tries a bunch of combinations, and if nothing matches, it puts _temp.message back into message.
That is this grok:
{
  "grok": {
    "tag": "grok-specific-messages",
    "field": "_temp.message",
    "ignore_missing": true,
    "patterns": [
      "^%{DATA:system.auth.ssh.event} %{DATA:system.auth.ssh.method} for (invalid user)?%{DATA:user.name} from %{IPORHOST:source.ip} port %{NUMBER:source.port:long} ssh2(: %{GREEDYDATA:system.auth.ssh.signature})?",
      "^%{DATA:system.auth.ssh.event} user %{DATA:user.name} from %{IPORHOST:source.ip}",
      "^Did not receive identification string from %{IPORHOST:system.auth.ssh.dropped_ip}",
      "^%{DATA:user.name} :( %{DATA:system.auth.sudo.error} ;)? TTY=%{DATA:system.auth.sudo.tty} ; PWD=%{DATA:system.auth.sudo.pwd} ; USER=%{DATA:system.auth.sudo.user} ; COMMAND=%{GREEDYDATA:system.auth.sudo.command}",
      "^new group: name=%{DATA:group.name}, GID=%{NUMBER:group.id}",
      "^new user: name=%{DATA:user.name}, UID=%{NUMBER:user.id}, GID=%{NUMBER:group.id}, home=%{DATA:system.auth.useradd.home}, shell=%{DATA:system.auth.useradd.shell}$"
    ],
    "description": "Grok specific auth messages.",
    "on_failure": [
      {
        "rename": {
          "description": "Leave the unmatched content in message.",
          "field": "_temp.message",
          "target_field": "message"
        }
      }
    ]
  }
}
But your message DOES in fact match that second grok (the grok-specific-messages processor shown above) exactly, so it does not fail and never puts _temp.message back into message. That is the end of the processing: the event is fully parsed, and there is no need for a message field.
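For reference, worked out against your sample line, the leftover _temp.message ("Accepted password for myuser from 10.3.1.2 port 39100 ssh2") lines up with the first SSH pattern in that list, giving roughly these fields (my reading of the pattern, trimmed to just the grok output):

"system.auth.ssh.event": "Accepted",
"system.auth.ssh.method": "password",
"user.name": "myuser",
"source.ip": "10.3.1.2",
"source.port": 39100

Nothing is left over, so nothing ever needs to go back into message.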
If it had NOT matched, the pipeline would have tried the next grok, which tries to decode PAM messages (your messages do not fit that pattern), and so on.
This is why, for some of these specific messages, there is no leftover message field: the content is fully consumed.
Also, if you look at the bottom of the ingest pipeline, you will see...
{
  "remove": {
    "ignore_missing": true,
    "field": "event.original",
    "if": "ctx?.tags == null || !(ctx.tags.contains('preserve_original_event'))",
    "ignore_failure": true
  }
}
So if you just add the following tag to either the module or the input, event.original will be preserved:
tags: "preserve_original_event"
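If you are running the actual module rather than a plain input, a hedged sketch of the same idea in modules.d/system.yml might look like this (the input: block is the module-level way to override input settings; var.paths is just an example path):

- module: system
  auth:
    enabled: true
    var.paths: ["/var/log/secure*"]
    input:
      tags: ["preserve_original_event"]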
All your event.original issues are solved. So open up those ingest pipelines and read the code; the processors are executed in order.
(Note: not every pipeline is guaranteed to have that logic, so you need to look.)
Hope this helps