I'm using Filebeat to read a multiline log. I'm able to get the data into Elasticsearch with the multiline event stored in the message field.
Log Sample:
Date: Wed Apr 19 09:57:45 2023
Computer Name: SystemX
User Name: SystemX.User
Project includes 1 folder(s) and 4 file(s).
============================================================================================
encrypt mode:
AS_ENCRYPT_MODE_AES256_SHA2
set a password for this encryption:
using a user supplied password
set up a group and master password:
unencrypted
no encrypted with groupinfo
============================================================================================
C:\Users\User\Desktop\Test files\File1.txt 8b6ccb43dca2040c3cfbcd7bfff0b387d4538c33 15bytes 2023/4/6 19:49:45
C:\Users\User\Desktop\Test files\File2.docx a3dcef559e04628b1c71a1d87d353e070bd5d40a 11853bytes 2023/4/6 19:49:45
C:\Users\User\Desktop\Test files\File3.pptx 2ca33d9f81a91d2648971f5a12d03ec0ef9fc408 31579bytes 2023/4/6 19:49:45
C:\Users\User\Desktop\Test files\File4.xlsx f4e15a60f7313fae60b9f05b0dc016ab6c68f031 8426bytes 2023/4/6 19:49:45
END OF FILE
Filebeat.yml excerpt:
# ============================== Filebeat inputs ===============================
filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

# filestream is an input for collecting log messages from files.
- type: filestream
  close_timeout: 5m

  # Unique ID among all inputs, an ID is required.
  id: "WinZip Safe Media"

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - "C:\\ProgramData\\WinZip Log Files\\*"
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  prospector.scanner.exclude_files: ['.zip$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  parsers:
    - multiline:
        type: pattern
        pattern: '^Date\:.*'
        negate: true
        match: after
Visualize in Discover:
The issue is that I tried to create an ingest pipeline to parse the data out into custom fields, but my grok processor does not match because the data is coming in encoded. I can see this when viewing the document's JSON.
Here is my grok processor match statement:
"processors": [
  {
    "grok": {
      "field": "message",
      "patterns": [
        "(?m).*Date: %{DATA:event_timestamp}\\n\\n.*User Name: .*\\.%{DATA:user_name}\\n\\n.*Project.*encrypt mode:\\n\\n%{DATA:encrypt_algo}\\n\\n.*\\=\\n\\n%{GREEDYDATA:file_list}.*END OF FILE"
      ],
      "ignore_failure": true
    }
  },
  {
    "set": {
      "field": "user.name",
      "value": "{{user_name}}",
      "ignore_failure": true
    }
  }
]
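In case it helps anyone reproduce this, the pipeline can be exercised against a short sample directly in Dev Tools with the _simulate API. The request below uses a cut-down, hypothetical pattern and a two-line sample message (abbreviated from the real log) just to see how the newlines in the indexed document are being interpreted:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["(?m)Date: %{DATA:event_timestamp}\\n"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Date: Wed Apr 19 09:57:45 2023\nComputer Name: SystemX"
      }
    }
  ]
}
```

Swapping the `message` value for one copied out of the real document's JSON should show whether the full pattern fails on the actual line separators.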
I've never had issues with the log input type in the past, so I'm not sure if there is something I'm missing with the filestream input.
I tested the grok statement in Dev Tools Grok Debugger and it works fine.
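One theory I still need to rule out is Windows line endings: if filestream passes the raw `\r\n` separators through while the log input normalized them, any part of the pattern that anchors on `\n` alone would stop matching, even though the Grok Debugger (fed a pasted sample with plain `\n`) succeeds. A quick Python illustration of that failure mode (the sample strings are hypothetical):

```python
import re

# Hypothetical message field contents: the same two log lines joined
# with Unix (\n) vs. Windows (\r\n) line endings.
msg_unix = "Date: Wed Apr 19 09:57:45 2023\nComputer Name: SystemX"
msg_windows = "Date: Wed Apr 19 09:57:45 2023\r\nComputer Name: SystemX"

# A pattern that anchors on a bare \n, like my grok pattern does.
strict = re.compile(r"2023\nComputer Name")
# A tolerant variant that accepts an optional carriage return.
tolerant = re.compile(r"2023\r?\nComputer Name")

print(bool(strict.search(msg_unix)))       # True
print(bool(strict.search(msg_windows)))    # False: \r sits before the \n
print(bool(tolerant.search(msg_windows)))  # True
```

If this is the cause, replacing the `\\n` anchors in the grok pattern with `\\r?\\n` (or stripping `\r` with a gsub processor first) might be the fix.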