79g
(Daniel)
July 28, 2022, 2:35pm
1
Hello all,
I'm sending some Filebeat logs through an Elasticsearch ingest pipeline, which transforms them before they are stored in Elasticsearch.
The input event looks like:
Thu Jul 28 13:57:44 2022 us=22598 1.2.3.4:37491 [my_user] Peer Connection Initiated with [AF_INET]1.2.3.4:37491
And the grok expression is:
%{GREEDYDATA:dayweek} %{GREEDYDATA:month} %{GREEDYDATA:daymonth} %{GREEDYDATA:time} %{GREEDYDATA:year} us=%{WORD} %{IPORHOST:ip}:%{POSINT:puerto} \[%{GREEDYDATA:user}\] Peer Connection Initiated with %{GREEDYDATA:rest}
It works in the Kibana Grok Debugger but not in the pipeline.
When creating the pipeline I have to remove the [ and ] characters from the pattern, because otherwise the pattern is rejected as invalid. The error I then get is:
... object mapping for [user] tried to parse field [user] as object, but found a concrete value"}, dropping event!","service.name":"filebeat","ecs.version":"1.6.0"}
How can I strip the brackets from the user and store the data in Elasticsearch? I tried both with and without a backslash before the brackets, but neither works in the pipeline.
Any help would be really appreciated.
Thanks
stephenb
(Stephen Brown)
July 28, 2022, 9:33pm
2
user is a defined field, and it is a complex object; see here.
You are trying to write a simple value into an object field; that is what the error means.
Try this instead:
user.name
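The error can be pictured with a rough analogy in plain Python. This is only a sketch of the idea, not Elasticsearch code: the function name and the dictionary-based "mapping" are hypothetical, and a nested dict simply stands for "object with sub-fields".

```python
def check_against_mapping(doc, mapping, path=""):
    """Return conflicts between a document and a toy mapping.

    In the toy mapping a dict means "object with sub-fields" and a
    string names a leaf type. This only mimics the idea behind the
    error message; it is not how Elasticsearch validates documents.
    """
    conflicts = []
    for key, value in doc.items():
        full = f"{path}.{key}" if path else key
        expected = mapping.get(key)
        if isinstance(expected, dict):
            if isinstance(value, dict):
                # Both sides are objects: compare the sub-fields.
                conflicts += check_against_mapping(value, expected, full)
            else:
                # The mapping expects an object but the doc has a plain value.
                conflicts.append(f"[{full}] is an object, but found a concrete value")
    return conflicts


# Assumed toy mapping: user is an object that has a name sub-field.
mapping = {"user": {"name": "keyword"}}

bad = check_against_mapping({"user": "my_user"}, mapping)
good = check_against_mapping({"user": {"name": "my_user"}}, mapping)
print(bad)
print(good)
```

Writing the captured value to user.name produces the nested shape in the second call, which is why it no longer collides with the object.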
Also, using GREEDYDATA for every field is not efficient.
You should look at the predefined patterns:
USERNAME [a-zA-Z0-9._-]+
USER %{USERNAME}
EMAILLOCALPART [a-zA-Z][a-zA-Z0-9_.+-=:]+
EMAILADDRESS %{EMAILLOCALPART}@%{HOSTNAME}
INT (?:[+-]?(?:[0-9]+))
BASE10NUM (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))
NUMBER (?:%{BASE10NUM})
BASE16NUM (?<![0-9A-Fa-f])(?:[+-]?(?:0x)?(?:[0-9A-Fa-f]+))
BASE16FLOAT \b(?<![0-9A-Fa-f.])(?:[+-]?(?:0x)?(?:(?:[0-9A-Fa-f]+(?:\.[0-9A-Fa-f]*)?)|(?:\.[0-9A-Fa-f]+)))\b
POSINT \b(?:[1-9][0-9]*)\b
NONNEGINT \b(?:[0-9]+)\b
WORD \b\w+\b
NOTSPACE \S+
SPACE \s*
DATA .*?
GREEDYDATA .*
QUOTEDSTRING (?>(?<!\\)(?>"(?>\\.|[^\\"]+)+"|""|(?>'(?>\\.|[^\\']+)+')|''|(?>`(?>\\.|[^\\`]+)+`)|``))
UUID [A-Fa-f0-9]{8}-(?:[A-Fa-f0-9]{4}-){3}[A-Fa-f0-9]{12}
# URN, allowing use of RFC 2141 section 2.3 reserved characters
(This file has been truncated.)
If you share your whole pipeline we might be able to help
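To make the point about specific patterns concrete, here is a rough Python regex analogue of the log line from the first post. It is a sketch under assumptions, not a translation of the grok library: the character classes are simplified stand-ins for the predefined patterns (only the USERNAME and POSINT bodies are taken from the list above), and the group is called user_name because Python group names cannot contain a dot.

```python
import re

LINE = ("Thu Jul 28 13:57:44 2022 us=22598 1.2.3.4:37491 [my_user] "
        "Peer Connection Initiated with [AF_INET]1.2.3.4:37491")

# Each field gets a narrow pattern instead of ".*", so the engine has
# far less room to backtrack and a malformed line fails early.
PATTERN = re.compile(
    r"(?P<dayweek>[A-Za-z]{3}) (?P<month>[A-Za-z]{3}) (?P<daymonth>\d{1,2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) (?P<year>\d{4}) us=\w+ "
    r"(?P<ip>[\d.]+):(?P<puerto>[1-9][0-9]*) "
    r"\[(?P<user_name>[a-zA-Z0-9._-]+)\] "   # brackets escaped, so they are not captured
    r"Peer Connection Initiated with (?P<rest>.*)"
)

match = PATTERN.match(LINE)
fields = match.groupdict() if match else {}
print(fields)
```

Only the trailing rest field still uses a greedy match, which is harmless at the end of the line.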
stephenb
(Stephen Brown)
July 28, 2022, 11:13pm
3
Try this:
PUT _ingest/pipeline/discuss-test
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["""%{DAY:dayweek} %{MONTH:month} %{MONTHDAY:daymonth} %{TIME:time} %{GREEDYDATA:year} us=%{WORD:code1} %{IPORHOST:ip}:%{POSINT:puerto} \[%{USER:user.name}\] Peer Connection Initiated with \[%{WORD:code2}\]%{IPORHOST:other_ip}:%{POSINT:other_puerto}"""]
      }
    }
  ]
}

POST _ingest/pipeline/discuss-test/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "Thu Jul 28 13:57:44 2022 us=22598 1.2.3.4:37491 [my_user] Peer Connection Initiated with [AF_INET]1.2.3.4:37491"
      }
    }
  ]
}
Result:
{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_id" : "_id",
        "_source" : {
          "code2" : "AF_INET",
          "code1" : "22598",
          "daymonth" : "28",
          "dayweek" : "Thu",
          "year" : "2022",
          "ip" : "1.2.3.4",
          "other_ip" : "1.2.3.4",
          "message" : "Thu Jul 28 13:57:44 2022 us=22598 1.2.3.4:37491 [my_user] Peer Connection Initiated with [AF_INET]1.2.3.4:37491",
          "other_puerto" : "37491",
          "puerto" : "37491",
          "month" : "Jul",
          "time" : "13:57:44",
          "user" : {
            "name" : "my_user"
          }
        },
        "_ingest" : {
          "timestamp" : "2022-07-28T23:14:38.476496Z"
        }
      }
    }
  ]
}
You should probably create a mapping too!
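A side note on the \[ and \] in the pattern, since that is where the pipeline definition was being rejected. The request body is JSON, and a single backslash followed by [ is not a legal JSON escape, so inside an ordinary double-quoted string the backslash has to be doubled. The triple-quoted string in the Kibana Console request sidesteps that escaping, as far as I understand it. The Python sketch below only demonstrates the JSON rule with the standard json module; the shortened pattern is illustrative, not the full one from the pipeline.

```python
import json

# One backslash before "[" inside a JSON string: "\[" is not a legal
# JSON escape, so a strict parser rejects the whole body.
single = r'{"patterns": ["\[%{USER:user.name}\]"]}'
try:
    json.loads(single)
    single_ok = True
except json.JSONDecodeError:
    single_ok = False

# Doubling the backslash makes the body valid JSON. After parsing, the
# string holds the single backslash that the regex engine needs.
double = r'{"patterns": ["\\[%{USER:user.name}\\]"]}'
parsed = json.loads(double)["patterns"][0]

print(single_ok)
print(parsed)
```

So outside the Console, for example when sending the pipeline with curl, the same pattern would need \\[ and \\] in a normal JSON string.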
79g
(Daniel)
July 29, 2022, 8:21am
4
Stephen, thank you so much!!
The trick of wrapping the pattern in """ solves the issue for me. Now it works successfully.
Also, thanks for your advice about not always using GREEDYDATA. It was for testing purposes.
I really appreciate your help.
Best regards,
Daniel.