Hello,
I am trying to do the following:
What I have
- I have a CSV file that contains a data-set
LG_data.csv
OWNER;CATEGORY;TASK;ID;TEN_ID;CHANNEL_ID;CODE;PLANNED_START;TIME_STORED;TIME_QUEUED;TIME_STARTED;TIME_DONE;RUNTIME;WAIT_TIME;DURATION
loadtestnonxa_res;;CW_SCHEDULED_MESSAGE;1520365718375;1;1;BOOTSTRAP;2018-03-06 19:48:38.375;2018-03-06 19:48:38.375;2018-03-06 19:48:38.748;2018-03-06 19:48:38.796;2018-03-06 19:48:47.513;8717;421;9138
loadtestnonxa_res;;CW_SCHEDULED_MESSAGE;1520365718376;1;2;BOOTSTRAP;2018-03-06 19:48:40.495;2018-03-06 19:48:40.495;2018-03-06 19:48:40.793;2018-03-06 19:48:40.819;2018-03-06 19:48:49.883;9064;324;9388
- I have a json file that contains metadata for this data-set
LG_data_metadata.json
{
"teamcity.build.id": "9182737475",
"env.loadtest-usecases.profile": "run-uc1-small",
"project-config.version": "6.3.0-SNAPSHOT",
"loadtest-usecases.branch.name": "feature/release-6.3.0"
}
What I'm trying to do
After reading the CSV data, I use grok to extract the file name from the path and reuse it in a second file input to read the matching JSON file (naming convention: the CSV file name followed by _metadata.json).
Then I use grok again to extract the metadata values from the JSON file and add them to each event as fields (via mutate).
What I expect
- Each data item has the added field, with the value that grok matched in the metadata file
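To make the expected result concrete, here is a minimal Python sketch of the join I have in mind, done outside Logstash. It is only an illustration of the desired outcome, not a Logstash configuration; the function name enrich_rows is made up, and the sample rows are shortened to the first seven columns.

```python
import csv
import io
import json


def enrich_rows(csv_text, metadata_text):
    """Return one dict per CSV row, each extended with every metadata key."""
    metadata = json.loads(metadata_text)
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    # Merge the metadata into each row; metadata keys win on a name clash.
    return [{**row, **metadata} for row in reader]


# Shortened sample taken from LG_data.csv (first seven columns only).
csv_text = (
    "OWNER;CATEGORY;TASK;ID;TEN_ID;CHANNEL_ID;CODE\n"
    "loadtestnonxa_res;;CW_SCHEDULED_MESSAGE;1520365718375;1;1;BOOTSTRAP\n"
    "loadtestnonxa_res;;CW_SCHEDULED_MESSAGE;1520365718376;1;2;BOOTSTRAP\n"
)
# Two of the four keys from LG_data_metadata.json.
metadata_text = (
    '{"teamcity.build.id": "9182737475", '
    '"project-config.version": "6.3.0-SNAPSHOT"}'
)

rows = enrich_rows(csv_text, metadata_text)
for row in rows:
    print(row["ID"], row["teamcity.build.id"])
```

Every row ends up with the same teamcity.build.id value, which is the behaviour I would like to get from the Logstash pipeline.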
Here is my configuration:
input {
  file {
    path => "/devevelop/scratch/data/*.csv"
    type => "csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  grok {
    match => ["path","/devevelop/scratch/data/%{GREEDYDATA:filename}.csv"]
  }
}
input {
  file {
    path => "/devevelop/scratch/data/%{filename}_metadata.json"
    type => "json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => '.*'
      negate => true
      what => previous
    }
  }
}
filter {
  if [type] == "json" {
    grok {
      match => {"message" => "\"teamcity.build.id\":%{GREEDYDATA:teamcity.build.id}"}
      match => {"message" => "\"env.loadtest-usecases.profile\":%{GREEDYDATA:env.loadtest-usecases.profile}"}
      match => {"message" => "\"project-config.version\":%{GREEDYDATA:project-config.version}"}
      match => {"message" => "\"loadtest-usecases.branch.name\":%{GREEDYDATA:loadtest-usecases.branch.name}"}
    }
    if "_grokparsefailure" in [tags] {
      drop { }
    }
  }
  mutate {
    add_field => { "teamcity.build.id" => "%{teamcity.build.id}" }
  }
}
filter {
  if [type] == "csv" {
    csv {
      separator => ";"
      columns => ["OWNER","CATEGORY","TASK","ID","TEN_ID","CHANNEL_ID","CODE","PLANNED_START","TIME_STORED","TIME_QUEUED","TIME_STARTED","TIME_DONE","RUNTIME","WAIT_TIME","DURATION", "RELEASE"]
      convert => {
        "PLANNED_START" => "date_time"
        "TIME_STORED" => "date_time"
        "TIME_QUEUED" => "date_time"
        "TIME_STARTED" => "date_time"
        "TIME_DONE" => "date_time"
        "DURATION" => "integer"
        "WAIT_TIME" => "integer"
        "RUNTIME" => "integer"
      }
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
But I get:
...
{
"message" => "loadtestnonxa_res;;CW_SCHEDULED_MESSAGE;1520365718377;1;3;BOOTSTRAP;2018-03-06 19:48:42.587;2018-03-06 19:48:42.587;2018-03-06 19:48:42.826;2018-03-06 19:48:42.882;2018-03-06 19:48:53.503;10621;295;10916",
"@version" => "1",
"@timestamp" => "2018-03-29T15:35:23.388Z",
"path" => "/devevelop/scratch/data/LG_data.csv",
"host" => "elGuapo",
"type" => "csv",
"filename" => "LG_data",
"teamcity.build.id" => "%{teamcity.build.id}",
"OWNER" => "loadtestnonxa_res",
"CATEGORY" => nil,
"TASK" => "CW_SCHEDULED_MESSAGE",
"ID" => "1520365718377",
"TEN_ID" => "1",
"CHANNEL_ID" => "3",
"CODE" => "BOOTSTRAP",
"PLANNED_START" => "2018-03-06 19:48:42.587",
"TIME_STORED" => "2018-03-06 19:48:42.587",
"TIME_QUEUED" => "2018-03-06 19:48:42.826",
"TIME_STARTED" => "2018-03-06 19:48:42.882",
"TIME_DONE" => "2018-03-06 19:48:53.503",
"RUNTIME" => 10621,
"WAIT_TIME" => 295,
"DURATION" => 10916
}
...
Problem
- The field teamcity.build.id is not resolved to the expected value; the output contains the literal string %{teamcity.build.id} instead. (It works for filename.)
The JSON part works on its own. I tried it in a separate, stand-alone configuration that reads from stdin:
input {
  stdin {
    codec => multiline {
      pattern => '.*'
      negate => true
      what => previous
    }
  }
}
filter {
  grok {
    match => {"message" => "\"teamcity.build.id\":%{GREEDYDATA:teamcity.build.id}"}
    match => {"message" => "\"env.loadtest-usecases.profile\":%{GREEDYDATA:env.loadtest-usecases.profile}"}
    match => {"message" => "\"project-config.version\":%{GREEDYDATA:project-config.version}"}
    match => {"message" => "\"loadtest-usecases.branch.name\":%{GREEDYDATA:loadtest-usecases.branch.name}"}
  }
  if "_grokparsefailure" in [tags] {
    drop { }
  }
}
output {
  stdout { codec => rubydebug }
}
And the result is:
{
"@timestamp" => "2018-03-29T15:50:55.857Z",
"message" => " \"teamcity.build.id\": \"9182737475\",",
"@version" => "1",
"host" => "elGuapo",
"teamcity.build.id" => " \"9182737475\","
}
{
"@timestamp" => "2018-03-29T15:50:55.857Z",
"message" => " \"env.loadtest-usecases.profile\": \"run-uc1-small\",",
"@version" => "1",
"host" => "elGuapo",
"env.loadtest-usecases.profile" => " \"run-uc1-small\","
}
{
"@timestamp" => "2018-03-29T15:50:55.858Z",
"message" => " \"project-config.version\": \"6.3.0-SNAPSHOT\",",
"@version" => "1",
"host" => "elGuapo",
"project-config.version" => " \"6.3.0-SNAPSHOT\","
}
{
"@timestamp" => "2018-03-29T15:50:55.858Z",
"message" => " \"loadtest-usecases.branch.name\": \"feature/release-6.3.0\"",
"@version" => "1",
"host" => "elGuapo",
"loadtest-usecases.branch.name" => " \"feature/release-6.3.0\""
}
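Note that the captured values still contain the leading space, the quotes and the trailing comma, because GREEDYDATA matches everything after the colon. The following Python sketch reproduces that behaviour with an equivalent regular expression (GREEDYDATA corresponds to `.*`); the clean-up step at the end is only my assumption about what the final value should look like, not something the configuration above does.

```python
import re

# One line of LG_data_metadata.json as it arrives in the "message" field.
line = ' "teamcity.build.id": "9182737475",'

# Same shape as the grok expression: literal key, colon, then a greedy capture.
pattern = re.compile(r'"teamcity.build.id":(.*)')

match = pattern.search(line)
captured = match.group(1)

# Hypothetical clean-up: drop surrounding whitespace, trailing comma, quotes.
cleaned = captured.strip().rstrip(",").strip('"')

print(repr(captured))
print(repr(cleaned))
```

So even once the field is resolved correctly, the value would presumably still need to be trimmed before it is useful.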
Questions
- Can I use a grok match as part of the file path?
- If yes, shouldn't I be able to access the values of those fields when I add them to every data item in the next filter (like with filename)?
- What am I doing wrong?