Grok Help Request - Get FileDate from File Name into new field FileDate

Hi, I have a lot of CSV files that I am importing using the CSV filter. They have no timestamp in the logs, but I do have the date the log was produced in the filename. I therefore want to create a new filedate field, grok the date out of the filename, and send it to that field, but I am having problems getting this working.

Here is my filename (there are just two columns in these files, a name and a number):
Cangenbus-17-10-20.csv

Here is my index and mapping:
PUT cangenbus-11
{
  "mappings": {
    "doc": {
      "properties": {
        "Name": { "type": "text" },
        "Number": { "type": "integer", "ignore_malformed": true },
        "FileDate": { "type": "date" },
        "Path": { "type": "text" }
      }
    }
  }
}

===========================================================

Here is my configuration file:
input {
  file {
    path => "/opt/sample-data/cangenbus-csv/*.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["Name","Number"]
  }

  grok {
    match => (?[%{YEAR-}%{MONTHNUM-}%{MONTHDAY-}])
    add_field => ["filedate", "%{year-}%{month-}{day-}"]
  }

  date {
    match => ["temptimestamp", "[yyyy-MM-dd]"]
    target => "filedate"
  }
}
output {
  elasticsearch {
    hosts => "http://10.0.2.15:9200"
    index => "cangenbus-v9"
  }
  stdout {}
}

=====================================================

Here is the error:

[ERROR] 2017-12-05 10:03:51.463 [Ruby-0-Thread-1: /usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/stud-0.0.23/lib/stud/task.rb:22] agent - Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, ", ', -, [, { at line 15, column 10 (byte 240) after filter {\n csv {\n separator => ","\n columns => ["Name","Number"]\t\n }\n \ngrok {\nmatch => ", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:42:in `compile_ast'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:50:in `compile_imperative'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:54:in `compile_graph'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:12:in `block in compile_sources'", "org/jruby/RubyArray.java:2486:in `map'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:11:in `compile_sources'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:107:in `compile_lir'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:49:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:215:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:35:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:335:in `block in converge_state'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:141:in `with_pipelines'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:332:in `block in converge_state'", "org/jruby/RubyArray.java:1734:in `each'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:319:in `converge_state'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:166:in `block in converge_state_and_update'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:141:in `with_pipelines'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:164:in `converge_state_and_update'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:90:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/runner.rb:362:in `block in execute'", "/usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/stud-0.0.23/lib/stud/task.rb:24:in `block in initialize'"]}

Help would be appreciated.

Thank you.

The file input puts the filename in path. (Your original grok fails to compile because the match option expects a hash of field => pattern; a bare regex like match => (?[...]) is not valid configuration syntax, which is what the "Expected one of ..." error is complaining about.) This seems to work:

filter {
  grok {
    match => { "path" => "-%{INT:year}-%{INT:month}-%{INT:day}\.csv" }
    add_field => ["filedate", "%{year}-%{month}-%{day}"]
  }
}
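A possible follow-on, not part of the reply above: the filedate built by that add_field is just a string, so if you also want it stored as a real date matching the FileDate field in the mapping, a date filter can parse it. This is a sketch that assumes the digits in the filename are in yy-MM-dd order (17-10-20 is ambiguous):

filter {
  date {
    # Parse the "filedate" string assembled by the grok above.
    # Assumes two-digit year, then month, then day.
    match  => ["filedate", "yy-MM-dd"]
    target => "FileDate"
  }
}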

Thank you, this works! I really appreciate it. In case anyone else is trying to do this, here is my config file:

input {
  file {
    path => "/opt/sample-data/cangenbus-csv/*.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["Name","Number"]
  }

  grok {
    match => { "path" => "-%{INT:year}-%{INT:month}-%{INT:day}.csv" }
    add_field => ["filedate", "%{year}-%{month}-%{day}"]
  }
}

output {
  elasticsearch {
    hosts => "http://10.0.2.15:9200"
    index => "cangenbus-v13"
  }
  stdout {}
}
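One optional tweak for anyone debugging a config like this: the plain stdout {} output prints events in a compact form, while the rubydebug codec pretty-prints every field, which makes it easier to confirm that filedate came out right:

output {
  stdout {
    # Pretty-print each event with all its fields for inspection.
    codec => rubydebug
  }
}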

Similarly, if I want to grok out the filename and set it as a new field called filename, would it be like this?

grok {
  match => { "path" => "{WORD:filename}-%.csv" }
  add_field => ["filename", "%keyword"]
}

Thanks, can you help me with one more please? I am trying to make another grok filter statement to create a new field, filename, and populate it with the filename from the path field. It is not working for me.

grok {
  match => { "path" => "{WORD:filename}-%{INT:year}-%{INT:month}-%{INT:day}.csv" }
  add_field => ["filename", "%{keyword}"]
}

@sconrod, the match is missing a % before {WORD:filename}. I do not know what you are trying to do with the add_field. If the match pattern is corrected, then the filename field gets added.

Hi, thanks. I am trying to add a new field called "filename", which will also show up in the index, and I want just the first part of the filename, without the date, in that field.

So my actual filename (or log name) is Cangenbus-17-10-20.csv, and I want to populate the new field I am creating, called 'filename', with just the first part of the name, which is Cangenbus.

I tried this as well and it isn't working:
grok {
  match => { "path" => " "%{WORD:filename}-%{GREEDYDATA}.csv" }
  add_field => ["filename", "%{filename}"]
}

Use the first one, with the % added

match => {"path" => "%{WORD:filename}-%{INT:year}-%{INT:month}-%{INT:day}.csv"}

Thanks, that works, but I am getting the name twice in the new filename field.

This is my entire grok filter...do I need the second line then?

grok {
  match => { "path" => "%{WORD:filename}-%{INT:year}-%{INT:month}-%{INT:day}.csv" }
  add_field => ["filename", "%{filename}"]
}

No, you do not need the second line.
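So, for anyone following along, the finished grok is just the match (the pattern already captures into filename, so the add_field only duplicated the value; the dot is escaped here, as in the earlier reply):

grok {
  # Capture the leading word as "filename" and the date parts from the path.
  match => { "path" => "%{WORD:filename}-%{INT:year}-%{INT:month}-%{INT:day}\.csv" }
}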

Thank you.

I will remove it. I appreciate the help. If you have time, I have another one open, about a log with binary data in it that will not parse... :0)
