Logstash: ingestion via Grok pattern does not work

Hello everybody,

I want to ingest some CSV and text files into Elasticsearch via Logstash. The problem is that Logstash generates a single field named "message" containing the whole text line. I would like to split that line into several columns. Someone gave me the tip to use a Grok pattern instead of the split-and-mutate approach, so I tried a Grok pattern, but it does not work.

I've seen that the syntax is: %{SYNTAX:SEMANTIC}

Is there a list of SYNTAX keywords somewhere?

Here are two example lines from the files I want to ingest:

1234;unmot

1234;un mot;;UN AUTRE MOT;PA ;;14/02567/AB/167;
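
For the longer line, I imagine something along these lines would be needed (an untested sketch; the column names col0 to col6 are just placeholders I made up):

grok {
  match => {
    # DATA matches anything (including nothing) up to the next literal ";"
    "message" => "%{NUMBER:col0};%{DATA:col1};%{DATA:col2};%{DATA:col3};%{DATA:col4};%{DATA:col5};%{DATA:col6};"
  }
}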

This is the current Logstash configuration:

input {
  file {
    path => "C:/logs/*.txt"
    start_position => "beginning"
    sincedb_path => "NULL"
  }
}
filter {
  grok {
    match => {
      "message" => "%{NUMBER:col0};%{WORD:col1}"
    }
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}

This is the log content:

[2020-02-13T14:35:46,738][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-02-13T14:35:46,845][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.5.1"}
[2020-02-13T14:35:48,795][INFO ][org.reflections.Reflections] Reflections took 38 ms to scan 1 urls, producing 20 keys and 40 values 
[2020-02-13T14:35:50,239][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2020-02-13T14:35:50,423][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2020-02-13T14:35:50,471][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-02-13T14:35:50,479][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-02-13T14:35:50,547][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2020-02-13T14:35:50,607][INFO ][logstash.outputs.elasticsearch][main] Using default mapping template
[2020-02-13T14:35:50,703][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1, "index.lifecycle.name"=>"logstash-policy", "index.lifecycle.rollover_alias"=>"logstash"}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-02-13T14:35:50,835][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.specialized.RubyArrayOneObject) has been create for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-02-13T14:35:50,843][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["C:/logstash-file-read.conf"], :thread=>"#<Thread:0x154d1065 run>"}
[2020-02-13T14:35:51,507][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-02-13T14:35:51,559][INFO ][filewatch.observingtail  ][main] START, creating Discoverer, Watch with file and sincedb collections
[2020-02-13T14:35:51,567][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-02-13T14:35:52,227][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

Thank you for your help.

Hi @romainfoulono,

The Grok pattern seems fine. I have had problems getting Logstash to "pick up" files, especially if they already exist when Logstash starts...

Lately I have used Filebeat anytime I want to ingest from a file. There is even a CSV module for Filebeat :)

There is also a CSV filter for Logstash
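
For your sample lines, something like this might work (untested, and the column names are made up):

filter {
  csv {
    # use ";" as the field separator instead of the default ","
    separator => ";"
    columns => ["col0", "col1", "col2", "col3", "col4", "col5", "col6"]
  }
}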

Hope that helps,
AB

There is a patterns directory somewhere under the Logstash install directory that contains multiple text files defining patterns. Also...

sincedb_path => "NULL"

If you do not want the file input to persist the sincedb to disk when it stops you should use the value "NUL", not "NULL".
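
That is, something like this (your input block with just that value changed):

input {
  file {
    path => "C:/logs/*.txt"
    start_position => "beginning"
    # "NUL" is the Windows null device, so no sincedb is persisted to disk
    sincedb_path => "NUL"
  }
}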

Hi both of you
@Badger

I guess this is the path you were talking about:

C:\logstash-7.5.1\vendor\bundle\jruby\2.5.0\gems\logstash-patterns-core-4.1.2\patterns

In this folder, there is a file named grok-patterns.
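
For reference, it contains definitions like these (a few entries as they appear in my copy; other versions may differ):

WORD \b\w+\b
INT (?:[+-]?(?:[0-9]+))
NUMBER (?:%{BASE10NUM})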

@A_B
Okay, and I think I'm going to use conditionals to handle the different kinds of files.
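
Something like this is what I have in mind (untested, and the filename tests are placeholders):

filter {
  # route each kind of file to its own parser based on the source path
  if [path] =~ "fileA" {
    grok { match => { "message" => "%{NUMBER:col0};%{WORD:col1}" } }
  } else if [path] =~ "fileB" {
    csv { separator => ";" }
  }
}
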
I have a question: can Filebeat ingest an entire file into ES as a single document, rather than line by line? I mean, if I need to ingest an entire log file, can ES treat all of its lines as one record containing the entire content of the file? Is that possible?

Thank you

This depends a bit on the structure of the file... We do handle many "multiline" logs, like Java stack traces, with Filebeat. I'm not sure if there is any limit on single document size in ES...
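
On the Logstash side, one trick I have seen (untested here) is a multiline codec whose pattern never matches real data, so every line is appended to the same event; the pattern string below is just a placeholder:

input {
  file {
    path => "C:/logs/*.txt"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => multiline {
      # no line matches this, so with "negate" every line joins the previous event
      pattern => "^PATTERN_THAT_NEVER_MATCHES"
      negate => true
      what => "previous"
      # flush the accumulated event after 1 second of inactivity
      auto_flush_interval => 1
    }
  }
}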
