Hello everybody,
I want to ingest some CSV and text files into Elasticsearch via Logstash. The problem is that Logstash generates a single field named "message" containing the whole text line. I would like to split this line into several columns. Someone advised me to use a Grok pattern instead of split and mutate, so I tried one, but it does not work.
I've seen that the syntax is: %{SYNTAX:SEMANTIC}
Is there a list of the available SYNTAX keywords somewhere?
Here are two examples of the test files I want to ingest:
1234;unmot
1234;un mot;;UN AUTRE MOT;PA ;;14/02567/AB/167;
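Since the fields are separated by semicolons, the csv filter might be an alternative to grok here (a different technique from the one I was advised to use). This is only a minimal sketch that I have not tested: the names col0 and col1 are illustrative, and as far as I understand any further columns would get auto-generated names.

```
filter {
  # Split "message" on ";" and name the first two columns explicitly.
  csv {
    separator => ";"
    columns => ["col0", "col1"]
  }
}
```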
This is the current Logstash configuration:
input {
  file {
    path => "C:/logs/*.txt"
    start_position => "beginning"
    sincedb_path => "NULL"
  }
}
filter {
  grok {
    match => {
      "message" => "%{NUMBER:col0};%{WORD:col1}"
    }
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}
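One thing I noticed about the pattern above: WORD only matches a run of word characters, so on the second sample line it would capture just "un" from "un mot". A variant I am considering is sketched below. It is an assumption on my part, not a tested configuration: it replaces WORD with an inline named capture that takes everything up to the next semicolon (or a trailing carriage return), and it anchors the match at the start of the line.

```
filter {
  # col0: leading number; col1: all characters up to the next ";" or "\r".
  grok {
    match => {
      "message" => "^%{NUMBER:col0};(?<col1>[^;\r]*)"
    }
  }
}
```

With this variant I would expect col1 to contain "unmot" for the first line and "un mot" for the second.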
This is the log content:
[2020-02-13T14:35:46,738][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-02-13T14:35:46,845][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"7.5.1"}
[2020-02-13T14:35:48,795][INFO ][org.reflections.Reflections] Reflections took 38 ms to scan 1 urls, producing 20 keys and 40 values
[2020-02-13T14:35:50,239][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2020-02-13T14:35:50,423][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2020-02-13T14:35:50,471][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-02-13T14:35:50,479][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-02-13T14:35:50,547][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2020-02-13T14:35:50,607][INFO ][logstash.outputs.elasticsearch][main] Using default mapping template
[2020-02-13T14:35:50,703][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1, "index.lifecycle.name"=>"logstash-policy", "index.lifecycle.rollover_alias"=>"logstash"}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-02-13T14:35:50,835][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.specialized.RubyArrayOneObject) has been create for key: cluster_uuids. This may result in invalid serialization. It is recommended to log an issue to the responsible developer/development team.
[2020-02-13T14:35:50,843][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["C:/logstash-file-read.conf"], :thread=>"#<Thread:0x154d1065 run>"}
[2020-02-13T14:35:51,507][INFO ][logstash.javapipeline ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-02-13T14:35:51,559][INFO ][filewatch.observingtail ][main] START, creating Discoverer, Watch with file and sincedb collections
[2020-02-13T14:35:51,567][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-02-13T14:35:52,227][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
Thank you for your help.