Problems when indexing CSV files

Hello,

I am using ELK for the first time to process CSV files, and I run into a problem when I run Logstash with this configuration file:

input {
	file { 
		path => "/home/user/network_capt/csv/*.csv"
		start_position => "beginning"
		sincedb_path => "/dev/null"
	}
}

filter {
	csv {
		columns => [ "frame.number","frame.time_epoch","frame.time","frame.len","ip.src","ip.dst","ip.proto","tcp.srcport","tcp.dstport","udp.srcport","udp.dstport","_ws.col.Protocol","_ws.col.Info","tcp.window_size","tcp.flags","tcp.flags.syn","tcp.flags.ack","tcp.flags.fin","tcp.flags.push","tcp.flags.reset","tcp.flags.urg","tcp.analysis.retransmission","icmp.type","icmp.code","dns.qry.name","dns.qry.type"]
		separator => ";"
		skip_empty_columns => "true"
		skip_empty_rows => "true"
		
	}
	mutate {
		add_field => { "hostname" => "%{ip.src}"}
	}
	date {
		match => ["timestamp", "UNIX", "UNIX_MS"]
		locale => "en"
	}
	geoip {
		source => "ip.src" 
		target => "ip.src"
	}
	geoip {
		source => "ip.dst" 
		target => "ip.dst" 
	}
}

output { 
	elasticsearch {
		hosts => "localhost:9200"
		index => "honeypot-%{+YYYYMMdd}"
	}
	stdout {}
}

Here's what I get:

Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2019-05-24 16:03:55.748 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2019-05-24 16:03:55.765 [LogStash::Runner] runner - Starting Logstash {"logstash.version"=>"7.1.0"}
[INFO ] 2019-05-24 16:04:04.562 [[main]-pipeline-manager] elasticsearch - Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[WARN ] 2019-05-24 16:04:04.806 [[main]-pipeline-manager] elasticsearch - Restored connection to ES instance {:url=>"http://localhost:9200/"}
[INFO ] 2019-05-24 16:04:05.054 [[main]-pipeline-manager] elasticsearch - ES Output version determined {:es_version=>7}
[WARN ] 2019-05-24 16:04:05.058 [[main]-pipeline-manager] elasticsearch - Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[INFO ] 2019-05-24 16:04:05.103 [[main]-pipeline-manager] elasticsearch - New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[INFO ] 2019-05-24 16:04:05.128 [Ruby-0-Thread-5: :1] elasticsearch - Using default mapping template
[INFO ] 2019-05-24 16:04:05.135 [[main]-pipeline-manager] geoip - Using geoip database {:path=>"/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-geoip-6.0.1-java/vendor/GeoLite2-City.mmdb"}
[INFO ] 2019-05-24 16:04:05.183 [[main]-pipeline-manager] geoip - Using geoip database {:path=>"/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/logstash-filter-geoip-6.0.1-java/vendor/GeoLite2-City.mmdb"}
[INFO ] 2019-05-24 16:04:05.242 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, :thread=>"#<Thread:0x6d4ac5e1 run>"}
[INFO ] 2019-05-24 16:04:05.342 [Ruby-0-Thread-5: :1] elasticsearch - Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[INFO ] 2019-05-24 16:04:05.827 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2019-05-24 16:04:06.023 [Ruby-0-Thread-1: /usr/share/logstash/lib/bootstrap/environment.rb:6] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[INFO ] 2019-05-24 16:04:06.051 [[main]<file] observingtail - START, creating Discoverer, Watch with file and sincedb collections
[INFO ] 2019-05-24 16:04:06.932 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600}
[WARN ] 2019-05-24 16:04:09.679 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"honeypot-20190524", :_type=>"_doc", :routing=>nil}, #<LogStash::Event:0x76e5053a>], :response=>{"index"=>{"_index"=>"honeypot-20190524", "_type"=>"_doc", "_id"=>"0Csn6moB1EZoSNMt-Ht2", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"Can't merge a non object mapping [tcp.flags] with an object mapping [tcp.flags]"}}}}
[WARN ] 2019-05-24 16:04:09.684 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"honeypot-20190524", :_type=>"_doc", :routing=>nil}, #<LogStash::Event:0x17e6b7af>], :response=>{"index"=>{"_index"=>"honeypot-20190524", "_type"=>"_doc", "_id"=>"0Ssn6moB1EZoSNMt-Ht2", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"Can't merge a non object mapping [tcp.flags] with an object mapping [tcp.flags]"}}}}

As a result of these warnings, this is what I see:

The index is created, but it appears to be empty. Its mapping is also empty, which puzzles me.

In addition, when I try to create an index pattern, the new index is not listed, so I cannot create a pattern that matches it.

I would like to visualize the data from my CSV files, but the problem described above is blocking me. I am a beginner with ELK and only started using it today, so please do not hesitate to give me advice.

Please excuse the quality of this post (and my English); I am short on time and need to find a solution fairly quickly.

Thank you in advance to all for your help.

I answered you in the French version of this post, which you posted yesterday evening: Problèmes lors d'indexation de fichiers CSV

There is no need to post it twice. Read this and specifically the "Also be patient" part.

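tcp.flags cannot hold a concrete value.
Because you also have fields such as tcp.flags.reset,
Elasticsearch interprets tcp.flags as an object.
However, in your Logstash output tcp.flags appears to have a concrete text value, which is why Elasticsearch reports that it cannot merge a non-object mapping with an object mapping.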
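One possible way around this, given as an untested sketch rather than a confirmed fix: move the concrete value to its own leaf under tcp.flags, so that tcp.flags is only ever an object. The snippet assumes the csv filter has already run and produced a top-level field literally named "tcp.flags"; the new name "tcp.flags.value" is a hypothetical choice, and any leaf name not already used under tcp.flags should do.

```
filter {
	mutate {
		# Assumption: "tcp.flags" is the top-level field created by the csv filter.
		# "tcp.flags.value" is an illustrative name, not something from the original config.
		rename => { "tcp.flags" => "tcp.flags.value" }
	}
}
```

Changing the entry "tcp.flags" in the columns list of the csv filter to the new name should have the same effect without the extra mutate step.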

Thanks.
