Logstash read csv column problem


(Chi Hong Chen) #1

I added autodetect_column_names to the csv block in the filter section so that the header row is detected automatically, but the index in Kibana shows field names such as column2, column3, ... What is the cause?

config

input {
        file {
                path => ["/usr/share/logstash/DataSet/TBrain_IPS.csv"]
                close_older => 3600
                codec => "plain"
                delimiter => "\n"
                discover_interval => 30
                enable_metric => true
                id => "ips"
                max_open_files => 5
                sincedb_path => "/dev/null"
                sincedb_write_interval => 15
                start_position => "beginning"
                stat_interval => 7200
                tags => "ips"
                type => "ips"
        }
}
filter {
        csv {
                separator => ","
                autodetect_column_names=> true
                skip_empty_columns=> false
                skip_empty_rows=> false
                skip_header=> false
                periodic_flush => true
                id => "csv"
        }
        if [tags] == "ips" {
                mutate {
                        convert => {
                                "event_protocol_id" => "integer"
                        }
                        rename => {
                                "event_rule_reference" => "event_rule_referenceCVE"
                        }
                        split => {
                                "event_rule_reference" => ";"
                        }
                }

        }

}
output {
        elasticsearch {
                hosts => "elasticsearch:9200"
                document_type => "ips-csv"
                index => "ips-%{+YYYY.MM.dd}"
        }
        stdout {
                 codec => rubydebug
        }
}


(Dan Hermann) #2

The autodetect_column_names option on the CSV filter works reliably only if you set the number of pipeline worker threads to 1. With more than one worker thread, there is a race condition in which an indeterminate row may be selected as the header row. There is a bug filed against the CSV filter for that, but a fix is very difficult.
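
As a rough sketch of the two workarounds: either run the pipeline with a single worker (pipeline.workers: 1 in logstash.yml, or -w 1 on the command line) and keep autodetect_column_names, or drop autodetection and list the columns explicitly so that no worker has to guess which row is the header. The filter below illustrates the second option. The two column names are taken from the mutate block in the config above; their order and the absence of other columns are assumptions, so the list has to be replaced with the real header of TBrain_IPS.csv.

```
filter {
        csv {
                separator => ","
                # Assumption: replace this list with the full header row of the
                # CSV file, in file order. Only these two names are known from
                # the mutate block; their positions here are hypothetical.
                columns => ["event_protocol_id", "event_rule_reference"]
                # With explicit columns, skip_header drops any row whose values
                # exactly match the names listed in columns (the header line).
                skip_header => true
                id => "csv"
        }
}
```

With explicit columns the field names no longer depend on which row a worker happens to see first, so this variant should not need the worker count to be limited.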


(Chi Hong Chen) #3

So for now my only options are to specify the columns explicitly or to set pipeline.workers to 1?
Is there a better way to solve this?

