Using a separator other than a comma in the csv filter

I have files that have either a semicolon or a pipe as the separator. The values within the fields can contain commas, so I can't use a comma as the separator. I have tried using the csv filter and changing the separator. My conf file and the error message are below.

input {
        file {
         path => "C:/Users/dclar/Documents/ESTK/material/*.csv"
         start_position => "beginning"
         sincedb_path => "NUL"
        }
    }
    filter {
        csv {
    	 separator => "|"
    	 columns => ["Host"|"AC"|"UUID"|"GUID"|"Bn_Number"|"ID"|"INCREMENT"|"NUMBER"|"UTC"|"DESCRIPTION"|"need"|"Reported"|"Percentage"|"Percentage"]
        }
    }
    output {
       elasticsearch {
         action => "index"
         hosts => "http://localhost:9200"
         index => "brute_force"
        }
    }
C:\logstash-7.8.0\bin>logstash
Sending Logstash logs to C:/logstash-7.8.0/logs which is now configured via log4j2.properties
[2020-07-06T19:25:53,353][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.8.0", "jruby.version"=>"jruby 9.2.11.1 (2.5.7) 2020-03-25 b1f55b1a40 Java HotSpot(TM) Client VM 25.251-b08 on 1.8.0_251-b08 +indy +jit [mswin32-i386]"}
[2020-07-06T19:25:54,098][ERROR][logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:af8, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"{\", \",\", \"]\" at line 11, column 21 (byte 223) after filter {\r\n    csv {\r\n\t separator => \"|\"\r\n\t columns => [\"Host\"", :backtrace=>["C:/logstash-7.8.0/logstash-core/lib/logstash/compiler.rb:58:in `compile_imperative'", "C:/logstash-7.8.0/logstash-core/lib/logstash/compiler.rb:66:in `compile_graph'", "C:/logstash-7.8.0/logstash-core/lib/logstash/compiler.rb:28:in `block in compile_sources'", "org/jruby/RubyArray.java:2577:in `map'", "C:/logstash-7.8.0/logstash-core/lib/logstash/compiler.rb:27:in `compile_sources'", "org/logstash/execution/AbstractPipelineExt.java:181:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:67:in `initialize'", "C:/logstash-7.8.0/logstash-core/lib/logstash/java_pipeline.rb:43:in `initialize'", "C:/logstash-7.8.0/logstash-core/lib/logstash/pipeline_action/create.rb:52:in `execute'", "C:/logstash-7.8.0/logstash-core/lib/logstash/agent.rb:342:in `block in converge_state'"]}
[2020-07-06T19:25:54,360][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-07-06T19:25:59,398][INFO ][logstash.runner          ] Logstash shut down.

C:\logstash-7.8.0\bin>

The separator option sets the separator expected in each event. So you would set that if the lines look like

someHost|Foo,bar|baz|someGuid...

The columns option of a csv filter expects an array of strings, and that means they have to be comma separated.

columns => ["Host", "AC", "UUID", "GUID" ... ]
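
Putting the two together, the filter section from the original config could be rewritten like this (the separator stays a pipe, but the column names are now comma-separated; the names are taken verbatim from the config above, including the duplicate "Percentage", which you may want to rename so the two fields don't collide):

```
filter {
    csv {
        separator => "|"
        columns => ["Host", "AC", "UUID", "GUID", "Bn_Number", "ID", "INCREMENT", "NUMBER", "UTC", "DESCRIPTION", "need", "Reported", "Percentage", "Percentage"]
    }
}
```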

Thank you.

Will the column headers be picked up automatically if I don't list them in the array, but they are in the first line of the CSV file?

Badger,

Thank you, and sorry, it took a moment for what you said to sink in. The columns option is an array, so it is not written using the chosen separator. Many thanks.

If your file has the column names in the first line, and that line uses the same separator as the other events, then the csv filter can handle that using the autodetect_column_names option.
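
With that option the columns array can be dropped entirely. A minimal sketch, assuming the first line of each file holds the pipe-separated column names:

```
filter {
    csv {
        separator => "|"
        autodetect_column_names => true
    }
}
```

Note that the header line has to be the first event the filter sees, so this option is generally used with a single pipeline worker to keep event ordering deterministic.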
