How to handle duplicate column names while parsing CSV?


(Saket Kumar) #1

First question:

My log contains a duplicate field name, and both occurrences carry the same values. While parsing the CSV file I want to keep only one copy of that field and its values.

e.g. Fields_A |Fields_B| Fields_A| Fields_C

I just want Fields_A| Fields_B| Fields_C to be output to Elasticsearch.

Second question:

How do I assign a string to fields that have null values? I want to replace null field values with some string and output them to Elasticsearch.

Any help is much appreciated.

Thanks.


(Magnus Bäck) #2

My log contains a duplicate field name, and both occurrences carry the same values. While parsing the CSV file I want to keep only one copy of that field and its values.

e.g. Fields_A |Fields_B| Fields_A| Fields_C

I just want Fields_A| Fields_B| Fields_C to be output to Elasticsearch.

Just delete the field you don't want?

filter {
  csv {
    columns => ["fieldA", "fieldB", "fieldA_duplicate", "fieldC"]
    ...
    remove_field => ["fieldA_duplicate"]
  }
}
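To illustrate why this works, here is a minimal plain-Ruby sketch of positional column naming, outside Logstash. The sample line, the `parse_line` helper, and the naive pipe split (no quoting support) are assumptions for illustration only; this is not how the csv filter is implemented.

```ruby
# Column names are assigned by position, so the second copy of the
# column gets its own name and can be dropped on its own.
COLUMNS = ["fieldA", "fieldB", "fieldA_duplicate", "fieldC"].freeze

# Hypothetical helper: naive split, assumes no quoted values and no
# separator characters inside a value.
def parse_line(line, columns, remove: [])
  values = line.split("|", -1).map(&:strip)
  event = columns.zip(values).to_h          # pair each name with its value
  remove.each { |name| event.delete(name) } # mimic remove_field
  event
end

# Hypothetical sample line with the duplicated value in the third column.
event = parse_line("1| 2 |1|3", COLUMNS, remove: ["fieldA_duplicate"])
puts event.inspect
```

Only the entry named `fieldA_duplicate` is deleted; `fieldA` keeps its value because it is a different key.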

How do I assign a string to fields that have null values? I want to replace null field values with some string and output them to Elasticsearch.

The best I can come up with is a ruby filter:

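ruby {
  # Replace every nil field value with a placeholder string.
  # Logstash 5.0 and later use the event.set API; older releases wrote event[k] = ...
  code => "
    event.to_hash.each_pair { |k, v|
      event.set(k, 'replacement string') if v.nil?
    }
  "
}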
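The same replacement logic can be tried on an ordinary Ruby hash before putting it into a pipeline. This is only a sketch: `fill_nils` and the sample data are hypothetical, and a plain Hash stands in for the Logstash event.

```ruby
# Hypothetical helper: returns a copy of the hash in which every nil
# value is replaced by the given string. Non-nil values are untouched.
def fill_nils(fields, replacement)
  fields.transform_values { |v| v.nil? ? replacement : v }
end

# Hypothetical parsed row: fieldB is nil, fieldC is an empty string.
sample = { "fieldA" => "1", "fieldB" => nil, "fieldC" => "" }
result = fill_nils(sample, "replacement string")
puts result.inspect
```

Note that an empty string is not nil in Ruby, so `fieldC` is left as it is. If empty values should also be replaced, the condition would need to check for them as well.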

(Saket Kumar) #3

Thanks for the reply, Magnus.

Just one doubt about the duplicate field deletion:

In my case both fields have the same header name. So deleting as suggested above with remove_field => ["fieldA_duplicate"] will not remove both columns, correct?

regards


(Magnus Bäck) #4

But does Logstash even pick up the header names? You're naming them with the columns parameter, no? This'll be easier if you show us what an actual message (as parsed by Logstash) looks like.
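
If the names for the columns parameter have to be derived from a header line that repeats a name, one possible approach is to rename the repeats so each column gets a unique name. The sketch below is plain Ruby with a hypothetical `unique_columns` helper and a made-up `_duplicateN` suffix; it assumes a simple pipe-separated header without quoting.

```ruby
# Hypothetical helper: returns [names, duplicates], where names is the
# header with repeated names made unique and duplicates lists the
# renamed repeats (candidates for remove_field).
def unique_columns(header, separator = "|")
  seen = Hash.new(0)
  duplicates = []
  names = header.split(separator).map(&:strip).map do |name|
    seen[name] += 1
    next name if seen[name] == 1            # first occurrence keeps its name
    renamed = "#{name}_duplicate#{seen[name] - 1}"
    duplicates << renamed
    renamed
  end
  [names, duplicates]
end

# Header taken from the example in the first post.
columns, to_remove = unique_columns("Fields_A |Fields_B| Fields_A| Fields_C")
puts columns.inspect
puts to_remove.inspect
```

The first list could be used for columns and the second for remove_field. The sketch does not guard against a renamed column colliding with a name that already exists in the header.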

