Issue parcing csv file using Input csv plugin - ELK 5.5.0

elkuserh · September 25, 2017, 4:49pm

I am trying to breakdown a csv file with incident data using the csv input filter and have run into the following error. I had used a multiline codec to combine multiple lines into a single line (multiple lines within a single column of the csv).

The config file i had come up with is

input {
        file {
                path => "/home/elkuser/KP/IN_DATA/incident/sample/KP_Incident_test2.csv"
                start_position => "beginning"
                sincedb_path => "/dev/null"
                codec => multiline {
                        pattern => "^(INC)"
                        negate => "true"
                        what => "previous"
                        max_lines => 4000
                }
              }
}


filter{
        csv{
                columns => ["INCIDENTNUM","CREATEDTTM","FULLNAME","LASTRESOLVEDDTTM","CLOSEDTTM","NAME","STATUS","DETAIL","SUMMARY","#_DAYS_REPORT_VS_RESOLVED_DT","#_HOURS_REPORT_VS_RESOLVED_DT","#_DAYS_REPORTED_VS_CLOSED_DT","TIER1","TIER2","TIER3","TIER4","RESOLUTION","LOGINID","LOGINID","SRCMODDTTM","REPORTDTTM","SERVICE_TYPE","STATUS_REASON","OWNER_GROUP","CATEGORIZATION_TIER_1","CATEGORIZATION_TIER_2","CATEGORIZATION_TIER_3","RESOLUTION_CATEGORY","RESOLUTION_CATEGORY_TIER_2","RESOLUTION_CATEGORY_TIER_3","PRIORITY","CI"]
                separator => ","
                convert => { "#_DAYS_REPORT_VS_RESOLVED_DT" => "integer" "#_HOURS_REPORT_VS_RESOLVED_DT" => "integer" "#_DAYS_REPORTED_VS_CLOSED_DT" => "integer" }
        }
}

output {
        file{
              path => "/home/elkuser/KP/OUT_DATA/H-OUT.csv"
              codec => line
        }
        stdout {
                        codec => rubydebug
                        }
        }

21:56:24.725 [[main]>worker0] WARN logstash.filters.csv - Error parsing csv {:field=>"message", :source=>"INC000015663654,2017-01-21-07.05.34.000000,Gani Vinay,2017-01-24-07.03.41.000000,2017-02-09-02.02.04.000000,WINT CC,Closed,\"Summary: mws002.crdc.ap.org is unreachable. The host has failed to respond to the ping request.\nDate: Jul 29,2016 10:28 CUT\tSeverity: Minor\nResourceId: mws002.crdc.ap.org\nTicketGroup: WINT CC\tCustomerCode: kph\nInstanceId: 094.13.40.244\nInstanceValue: unreachable\nInstanceSituation: Node Status\nComponentType: ComputerSystem\nComponent: NodeAvail\nSubComponent: Ping\nNode: mws002.crdc.ap.org\nAlertKey\",mws002.crdc.ap.org(172.21.40.244) is unreachable. The host has failed to respond,2,71,18,Software,Infrastructure,Utility,Tivoli Monitoring,Server is Decommission WO0000007223546.,E388276,E388276,2017-03-12-16.35.39.000000,2017-01-21-07.05.33.000000,3,\"19,000\",IPCENTER AUTO,,,,Operating System,,,Medium,\r", :exception=>#<CSV::MalformedCSVError: Unquoted fields do not allow \r or \n (line 1).>}

elkuserh · September 25, 2017, 7:02pm

I managed to fix this, by adding a ruby function to the above config!.

ruby {
                init => '
                        #Function for combining multiple Syslog records in to one record.
                        def removecharacters(message)
                               begin
                                        result = "";
                                        result=message.tr! "\s\r\n\t\f", "";
                                        return result;
                               rescue Exception => e
                                        puts "Exception raised Inside the removecharacters method";
                                        puts e.message;
                                        puts e.backtrace.inspect;
                               end
                         end
                        '
                code => "msg = removecharacters(event.get('message'));
                         event.set('message',msg);"
        }

warkolm · September 25, 2017, 10:35pm

Thanks for sharing this solution! The CSV filter doesn't really work well with CRLFs in fields so this is super handy.

Also, we’ve renamed ELK to the Elastic Stack, otherwise Beats feels left out

guyboertje · September 26, 2017, 7:08am

you could use mutate gsub before the csv filter to replace all whitespace with one space.

elkuserh · September 26, 2017, 9:12am

Hi,
Can you help me with a sample of how it can be done with mutate?. I am new to Elastic Stack and hence this would be a good starting point to make better use of the available options.

guyboertje · September 26, 2017, 1:08pm

Try this:

filter {
  mutate {
       gsub => ["[message]", "\s", " "]
     }
  }
  csv{
      columns => ["INCIDENTNUM","CREATEDTTM","FULLNAME","LASTRESOLVEDDTTM","CLOSEDTTM","NAME","STATUS","DETAIL","SUMMARY","#_DAYS_REPORT_VS_RESOLVED_DT","#_HOURS_REPORT_VS_RESOLVED_DT","#_DAYS_REPORTED_VS_CLOSED_DT","TIER1","TIER2","TIER3","TIER4","RESOLUTION","LOGINID","LOGINID","SRCMODDTTM","REPORTDTTM","SERVICE_TYPE","STATUS_REASON","OWNER_GROUP","CATEGORIZATION_TIER_1","CATEGORIZATION_TIER_2","CATEGORIZATION_TIER_3","RESOLUTION_CATEGORY","RESOLUTION_CATEGORY_TIER_2","RESOLUTION_CATEGORY_TIER_3","PRIORITY","CI"]
      separator => ","
      convert => { "#_DAYS_REPORT_VS_RESOLVED_DT" => "integer" "#_HOURS_REPORT_VS_RESOLVED_DT" => "integer" "#_DAYS_REPORTED_VS_CLOSED_DT" => "integer" }
  }
}

system · October 24, 2017, 1:08pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Multilined csv error while parsing Logstash	1	164	November 7, 2023
Logstash file input plugin no output response Logstash	3	408	August 20, 2019
Convert multiline csv data to a single line csv in filter Logstash	7	1227	August 9, 2020
Multiple tables in csv file Logstash	7	4006	December 8, 2018
Splitting csv files send to elasticsearch via logstash http input Logstash	1	295	August 5, 2021

Issue parcing csv file using Input csv plugin - ELK 5.5.0

Related topics