Replace the empty fields in csv


(Chander Mohan) #1

While ingesting data (a CSV with 20+ columns), the pipeline threw an error. To work around this I converted the file to .txt and ingested that instead. That worked, though some columns' values overlap with or got swapped into other columns.

I suspect this happens because of empty fields in the data.

How can I replace the empty fields?


(Magnus Bäck) #2

Please give concrete examples instead. What does your input file look like? What does your configuration look like? How does Logstash parse it? Use a stdout { codec => rubydebug } output.
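
For example, a minimal pipeline that prints every event to the console could look like this (the path is a placeholder; adjust it to your own file):

    input {
      file {
        path => "/path/to/input.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
      }
    }
    output {
      # print each event with all its fields so we can see how Logstash parsed it
      stdout { codec => rubydebug }
    }

Pasting the rubydebug output for one problematic line makes it much easier to see where the columns shift.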


(Chander Mohan) #3

My Logstash config file:

    input {
      file {
        path => "C:\Users\1480587\Documents\Chander\Elastic\Data\Inc_Details.txt"
        start_position => "beginning"
        sincedb_path => "nul"
      }
    }

    filter {
      mutate {
        gsub => ["[message]", "\s", "0"]
      }
      csv {
        separator => ","
        #skip_empty_columns => true
        columns => [ "Month", "Quarter", "Year", "INCIDENT_ID", "REQ_ID", "COUNTRY", "SERVICE", "ASSIGNED_GROUP", "STATUS", "SERVICE_TYPE", "PRIORITY", "URGENCY", "REPORTED_DATE", "SUBMIT_DATE", "LAST_RESOLVED_DATE", "LAST_MODIFIED_DATE", "CLOSED_DATE", "RESPONDED_DATE", "OPS_CATEGORIZATION_TIER_1", "OPS_CATEGORIZATION_TIER_2", "OPS_CATEGORIZATION_TIER_3", "PRODUCT_CATEGORIZATION_TIER_1", "PRODUCT_CATEGORIZATION_TIER_2", "PRODUCT_CATEGORIZATION_TIER_3", "SUMMARY", "SLM_STATUS", "ASSIGNED_SUPPORT_ORGANIZATION", "FIRST_NAME", "LAST_NAME", "OWNER_GROUP", "OWNER_SUPPORT_ORGANIZATION", "DIRECT_CONTACT_COMPANY", "RESOLUTION", "RESOLUTION_CATEGORY", "RESOLUTION_CATEGORY_TIER_2", "RESOLUTION_CATEGORY_TIER_3", "CLOSURE_PRODUCT_CATEGORY_TIER1", "CLOSURE_PRODUCT_CATEGORY_TIER2", "CLOSURE_PRODUCT_CATEGORY_TIER3", "SLA_RESUME_MIN", "SLA_GOAL", "INC_SLA", "SLA_OVERALLSTARTTIME", "SLA_OVERALLSTOPTIME" ]
      }
      mutate { convert => ["Month", "integer"] }
      mutate { convert => ["Quarter", "integer"] }
      mutate { convert => ["Year", "integer"] }
      mutate { convert => ["INCIDENT_ID", "integer"] }
      mutate { convert => ["REQ_ID", "integer"] }
      mutate { convert => ["SLA_RESUME_MIN", "integer"] }
    }

    output {
      elasticsearch {
        hosts => "localhost"
        index => "reports"
        #document_type => "Inc Test"
      }
      file {
        path => "C:\Users\1480587\Documents\Chander\Elastic\Data\H-Out.csv"
        codec => line { format => "custom format: %{message}" }
      }
      stdout {
        codec => rubydebug
      }
    }
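
One thing worth noting about the filter above: gsub with the pattern \s replaces every whitespace character in the line, including the spaces inside field values and timestamps, which mangles the data before the csv filter ever sees it. A narrower sketch that only fills empty fields (a comma directly followed by another comma) might look like this; the lookahead keeps runs of consecutive empty fields working, though an empty field at the very end of the line is not covered:

    filter {
      mutate {
        # replace each empty field (",," becomes ",0,") without touching
        # whitespace inside field values
        gsub => ["message", ",(?=,)", ",0"]
      }
    }

Alternatively, uncommenting skip_empty_columns => true in the csv filter makes it omit empty columns from the event instead of indexing them as empty strings.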

(Magnus Bäck) #4

Please answer the remainder of my questions.


(Chander Mohan) #5

Objective: ingest the data into Logstash to derive metrics and dashboards.
Below is my input in .csv, which I am converting from Excel to CSV.
Logstash parses the message, removes all whitespace, converts fields to the right types (integer etc.), and writes an output file for review.

|Month|Quarter|Year|INCIDENT_ID|REQ_ID|COUNTRY|SERVICE|ASSIGNED_GROUP|STATUS|STATUS_REASON|SERVICE_TYPE|PRIORITY|URGENCY|IMPACT|REPORTED_SOURCE|REPORTED_DATE|SUBMIT_DATE|LAST_RESOLVED_DATE|LAST_MODIFIED_DATE|CLOSED_DATE|RESPONDED_DATE|OPS_CATEGORIZATION_TIER_1|OPS_CATEGORIZATION_TIER_2|OPS_CATEGORIZATION_TIER_3|
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|May|Q2|2017|INC000004412614|REQ000006101332|India|eBBS|GBL-ISCI-DETICA-CDD|Closed|Automated Resolution Reported|Infrastructure Event|Low|4-Low|3-Moderate/Limited|External Escalation|5/1/2017 0:23|5/1/2017 0:23|5/1/2017 5:46|5/16/2017 2:21|5/16/2017 2:00|5/1/2017 0:23|APPLICATION|DETICA-TS|Data Extract/Report Related query|
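
Given rows like the one above, the date columns can also be turned into real timestamps rather than strings, using a date filter; a sketch for one column, assuming the M/d/yyyy H:mm format shown in the sample data:

    filter {
      date {
        # parse e.g. "5/1/2017 0:23" and store the result back in the same field
        match  => ["REPORTED_DATE", "M/d/yyyy H:mm"]
        target => "REPORTED_DATE"
      }
    }

Proper date fields make time-based metrics and dashboards in Kibana much simpler than integer or string columns.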

(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.