Data transformation for date field


(Gautham) #1

Hi All,

We are trying to upload a csv file where it has a date field, so using date filter we are converting the field from string to date, the problme we are facing here is,
few fields has "proper date" few fields are "NA" so we planned to use a IF condition in logstash to convert only date field and remove the NA field but it was not successful.

Here is my config file:

input {
  file {
    path => "/opt/installables/csv/data.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => plain {
                    charset => "ISO-8859-1"
            }
  }
}
filter {
  csv {
      separator => ","
      columns => ["Asset No","Asset type","Asset in Store","Critical Asset","Assigned Flag","Assigned to","Licence till","Software Type","Software Provider","Device","Model","Manufacturer","Version","Installation Date","Licence Expired","Under Warranty","End of life","Under AMC","Compliance","Decommissioned","Unpatched Software"]
  }
if [Installation Date] == "NA" {
   mutate{
   remove_field => ["Installation Date"]
  }
 }
else {
date {
   match => ["Installation Date", "YYYY"]
   }
 }
}
output {
   elasticsearch {
     hosts => "1.1.1.6:9200"
     index => "assetmgmt"
  }

Instead of sending to elasticsearch if i use rubydebug i can see the results which are expected, looks like there is a problem while indexing.

have done proper mapping in kibana as well.
This is wat i did in kibana

PUT assetmgmt
{
  "mappings": {
    "doc": { 
      "properties": { 
  "Installation Date":  {
          "type":   "date", 
          "format": "year"}
   }
  }
 }
}

Instead of "year" i had tried "YYYY" also, didnt help

Any advice please, i'm breaking my head on this from past one week.

Thanks
Gauti


(Alexander Reelsen) #2

can you please check your logstash logfiles and also attach any possible error messages there? Can you also include the rubydebug output so people can compare things?

Also, I personally would refrain from using whitespaces and capital letters in field names and just go with snake case, but that might just be me.

--Alex


(Gautham) #3

@spinscale here is the rubydebug output

  {
                 "path" => "/opt/installables/csv/assetmgmt.csv",
            "Assigned Flag" => "Yes",
              "Assigned to" => "RMZ",
                  "message" => "Cell58263,Cell Phones,No,No,Yes,TCS,NA,Cell Phones,2017,MI,MI4,NA,No,NA,NA,NA,No\r",
          "Licence Expired" => "NA",
               "@timestamp" => 2018-08-30T05:05:40.246Z,
        "Installation Date" => "NA",
           "Under Warranty" => "NA",
                   "Device" => "MI",
            "Software Type" => "Cell Phones",
              "End of life" => "No",
                     "host" => "0.0.0.0",
               "Asset type" => "Cell Phones",
                  "Version" => "No",
        "Software Provider" => "2017",
                 "@version" => "1",
           "Critical Asset" => "No",
           "Asset in Store" => "No",
                    "Model" => "MI4",
                 "Asset No" => "Cell58263",
             "Licence till" => "NA",
             "Manufacturer" => "NA" 
     }
{
               "Version" => "NA",
     "Installation Date" => "2018",
     "Software Provider" => "Vmware",
                 "Model" => "NA",
                  "path" => "/opt/installables/csv/data.csv",
            "Asset type" => "Software",
        "Asset in Store" => "No",
        "Decommissioned" => "NA",
           "End of life" => "NA",
       "Licence Expired" => "No",
          "Manufacturer" => "NA",
            "@timestamp" => 2017-12-31T18:30:00.000Z,
           "Assigned to" => "TCS",
              "Asset No" => "VM8563",
                "Device" => "NA",
             "Under AMC" => "NA",
               "message" => "VM8563,Software,No,NA,Yes,TCS,2020,Vcentre,Vmware,NA,NA,NA,NA,2018,No,NA,NA,NA,Yes,NA,No\r",
          "Licence till" => "2020",
    "Unpatched Software" => "No",
         "Assigned Flag" => "Yes",
            "Compliance" => "Yes",
                  "host" => "0.0.0.0"
        "Critical Asset" => "NA",
              "@version" => "1",
        "Under Warranty" => "NA",
         "Software Type" => "Vcentre"
}

There are no errors in the log files, if i start sending the data to elasticsearch whereever "Installation Date" is NA only those are getting loaded which ever has date in it is not loaded into elasticsearch.

Any suggestions?

Thanks
Gauti


(Alexander Reelsen) #4

can you try a manual dev-tools or curl call and index a document into that index? does it work using NA or a year?


(Gautham) #5

@spinscale while uploading document through dev-tools if i specify "Installation date" as some year (2018) its successfully updating but if i mention "Installation Date" as "NA" then i'm getting the below error

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse [Installation Date]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse [Installation Date]",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Invalid format: \"NA\""
    }
  },
  "status": 400
}

Thanks
Gauti


(Gautham) #6

Bingoooo............Got the solution..........
You have to mention "ignore_malformed": true in your mappings in kibana.

PUT snow1
{
  "mappings": {
    "doc": { 
      "properties": { 
  "result.closed_at":  {
          "type":   "date", 
          "format": "YYYY-MM-dd HH:mm:ss",
          "ignore_malformed": true
    }
   }
  }
 }
}

Thanks
Gauti


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.