Problem with date format during a CSV import

Hi all,
Sorry in advance for my bad English and my limited knowledge of ELK.

I need to regularly import a CSV file like this:

data_id iso event_id_cnty event_id_no_cnty event_date year time_precision event_type sub_event_type actor1
9722610 887 YEM78146 78146 6 January 2023 2023 1 Explosions/Remote violence Remote explosive/landmine/IED AQAP: Al Qaeda in the Arabian Peninsula
9722612 887 YEM78148 78148 6 January 2023 2023 1 Explosions/Remote violence Remote explosive/landmine/IED National Resistance Forces
9722613 887 YEM78149 78149 6 January 2023 2023 1 Battles Armed clash Military Forces of Yemen (2016-) Supreme Political Council
9722615 887 YEM78151 78151 6 January 2023 2023 1 Battles Armed clash Military Forces of Yemen (2016-) Supreme Political Council
....

My first problem is the format of the "event_date" field. As you can see, the date looks like "6 January 2023", but when I try to import it as a date field with Kibana (CSV upload), I get an error during the upload process: Kibana cannot parse this field when I set the type to date instead of keyword.

My second problem is that I would like to find a way to apply (once I have solved the date field issue) a template to this file each time I import it with Kibana. I don't know if (and how) it is possible to apply a template (with the right types) to the CSV upload process without manually changing the type of the date field.

UPDATE: I have solved the problem with the date format, so I now have an ingest pipeline in Kibana that transforms the event_date field correctly. However, one field (actor1) is recognized as a text field, and I would like to store it in my index as a keyword field. Is it possible to do that in the ingest pipeline?

Thx for your help.

In the File Upload interface you can set up both the ingest pipeline AND the mappings for your new index, so you can define the data types there.

I took your table and added the following mappings and pipeline:

{
  "properties": {
    "actor1": {
      "type": "keyword"
    },
    "data_id": {
      "type": "long"
    },
    "event_date": {
      "type": "date"
    },
    "event_id_cnty": {
      "type": "keyword"
    },
    "event_id_no_cnty": {
      "type": "long"
    },
    "event_type": {
      "type": "keyword"
    },
    "iso": {
      "type": "long"
    },
    "sub_event_type": {
      "type": "keyword"
    },
    "time_precision": {
      "type": "long"
    },
    "year": {
      "type": "long"
    }
  }
}


{
  "description": "Ingest pipeline created by text structure finder",
  "processors": [
    {
      "csv": {
        "field": "message",
        "target_fields": [
          "data_id",
          "iso",
          "event_id_cnty",
          "event_id_no_cnty",
          "event_date",
          "year",
          "time_precision",
          "event_type",
          "sub_event_type",
          "actor1"
        ],
        "ignore_missing": false
      }
    },
    {
      "convert": {
        "field": "data_id",
        "type": "long",
        "ignore_missing": true
      }
    },
    {
      "convert": {
        "field": "event_id_no_cnty",
        "type": "long",
        "ignore_missing": true
      }
    },
    {
      "convert": {
        "field": "iso",
        "type": "long",
        "ignore_missing": true
      }
    },
    {
      "convert": {
        "field": "time_precision",
        "type": "long",
        "ignore_missing": true
      }
    },
    {
      "convert": {
        "field": "year",
        "type": "long",
        "ignore_missing": true
      }
    },
    {
      "date": {
        "field": "event_date",
        "formats": ["d MMMM yyyy"],
        "timezone" : "Europe/Amsterdam",
        "target_field": "event_date"
      }
    },
    {
      "remove": {
        "field": "message"
      }
    }
  ]
}

Note the types for actor1 and event_date, and the date processor.

And the Discover screen after import:

Some notes about this (for you and whoever comes here in the future):

  • You can test the date processor before importing by using the simulate API:

POST /_ingest/pipeline/_simulate
{
  "pipeline" :
  {
    "description": "_description",
    "processors": [
      {
        "date": {
          "field": "event_date",
          "formats": ["d MMMM yyyy"],
          "timezone": "Europe/Amsterdam",
          "target_field": "event_date"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "event_date": "6 January 2023"
      }
    }
  ]
}
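
For reference, the simulate response should show the parsed field in ISO 8601 form, roughly like this (the exact offset and millisecond formatting may vary by Elasticsearch version; the `_ingest` contents are abbreviated here):

```json
{
  "docs": [
    {
      "doc": {
        "_index": "index",
        "_id": "id",
        "_source": {
          "event_date": "2023-01-06T00:00:00.000+01:00"
        },
        "_ingest": { "timestamp": "..." }
      }
    }
  ]
}
```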
  • Finally, if you want to run this repeatedly, it may be more convenient to store the pipeline, put the mapping inside an index template, and configure Filebeat to read your CSVs.
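
As a concrete sketch of that last point (the names acled-csv-pipeline and acled-events are placeholders, not anything from your setup): store the pipeline once with PUT _ingest/pipeline/acled-csv-pipeline using the processors above, then put the mapping into an index template whose settings point at the stored pipeline, so every matching index gets both automatically:

```json
PUT _index_template/acled-events
{
  "index_patterns": ["acled-events*"],
  "template": {
    "settings": {
      "index.default_pipeline": "acled-csv-pipeline"
    },
    "mappings": {
      "properties": {
        "event_date": { "type": "date" },
        "actor1": { "type": "keyword" }
      }
    }
  }
}
```

With index.default_pipeline set, any document indexed into a matching index runs through the pipeline without the client having to specify it.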

Hi Jorge,

Thx for your answer.
I have created all the files (pipeline, template) and, as you mentioned, I need to find a way to apply them automatically and repeatedly to my next uploads. If I understand correctly, it is not possible to automatically apply a pipeline and a template from the Kibana upload menu, so I am going to try your solution with Filebeat.
Thx so much.
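
In case it helps someone later, here is a minimal Filebeat sketch for this setup. It assumes the stored pipeline and index template suggested above exist; the paths, index name, and pipeline name are placeholders to adapt, and some setup options vary between Filebeat versions:

```yaml
filebeat.inputs:
  - type: filestream
    id: acled-csv
    paths:
      - /path/to/exports/*.csv
    # skip the CSV header line so it is not indexed as data
    exclude_lines: ['^data_id']

output.elasticsearch:
  hosts: ["http://localhost:9200"]
  index: "acled-events"
  # run each line through the stored ingest pipeline
  pipeline: "acled-csv-pipeline"

# the mapping comes from the index template, not from Filebeat's own template
setup.template.enabled: false
setup.ilm.enabled: false
```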