Empty field filled with _source

Hi,

I'm having some trouble importing data that I exported as CSV, using Elasticsearch/Kibana 7.8.0.
The data looks like this:

timestamp,AL_code,AL_type,type,target,sgbIndex,rack_validBatteryCount,rack_rackId,rack_invalidBatteryCount,priority,platformId,nbPlayersWithProblems,nbPlayersConnected,criticalityLevel,domain,flightTime,location,messageAL_code,isLandSuccess,messageAL_type,messageDomain,messageEvent,isFlightSuccess,evt,host,category,nbFreePlayerSlot,Name,messageType,droneName,message
2020-07-22T12:08:26,,,,,,,,,,DEV-ARC-001,0,4,NONE,,,Park,,,,,START_GAME,,0,LauncherServer,ATTRACTION,0,session,,,
2020-07-22T11:52:31,0,9,,,,,,,,DEV-ARC-001,,,NONE,,,Park,OK,,CORE_LAUNCH_CLIENT_APP,,,,,desktop-pc1,PCJ,,,,,Game started

As you can see, the message field is sometimes empty.
But when I import the data through Machine Learning > Import data, this same message field behaves oddly:

  • If there was a message (as in the 2nd row), the field is correct and shows the message (here "Game started")
  • If there was no message (as in the 1st row), the message field looks like it is filled with _source and becomes 2020-07-22T12:08:26,,,,,,,,,,DEV-ARC-001,0,4,NONE,,,Park,,,,,START_GAME,,0,LauncherServer,ATTRACTION,0,session,,,

I first thought it was due to one of my pipelines that does some work on the fields, but even after deleting the pipeline I got the same result.

It's because Kibana passes each row of the CSV file to the ingest pipeline in Elasticsearch in a field called message, and the ingest pipeline then parses that field. By default, the CSV ingest processor does not add fields whose values were empty in the CSV. Hence, where the message column in your CSV is not empty, the parsed value overwrites the original message field, but where it's empty, the original message field containing the whole row is left in the final document.
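
To make that concrete, here is a small pure-Python simulation of the behaviour described above. It is only a sketch, not Elasticsearch code: `csv_processor` is a hypothetical stand-in that mimics the "skip empty values" default, and the three-column list is shortened from the real header for readability.

```python
import csv
import io


def csv_processor(doc, field, target_fields):
    """Hypothetical stand-in for the CSV ingest processor.

    Parses doc[field] as one CSV row and writes each non-empty value
    to the matching target field. Empty values are skipped, which
    mirrors the default behaviour described in the answer.
    """
    values = next(csv.reader(io.StringIO(doc[field])))
    for name, value in zip(target_fields, values):
        if value != "":
            doc[name] = value
    return doc


# Shortened column list (assumption: the real file has many more columns).
columns = ["timestamp", "platformId", "message"]

# Kibana sends the whole raw row in the "message" field.
with_message = csv_processor(
    {"message": "2020-07-22T11:52:31,DEV-ARC-001,Game started"},
    "message",
    columns,
)
without_message = csv_processor(
    {"message": "2020-07-22T12:08:26,DEV-ARC-001,"},
    "message",
    columns,
)

print(with_message["message"])     # parsed column value replaced the raw row
print(without_message["message"])  # raw row is still there
```

In the second document nothing ever writes to message, so the raw row survives, which is what you are seeing.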

A workaround would be to change the ingest pipeline on the "Advanced" tab when importing your file:

  • Before the CSV processor insert a rename processor that renames message to original_message
  • Change the field of the CSV processor to original_message
  • Add a remove processor after the CSV processor that removes the original_message field
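
The three steps above could look roughly like the following. This is a sketch that only builds and prints the pipeline JSON to paste into the "Advanced" tab. The column names are copied from the CSV header in the question, the description text is made up, and any other processors Kibana generates for your file (date parsing and so on) are left out here and would need to be kept.

```python
import json

# Column names copied from the CSV header in the question.
columns = (
    "timestamp,AL_code,AL_type,type,target,sgbIndex,"
    "rack_validBatteryCount,rack_rackId,rack_invalidBatteryCount,"
    "priority,platformId,nbPlayersWithProblems,nbPlayersConnected,"
    "criticalityLevel,domain,flightTime,location,messageAL_code,"
    "isLandSuccess,messageAL_type,messageDomain,messageEvent,"
    "isFlightSuccess,evt,host,category,nbFreePlayerSlot,Name,"
    "messageType,droneName,message"
).split(",")

pipeline = {
    # Hypothetical description, not generated by Kibana.
    "description": "CSV import that keeps empty message columns empty",
    "processors": [
        # 1. Move the raw row out of the way before parsing.
        {"rename": {"field": "message", "target_field": "original_message"}},
        # 2. Parse the raw row from its new location.
        {"csv": {"field": "original_message", "target_fields": columns}},
        # 3. Drop the raw row once the columns have been extracted.
        {"remove": {"field": "original_message"}},
    ],
}

print(json.dumps(pipeline, indent=2))
```

With this order, a row with an empty message column ends up with no message field at all, instead of one containing the whole row.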

That works indeed!
Instead, I renamed the "message" column to "data" in the override settings and added a rename processor ("data" > "message") after the processor that removes "message". This avoids breaking our visualisations :)
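
For anyone who wants the same variant, a rough sketch of the processor order is below. The column list is shortened for readability, and `ignore_missing` on the final rename is an assumption on my side: when the column was empty there is no "data" field to rename, and the rename processor would otherwise fail on those documents.

```python
import json

# Shortened column list (assumption); "message" was renamed to "data"
# in the override settings, so the CSV processor writes to "data".
columns = ["timestamp", "platformId", "data"]

pipeline = {
    "processors": [
        # Parse the raw row that Kibana sends in "message".
        {"csv": {"field": "message", "target_fields": columns}},
        # Remove the raw row.
        {"remove": {"field": "message"}},
        # Put the parsed column back under the name the visualisations use.
        # ignore_missing is an assumption: "data" is absent for empty columns.
        {
            "rename": {
                "field": "data",
                "target_field": "message",
                "ignore_missing": True,
            }
        },
    ]
}

print(json.dumps(pipeline, indent=2))
```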
