New here, and new to the ELK stack! My colleague and I have spent days trying to figure out a solution to our problem and have not yet come up with one.
Background: Our company has tasked us with ingesting gigabytes' worth of .CSV files to be used with Elastic / Kibana.
Problem: The input is a .CSV file. The first three columns parse fine, but the fourth column (and it is always the fourth) contains JSON data that is not properly escaped. Using the ',' delimiter obviously breaks the JSON column into multiple fields, anywhere from 4 to 48, depending on the number of commas in the JSON data.
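For illustration, a row looks roughly like this (hypothetical values; our real columns differ):

```
2024-01-15,host-01,INFO,{"event": "login", "user": "jdoe", "roles": ["admin", "ops"]}
```

Splitting that on ',' shatters the single JSON column into several fragments.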
We have so far looked into using a space as the delimiter, but this did not work either, as there are spaces in the JSON data as well.
Does anyone know how we can parse the JSON data and prevent the commas and illegal quotation marks from failing the parse or creating extra fields?
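To make the question concrete, here is the kind of pipeline we are picturing, assuming Logstash's dissect and json filters can be combined this way (field names are placeholders for our real columns):

```
filter {
  # dissect splits on the literal commas between the first three
  # columns; the final %{json_payload} field captures the rest of
  # the line, commas and all, as a single string.
  dissect {
    mapping => {
      "message" => "%{col1},%{col2},%{col3},%{json_payload}"
    }
  }

  # Parse the captured string as JSON. Rows whose JSON is still
  # malformed (the unescaped quotes) should get tagged with
  # _jsonparsefailure rather than being split into extra fields.
  json {
    source => "json_payload"
    target => "payload"
  }
}
```

Even if something like this works for the commas, the improperly escaped quotes remain: would a mutate + gsub cleanup before the json filter be the right approach, or is there a cleaner way?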