Error trying tutorial for csv parsing for ingest pipeline


I am trying this tutorial

I am doing everything as suggested: I removed the header line and converted the double quotes to single quotes in the CSV file using sed on my RHEL box.
When I try to run the curl command to send the file over to Elasticsearch, I get the following error.
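For reference, the two sed steps described above can be sketched like this (on a tiny sample file rather than the real NYC dataset; the commands are the point):

```shell
# Build a small sample CSV with a header line and double quotes
cat > sample.csv <<'EOF'
Division,Line,Station
BMT,4 Avenue,"25th St"
EOF

sed -i '1d' sample.csv          # remove the header line
sed -i "s/\"/'/g" sample.csv    # convert double quotes to single quotes

cat sample.csv
```

Note that `sed -i` with no suffix is GNU sed syntax, which is what RHEL ships.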

{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse"}],"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"json_parse_exception","reason":"Illegal unquoted character ((CTRL-CHAR, code 13)): has to be escaped using backslash to be included in string value\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@eb7190; line: 1, column: 195]"}},"status":400}

Not really sure what I am doing wrong. Can you help me here please?

@gmoskovicz as you wrote the blog post, do you have an idea?

@dadoonet thanks for the ping!

@abd.wsu where exactly is it failing? In which part of the blog post?

Illegal unquoted character ((CTRL-CHAR, code 13)) means that there is an unquoted control character in the payload. Are you trying out the same dataset from that CSV file, or do you have a custom CSV file?
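Worth noting: character code 13 is a carriage return (`\r`), so a file saved with Windows-style CRLF line endings would trigger exactly this error. Whether that is the cause here is only a guess, but stripping the carriage returns is cheap:

```shell
# Simulate a CSV line saved with Windows (CRLF) line endings
printf 'BMT,4 Avenue,25th St\r\n' > crlf.csv

# Delete every carriage return (character code 13)
tr -d '\r' < crlf.csv > unix.csv

cat unix.csv
```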

Hi Gabriel @gmoskovicz

Thanks for the reply. I am trying with the same CSV. I removed the first line and changed the double quotes to single quotes in the file. Then I run the while loop from my Linux box to feed the data to my Elastic Cloud deployment, using basically the same commands from the blog post except for the Elastic Cloud details.
I gave it a shot without removing the double quotes, and also after removing all the double and single quotes, and it still didn't work.
The error mentions an issue at line 1, column 195, but there is no column 195 on line 1: no space, no special characters. The second row does start on a new line; not sure if that matters.

@abd.wsu maybe you left the first line empty? You need to remove the entire line, so that what was the second line becomes the first line of the file. Maybe you left a blank line?

Here's the file i modified.

$ more NYC_Transit_Subway_Entrance_And_Exit_Data2.csv
BMT,4 Avenue,25th St,40.660397,-73.998091,R,,,,,,,,,,,Stair,YES,,YES,FULL,,FALSE,,FALSE,4th Ave,25th St,SE,40.660323,-73.997952,'(40.660397, -73.998091)','(40.660323, -73.997952)'
BMT,4 Avenue,25th St,40.660397,-73.998091,R,,,,,,,,,,,Stair,YES,,YES,NONE,,FALSE,,FALSE,4th Ave,25th St,SW,40.660489,-73.99822,'(40.660397, -73.998091)','(40.660489, -73.998220)'

And so on. The first line (the header) is deleted, so the data now starts on line one. On my server each record is a single line.


What does it give you if you do:

while read f1
do echo "$f1"
done < NYC_Transit_Subway_Entrance_And_Exit_Data2.csv

Also, what is the command that you are using to send the data? Are you using -d "{ \"station\": \"$f1\" }"? Because it needs to send JSON. The line sent to Elasticsearch is:

{ \"station\": \"$f1\" }

Where $f1 is the line from your CSV file. So if you substitute the CSV line for $f1, you will see column 195.
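As an aside, interpolating $f1 straight into the JSON body breaks as soon as the line contains a quote or a control character. A safer sketch (using python3 for the escaping, which is my addition, not what the blog post does):

```shell
# A CSV line that would break naive interpolation (it contains double quotes)
f1='BMT,4 Avenue,"25th St"'

# json.dumps escapes quotes and control characters for us
body=$(python3 -c 'import json, sys; print(json.dumps({"station": sys.argv[1]}))' "$f1")

echo "$body"
# prints {"station": "BMT,4 Avenue,\"25th St\""}
```

The curl call then becomes -d "$body" instead of inline escaping.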

while read f1; do curl -XPOST '' -H "Content-Type: application/json" -u elastic:xxxxxxxx -d "{ \"station\": \"$f1\" }"; done < NYC_Transit_Subway_Entrance_And_Exit_Data2.csv

That's the command I am using. I did try something just now: I took the first line from the modified file, put it in a different file, and ran the same command against that file (so only the single data line), and it worked.

It could be that the file was saved with a different encoding. Try to rebuild the file and start from scratch. It should work if the file is the same and each line is similar to the one that you just tried out.

I simply copied multiple lines to this new file and it works fine. Something must be missing or wrong in those 1800 lines of the original file. I am able to proceed now. Thanks Gabriel.

That's great! Glad to see it is working.

@gmoskovicz, sorry, I ran into another issue while creating the index template. I am putting the template in Dev Tools in Kibana; not sure if that's where I am supposed to put it. But it's not creating the geo_point type for location, and hence my visualization is erroring out. Am I doing this wrong? Sorry if I sound like a rube.

Hi @abd.wsu,

The template should be added prior to ingestion. So you need to delete the index, then create the pipeline, then create the template, and then index the data.
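That order might look roughly like this with curl. The host, credentials, and pipeline/template names below are placeholders, and the JSON bodies (pipeline.json, template.json) are the ones from the blog post, elided here:

```shell
ES='https://localhost:9200'

# 1. Delete the old index so the template applies to a freshly created one
curl -XDELETE "$ES/subway_info_v1" -u elastic:xxxxxxxx

# 2. Create the ingest pipeline
curl -XPUT "$ES/_ingest/pipeline/parse_subway" -u elastic:xxxxxxxx \
     -H 'Content-Type: application/json' -d @pipeline.json

# 3. Create the index template
curl -XPUT "$ES/_template/subway_template" -u elastic:xxxxxxxx \
     -H 'Content-Type: application/json' -d @template.json

# 4. Re-run the while loop to index the data
```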

Is that what you did?



Yes, that's what I did.
I tried doing the same using curl from my RHEL server, and it looks like it still didn't create the index template. I do get an {"acknowledged":true} response when I do it.

This is the error I get from my visualization:
No Compatible Fields: The "subway_info_v1" index pattern does not contain any of the following field types: geo_point

Let me clean up everything and start again.
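For reference, the piece that gives location the geo_point type is a mapping in the index template along these lines. This is a minimal illustrative sketch: the template name and index pattern are assumptions, the exact template syntax depends on the Elasticsearch version (older 5.x/6.x clusters need the mapping nested under a type name), and the real template is the one from the blog post:

```shell
# Illustrative only; names and exact syntax are assumptions, not the blog's template.
curl -XPUT "https://localhost:9200/_template/subway_template" -u elastic:xxxxxxxx \
     -H 'Content-Type: application/json' -d '
{
  "index_patterns": ["subway_info*"],
  "mappings": {
    "properties": {
      "location": { "type": "geo_point" }
    }
  }
}'
```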

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.