Put JSON into Elasticsearch?

Hello!

This topic may be a duplicate, but I couldn't find anything appropriate, so please bear with me on this one.
So anyway, how do I import a JSON file into Elasticsearch from the command line?
I was trying something along the lines of:

curl -X POST 'localhost:9200/sample_data/data/1?pretty' -H 'Content-type: application/json' --data-binary @sample.json

But I get a weird parsing error :x
"type" : "mapper_parsing_exception",
"reason" : "failed to parse",
"caused_by" : {
"type" : "json_parse_exception",
"reason" : "Unexpected character (',' (code 44)): expected a value\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@5a0f3cad; line: 6, column: 3]"
}

Anyway, here's the corresponding JSON, if it helps:
{
  "type":"train",
  "name":"T01",
  "state":"inactive",
  "time":"2019-12-20 08:48:12"
},
{
  "type":"train",
  "name":"T02",
  "state":"active",
  "time":"2019-12-20 08:48:12"
}
It's very simple, and the format looks good to me... So yeah, any help would be greatly appreciated :smiley:

EDIT: I also tried the following format (the previous one was actually incorrect, silly me; I was trying to copy an NDJSON example):
[
  {
    "type":"train",
    "name":"T01",
    "state":"inactive",
    "time":"2019-12-20 08:48:12"
  },
  {
    "type":"train",
    "name":"T02",
    "state":"active",
    "time":"2019-12-20 08:48:12"
  }
]
but to no avail:
"type" : "mapper_parsing_exception",
"reason" : "failed to parse",
"caused_by" : {
"type" : "not_x_content_exception",
"reason" : "Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes"
}

Cordially,
Mox

First, you need to use PUT instead of POST here, as you are providing the document id.

Then, this is not a valid JSON file, as it contains multiple documents rather than just one.
If you want to insert multiple documents at once, use the bulk API.
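
For reference, indexing a single document with an explicit id would then look something like this (a minimal sketch, assuming sample.json is trimmed down to exactly one JSON object):

curl -X PUT 'localhost:9200/sample_data/data/1?pretty' -H 'Content-Type: application/json' --data-binary @sample.json

with sample.json containing just:

{
  "type":"train",
  "name":"T01",
  "state":"inactive",
  "time":"2019-12-20 08:48:12"
}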

Oh, interesting.
Indeed, when removing the second half of the document and changing POST to PUT, it works.
Now if I want to upload the information for both trains, should I split it into two JSON files?
Or can I keep the first format, and if so, what should I change to make it work?
I'm guessing the request would be something like
curl -X PUT "localhost:9200/sample_data/_bulk?pretty" -H 'Content-Type: application/x-ndjson' --data-binary @sample.json

Unfortunately, I get an error message when I try this.
I tried removing the comma in the middle (since we're working with newline-delimited JSON), and got the following error:
"type" : "json_e_o_f_exception",
"reason" : "Unexpected end-of-input: expected close marker for Object (start marker at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@64b18771; line: 1, column: 1])\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@64b18771; line: 1, column: 3]"

Thanks again for any help you can bring :smiley:

You need to split it into two HTTP calls or use the bulk API.
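
For completeness, the two-calls option would look something like this (a sketch; train1.json and train2.json are hypothetical files, each holding one of the two train objects):

curl -X PUT 'localhost:9200/sample_data/data/1?pretty' -H 'Content-Type: application/json' --data-binary @train1.json
curl -X PUT 'localhost:9200/sample_data/data/2?pretty' -H 'Content-Type: application/json' --data-binary @train2.json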

Using the bulk API, I did indeed get it to work!

Here's the final document:
{ "index": {"_id":"1"}}
{ "type":"train", "name":"T01", "state":"inactive", "time":"2019-12-20 08:48:12" }
{ "index" : { "_id" : "2" } }
{ "type":"train", "name":"T02", "state":"active", "time":"2019-12-20 08:48:12" }

As well as the command used to upload the file:
curl -X PUT "localhost:9200/sample_data/_bulk?pretty" -H 'Content-Type: application/x-ndjson' --data-binary @sample.json

To any future readers: note the mandatory "index" action line before each document. That was the part I had trouble understanding, but after a thorough reading of the docs, it's actually explained pretty well ^^'
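
One more gotcha: the bulk request body must end with a trailing newline, or Elasticsearch will reject it. Once the upload succeeds, a quick sanity check could be (assuming the same local setup):

curl 'localhost:9200/sample_data/_search?pretty'

which should return both train documents in the hits.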

Thanks again for the help!

EDIT: After uploading the file to ES, you might want your timestamp to be registered as a date field. Since Elasticsearch doesn't recognize this string format automatically, I applied a mapping before the bulk PUT:
PUT /sample_data
{
  "mappings": {
    "properties": {
      "type":  {"type": "keyword"},
      "name":  {"type": "keyword"},
      "state": {"type": "keyword"},
      "time":  {"type": "date", "format": "yyyy-MM-dd HH:mm:ss"}
    }
  }
}
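
To double-check that the mapping took effect before indexing, you can retrieve it (same local setup assumed):

curl 'localhost:9200/sample_data/_mapping?pretty'

The time field should show up with type date and the custom format.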

