JSON import error in Elasticsearch with curl command

Hi, I have a problem loading a JSON file into Elasticsearch with a curl command.
The JSON file is here:
https://drive.google.com/file/d/13nCXdIY1n096SSWcL36TEtqkGTVhn28o/view?usp=sharing

When I launch this command:
curl -H 'Content-Type:application/json' -XPOST "localhost:9200/pacchetti3/doc/_bulk?pretty" --data-binary @C:\Users\Thebe\Desktop\singolopacchetto.json

I get this error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "json_e_o_f_exception",
        "reason" : "Unexpected end-of-input: expected close marker for Object (start marker at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3c861e6a; line: 1, column: 1])\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3c861e6a; line: 2, column: 3]"
      }
    ],
    "type" : "json_e_o_f_exception",
    "reason" : "Unexpected end-of-input: expected close marker for Object (start marker at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3c861e6a; line: 1, column: 1])\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@3c861e6a; line: 2, column: 3]"
  },
  "status" : 500
}

I checked whether the JSON is well formed using https://jsonformatter.curiousconcept.com/ and it validates.

The version of Elasticsearch and Kibana I'm using is 5.6.9.
What is the problem, and how can I solve it?

As you are using the bulk API, have you formatted the file according to the requirements of this API?

Yes, I tried to format it according to the bulk API (at least I think so). The result is this:

{"index":{"_index":"pacchetti3"}}
{
  "_type": "pcap_file",
  "_score": null,
  "_source": {
    "layers": {
      "frame": {
        "frame.interface_id": "0",
        "frame.interface_id_tree": {
          "frame.interface_name": "any"
        },
        "frame.encap_type": "25",
             ....
             ....
        }
      }
    }
  }
}

This time the error is:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Malformed action/metadata line [3], expected START_OBJECT but found [VALUE_STRING]"
  },
  "status" : 400
}

What am I doing wrong?

Each header and document must be on a single line and the document should not contain _type, _score or _source fields. You may also run into problems as you have dots in your field names.
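If it helps, here is a minimal Python sketch of that conversion (an illustration only, not tested against your data: it assumes the export is a JSON array of packet objects shaped like the one you posted, and both file names are placeholders):

import json

# Load the Wireshark-style export (file name taken from the first post).
with open("singolopacchetto.json", encoding="utf-8") as f:
    packets = json.load(f)  # assumed: a JSON array of packet objects
if isinstance(packets, dict):
    packets = [packets]  # tolerate a single top-level object

with open("bulk.json", "w", encoding="utf-8") as out:
    for packet in packets:
        # Keep only the payload; drop the _type/_score/_source wrapper.
        doc = packet.get("_source", packet)
        out.write('{"index":{"_index":"pacchetti3"}}\n')
        # json.dumps emits a single line, as the bulk API expects.
        out.write(json.dumps(doc) + "\n")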

I followed the advice and formatted the JSON file as follows:
{"index":{"_index":"pacchetti3"}} {"layers":{"frame":{"frame.interface_id":"0","frame.interface_id_tree":{ ... }}}}

However the problem persists with a new error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "action_request_validation_exception",
        "reason" : "Validation Failed: 1: no requests added;"
      }
    ],
    "type" : "action_request_validation_exception",
    "reason" : "Validation Failed: 1: no requests added;"
  },
  "status" : 400
}

That does not seem to be the format specified in the documentation. The file should look something like this, with a newline after each line:

{ "index" : { "_index" : "pacchetti3", "_type" : "doc" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "pacchetti3", "_type" : "doc" } }
{ "field1" : "value2" }

Thanks!
I did a test by formatting only a couple of lines of my JSON:
{"index":{"_index":"pacchetti4", "_type": "doc"}}
{"frame.interface_id": "0", "frame.interface_name": "any", "frame.encap_type": "25", "frame.time": "Apr 20, 2018 15:30:52.669797277 ora legale Europa occidentale", "frame.number": "1", "frame.len": "649", "frame.cap_len": "649", "frame.marked": "0", "frame.ignored": "0", "frame.protocols": "sll:ethertype:ip:tcp:http:json", "frame.coloring_rule.name": "HTTP", "frame.coloring_rule.string": "http || tcp.port == 80 || http2"}
{"index":{"_index":"pacchetti4", "_type": "doc"}}
{"sll.pkttype": "0", "sll.hatype": "772", "sll.halen": "6"}
{"index":{"_index":"pacchetti4", "_type": "doc"}}
{"ip.version": "4", "ip.hdr_len": "20", "ip.dsfield": "0x00000000", "ip.dsfield.dscp": "0", "ip.dsfield.ecn": "0", "ip.len": "633", "ip.id": "0x0000b60a", "ip.flags": "0x00000002", "ip.frag_offset": "0", "ip.ttl": "64", "ip.proto": "6", "ip.checksum": "0x00008472", "ip.checksum.status": "2", "ip.src": "127.0.0.1", "ip.addr": "127.0.0.1", "ip.src_host": "127.0.0.1", "ip.host": "127.0.0.1", "ip.dst": "127.0.0.1", "ip.dst_host": "127.0.0.1", "Source GeoIP: Unknown": "", "Destination GeoIP: Unknown": ""}

It finally loaded. Now that this problem is solved, I would like to know whether there is a simple way to quickly format a much larger JSON file (at the beginning of the topic I only showed one packet, but in reality there are about 400,000).

How can I do this?

If you have the data formatted as a JSON object per line, you can use Logstash or one of the language clients to script the ingestion. You generally want to limit the size of each bulk request to around 5 MB, and then send multiple requests to Elasticsearch in parallel.
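As a rough sketch of the scripted route (assumptions: the export is one big JSON array, the index and type names follow the test above, the file name is a placeholder, and it uses the third-party requests library):

import json
import requests  # third-party HTTP client: pip install requests

ES_URL = "http://localhost:9200/_bulk"  # local node, as in this thread
MAX_BYTES = 5 * 1024 * 1024             # keep each bulk body around 5 MB

def send(lines):
    # One bulk request; note that an HTTP 200 response can still carry
    # per-item failures, so check resp.json()["errors"] in real use.
    resp = requests.post(ES_URL, data="".join(lines).encode("utf-8"),
                         headers={"Content-Type": "application/json"})
    resp.raise_for_status()

with open("packets.json", encoding="utf-8") as f:  # placeholder file name
    packets = json.load(f)  # assumed: a JSON array of ~400,000 packets

buf, size = [], 0
for packet in packets:
    doc = packet.get("_source", packet)  # unwrap, as discussed above
    pair = ('{"index":{"_index":"pacchetti4","_type":"doc"}}\n'
            + json.dumps(doc) + "\n")
    buf.append(pair)
    size += len(pair)  # rough byte count (exact only for ASCII)
    if size >= MAX_BYTES:
        send(buf)
        buf, size = [], 0
if buf:
    send(buf)  # flush the final partial batch

From there you could parallelize the send() calls with a thread pool, but the sequential version is the simplest starting point.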
