Bulk import does nothing


(Thebuleon29) #1

Hi,
I am trying to import an NDJSON file into Elasticsearch, but it does absolutely nothing. Here is a sample of the file:

{ "create" : { "_index" : "benchmark", "_type" : "doc", "_id" : "1" }}
{ "timestamp" : 1516367192636718750, "name" : "SREMTWTAA", "spid" : "10350", "alert" : "0.0", "curve" : "TF9", "engValidity" : "1.0", "engValue" : "OFF", "id" : "4.0", "limits" : "-", "monState" : "OK", "rawValidity" : "1.0", "sample" : "1.0", "bitOffset" : 205.0, "bitSize" : 1.0, "pk" : 102.0, "rawValue" : 0, "reveivedTime" : 1.516367195445e+18 }

The entire file is 3357174 lines long, but the command runs in 1 second and does absolutely nothing. I use the command :

curl -s -H "Content-Type: application/x-ndjson" -XPOST 'localhost:9200/_bulk' --data-binary @file.ndjson

When I don't put the '@' it says "The bulk request must be terminated by a newline [\n]" but I don't think it is related to my problem as my file ends with an empty line and a '\n' character.

I also tried to use the examples from https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html and it works.

{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }

I don't see the difference between this file and mine... Please help :cry:


(David Turner) #2

Can you clarify what "does nothing" means? Do you get a response or an error message? Do you see any messages in the log file? Can you share more information about what you're seeing?


(Thebuleon29) #3

By nothing I mean no output : the command just stops without printing anything.

I am running Elasticsearch in a Docker container, and I have no idea where to find the log file.


(Christian Dahlqvist) #4

What is the size of the complete file? The bulk interface is designed to let you send multiple (not necessarily all) documents in a single request. It is typically recommended to keep each bulk request below 5MB.
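As a rough illustration of that advice, here is a minimal Python sketch that splits an NDJSON file into request bodies of at most about 5MB and posts them one at a time. The helper names `bulk_chunks` and `send_file` are made up for this example, and it assumes every action line is followed by exactly one source line (true for "create" and "index", not for "delete"). The URL and content type are taken from the curl command in the first post.

```python
import json
import urllib.request
from typing import Iterable, Iterator

MAX_BYTES = 5 * 1024 * 1024  # the ~5MB guideline, not a hard limit


def bulk_chunks(lines: Iterable[bytes], max_bytes: int = MAX_BYTES) -> Iterator[bytes]:
    """Yield bulk request bodies built from whole action/source pairs.

    Assumption: each action line is followed by exactly one source line.
    """
    buf = []
    size = 0
    it = (ln for ln in lines if ln.strip())  # ignore blank lines
    for action in it:
        source = next(it, None)
        if source is None:
            raise ValueError("action line without a source line")
        # Normalise line endings so every body ends with a newline.
        pair = action.rstrip(b"\r\n") + b"\n" + source.rstrip(b"\r\n") + b"\n"
        if buf and size + len(pair) > max_bytes:
            yield b"".join(buf)
            buf, size = [], 0
        buf.append(pair)
        size += len(pair)
    if buf:
        yield b"".join(buf)


def send_file(path: str, url: str = "http://localhost:9200/_bulk") -> None:
    """Post each chunk to the bulk endpoint and report per-request errors."""
    with open(path, "rb") as f:
        for n, body in enumerate(bulk_chunks(f), start=1):
            req = urllib.request.Request(
                url,
                data=body,
                headers={"Content-Type": "application/x-ndjson"},
                method="POST",
            )
            with urllib.request.urlopen(req) as resp:
                result = json.load(resp)
            # The bulk response carries a top-level "errors" flag.
            print(f"request {n}: {len(body)} bytes, errors={result.get('errors')}")
```

Calling `send_file("file.ndjson")` would then send the file as a series of small requests instead of one 708 MB request. Printing the `errors` flag per request also gives some visible output, which the `curl -s` invocation does not.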


(Thebuleon29) #5

The file size is 708 MB. I tried with a shorter version (the first 10 lines) and it works! Thanks a lot!


(David Turner) #6

It's probably a good idea to work out where the logs are before you need them again. The log is the first place to look when things go wrong. By default Elasticsearch logs most things to stdout, and I think the default for the Docker image is to also log to /usr/share/elasticsearch/logs, which you can bind-mount to somewhere persistent.