Problem with bulk inserting json via python

mariskaas · July 10, 2018, 8:35am

I have been trying to bulk insert a JSON file into Elasticsearch via Python (I'm very new to Elastic). I had to transform the data a little before putting it into Elastic. In the end I write my file out as ndjson and try to bulk insert it using the following code:
with open("/Users/mariska/Documents/jsontestje14.json") as json_file:
    body=json_file.read()

helpers.bulk(es, actions=body, index='jsononfagun6', doc_type='kenteken')

Which yields the error:
Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes

I've tried numerous things to change the format of the file so that Elastic will accept it, but without success. It currently looks like this (just an example, because the real file has many more lines):

{
"Kenteken": "WSFT54",
"Voertuigsoort": "Aanhangwagen",
"Merk": "GS",
"Handelsbenaming": "AC-2000 AC",
"Vervaldatum APK": "19/10/2018",
"Datum tenaamstelling": "19/09/2005",
"Bruto BPM": "nan",
"Inrichting": "open laadvloer",
"Aantal zitplaatsen": "nan",
"Eerste kleur": "N.v.t.",
"Tweede kleur": "N.v.t.",
"Aantal cilinders": "nan",
"Cilinderinhoud": "nan",
"Massa ledig voertuig": "5580.0",
"Toegestane maximum massa voertuig": "20000.0",
"Massa rijklaar": "nan",
"Maximum massa trekken ongeremd": "nan",
"Maximum trekken massa geremd": "nan",
"Retrofit roetfilter": "nan",
"Zuinigheidslabel": "nan",
"Datum eerste toelating": "19/09/2005",
"Datum eerste afgifte Nederland": "19/09/2005",
"Wacht op keuren": "Geen verstrekking in Open Data",
"Catalogusprijs": "nan",
"WAM verzekerd": "N.v.t.",
"Maximale constructiesnelheid (brom/snorfiets)": "nan"
}

There are several of these, all separated by newlines. It seems to parse every individual letter of every string separately, but I can't figure out the problem. Hopefully someone can help!

danielmitterdorfer · July 13, 2018, 12:32pm

Hi,

according to the docs, the actions parameter has to be an iterable of actions. As you read the file now, body is of type str, which is itself an iterable of single characters, so each character is treated as a separate action. That is the reason for the error you get.
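
To make the difference concrete, here is a small sketch that does not touch Elasticsearch at all. The string below is only a stand-in for the file contents, and the second licence plate is made up for illustration:

```python
# Stand-in for the contents of an ndjson file: one JSON document per line.
# "AB12CD" is a made-up value, used only for illustration.
body = '{"Kenteken": "WSFT54"}\n{"Kenteken": "AB12CD"}\n'

# Iterating over a str yields single characters, so a consumer that
# expects an iterable of actions sees one "action" per character.
as_characters = list(body)

# Splitting into lines yields one item per document instead.
as_lines = body.splitlines()

print(as_characters[:3])
print(len(as_lines))
```

The first print shows individual characters, the second shows that there are only as many items as there are documents.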

Try reading the file line by line with:

with open("/Users/mariska/Documents/jsontestje14.json") as json_file:
    body=json_file.readlines()

Then it should work fine.

Note that this reads the whole file into memory at once, which may or may not be what you want. Alternatively, you can iterate over the lines of the file, collect the lines into a list, and send the bulk requests yourself without using the helper.
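
A rough sketch of that alternative, assuming every non-empty line in the file is one complete JSON document. The function name iter_bulk_bodies and the chunk size of 500 are my own choices for illustration, not something from the client library:

```python
from itertools import islice


def iter_bulk_bodies(lines, chunk_size=500):
    """Yield bulk request bodies covering up to chunk_size documents each.

    Assumes every non-empty line is one complete JSON document.
    """
    # Lazily drop blank lines and strip trailing newlines.
    docs = (line.strip() for line in lines if line.strip())
    while True:
        chunk = list(islice(docs, chunk_size))
        if not chunk:
            return
        # The bulk format pairs each document with an action line and
        # needs a newline after every line, including the last one.
        yield "".join('{"index":{}}\n' + doc + "\n" for doc in chunk)


# Possible usage (not run here; depending on your client version the
# keyword arguments of es.bulk may differ):
#
# with open("/Users/mariska/Documents/jsontestje14.json") as json_file:
#     for body in iter_bulk_bodies(json_file):
#         es.bulk(body=body, index='jsononfagun6', doc_type='kenteken')
```

Because the file object is consumed line by line, only one chunk of documents is held in memory at a time.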

Daniel

system · August 10, 2018, 12:32pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
