Elasticsearch.py dump conversion

Koen_Wouters1 · July 13, 2020, 9:52am

Until now we have used simple shell scripts to bulk-load JSON files into ES, but we want to switch to the Python client. However, I am struggling to figure out how to do this without completely rewriting our JSON creation workflow.

Our format is very straightforward: one line with the index action, followed by one line with the content, as in the Shakespeare demo used by ES (snippet below).

{"index":{"_index":"shakespeare","_id":0}}
{"type":"act","line_id":1,"play_name":"Henry IV", "speech_number":"","line_number":"","speaker":"","text_entry":"ACT I"}

So my question is how I can use this format (multiline JSON) with the Elasticsearch Python package. Every example I can find uses non-multiline input, generates IDs on the fly, and so on, but I would prefer to keep the format we already have.

Or is this simply not feasible, and do I need to iterate through each line to reconstruct it?
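For what it's worth, iterating through the lines does not have to mean changing how the files are produced. Below is a minimal sketch, assuming the file strictly alternates an action line and a content line as in the snippet above, and that only `index` and `create` actions occur. The function name `iter_actions` is made up for illustration. It pairs the lines up and yields dicts in the shape that `elasticsearch.helpers.bulk` accepts as actions (`_op_type`, `_index`, `_id`, `_source`).

```python
import json
from typing import Iterable, Iterator


def iter_actions(lines: Iterable[str]) -> Iterator[dict]:
    """Pair each action line with the content line that follows it.

    Hypothetical helper: assumes only "index" and "create" actions,
    each followed by exactly one content line.
    """
    # Skip blank lines so a trailing newline does not break the pairing.
    it = (line for line in lines if line.strip())
    for header_line in it:
        header = json.loads(header_line)
        if len(header) != 1:
            raise ValueError(f"expected one action per line, got: {header_line!r}")
        # The action line has a single key, e.g. {"index": {...metadata...}}.
        (op_type, meta), = header.items()
        if op_type not in ("index", "create"):
            raise ValueError(f"unsupported action in this sketch: {op_type!r}")
        source_line = next(it, None)
        if source_line is None:
            raise ValueError("action line without a following content line")
        # Metadata (_index, _id, ...) is copied as-is; the content becomes _source.
        yield {"_op_type": op_type, **meta, "_source": json.loads(source_line)}
```

Assuming a client object `es` and a file called `shakespeare.json` (both placeholders), the generator could then be passed along as `helpers.bulk(es, iter_actions(f))` with `f` being the opened file, so the file is streamed rather than loaded into memory at once.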

