Elasticsearch.py dump conversion

Until now we have used simple sh scripts to bulk-load JSON files into ES, but we want to switch to the Python client. However, I am struggling to figure out how to do this without completely rewriting our JSON creation workflow.

Our format is very straightforward: one line with the index action, followed by one line with the document content, as in the Shakespeare demo used by ES (snippet below).

{"index":{"_index":"shakespeare","_id":0}}
{"type":"act","line_id":1,"play_name":"Henry IV", "speech_number":"","line_number":"","speaker":"","text_entry":"ACT I"}

So my question is how I can use this format (multiline JSON) with the Elastic Python package. Every example I can find either uses a non-multiline format or generates the IDs on the fly, but I would prefer to keep the format we already have.

Or is this simply not feasible, and do I need to iterate through each line to reconstruct it?
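
For reference, a minimal sketch of what the line-by-line approach could look like, assuming the `elasticsearch` package is installed and every action in the dump is an `index` or `create` action followed by exactly one document line. The names `ndjson_to_actions` and `load_dump` are made up for illustration; `helpers.bulk` is the client's bulk helper, which accepts an iterable of action dicts.

```python
import json
from typing import Iterable, Iterator


def ndjson_to_actions(lines: Iterable[str]) -> Iterator[dict]:
    """Pair each action line with the document line that follows it.

    Hypothetical helper: assumes only "index"/"create" actions, each
    followed by exactly one document line.
    """
    # Strip whitespace and skip blank lines so trailing newlines are harmless.
    stripped = (line.strip() for line in lines)
    non_empty = (line for line in stripped if line)

    for action_line in non_empty:
        action = json.loads(action_line)
        # The action line has a single key, e.g. {"index": {...metadata...}}.
        op_type, meta = next(iter(action.items()))
        if op_type not in ("index", "create"):
            raise ValueError(f"unsupported action in this sketch: {op_type}")

        source_line = next(non_empty, None)
        if source_line is None:
            raise ValueError("action line without a following document line")

        # Keep the existing _index and _id; the document goes under _source.
        yield {"_op_type": op_type, **meta, "_source": json.loads(source_line)}


def load_dump(client, path: str):
    """Stream an existing bulk file through the Python client's bulk helper."""
    # Imported here so the parser above can be used without the package.
    from elasticsearch import helpers

    with open(path, encoding="utf-8") as handle:
        return helpers.bulk(client, ndjson_to_actions(handle))


# Example call (URL and file name are placeholders):
#   from elasticsearch import Elasticsearch
#   load_dump(Elasticsearch("http://localhost:9200"), "shakespeare.json")
```

Because the actions are produced by a generator, the file is read as it is sent rather than loaded into memory, and the existing JSON creation workflow stays untouched.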
