I need to take specific Elasticsearch (ES) data from one instance (permanently available) and, via an automated process, move that data to another ES instance running in a local Docker container.
At the time the data is selected, the local Docker instance isn't running.
So I need to be able to easily save the required Elasticsearch data from instance A (which is always up) and load it into instance B (the Docker container) when the container is built.
I can do this by using the Elasticsearch Head Chrome plugin to run a query and export the result as JSON. I then have a process that loads the JSON when the Docker container is started.
I'm running into a formatting problem, however: if I perform a curl bulk load of the data, each "row" of data must be converted from the JSON output by the Elasticsearch Head plugin into the bulk format, where each action and each document sits on its own single line.
If I load the data via curl without the bulk option, I still have to strip the response metadata (took, timed_out, _shards, the hits wrapper, etc.) out of the JSON file.
For a small amount of data this isn't too bad, but manually modifying the JSON file is too time-consuming when the data set is large.
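To make the conversion concrete, here is a minimal Python sketch of what I'm currently doing by hand. It assumes the exported file is a raw search response with the documents under hits.hits. The function name hits_to_bulk is just something I made up for illustration. Carrying over _type when it is present is also an assumption, since whether it is needed depends on the Elasticsearch version.

```python
import json


def hits_to_bulk(response: dict) -> str:
    """Turn a search response body into bulk-format text (one JSON object per line)."""
    lines = []
    for hit in response["hits"]["hits"]:
        # Action line: keep only what identifies the document. The
        # response-level metadata (took, timed_out, _shards) and the
        # per-hit _score are dropped.
        meta = {"_index": hit["_index"], "_id": hit["_id"]}
        if "_type" in hit:
            # Assumption: older versions expect the mapping type here.
            meta["_type"] = hit["_type"]
        lines.append(json.dumps({"index": meta}))
        # Source line: the document itself, serialized onto a single line.
        lines.append(json.dumps(hit["_source"]))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

The resulting text could then be saved to a file and sent to the _bulk endpoint with curl using --data-binary, so the newlines are preserved. This only handles a single page of results, though, so it does not really solve the problem for large exports.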
Is there another way to export data from one Elasticsearch instance into a format that can easily be imported into another instance at a later time?
Any suggestions would be appreciated.