My use case is the following: I have thousands of PDF and supporting XML files. I have converted them into JSON, and each JSON file has the following structure:
Like the above, I have 5000 documents. I wish to put them into Elasticsearch and build a search engine that finds which documents are relevant to a given query.
As a first step, how can I index multiple documents? I could not find any solution on the internet.
I have seen examples of the bulk API on the web; however, in those examples the bulk API seems to take only one big JSON file to put into Elasticsearch, not multiple files.
Is that really how it works; does Elasticsearch accept only one big JSON?
You mean that you have 5000 JSON files on your hard disk and you want to send them to Elasticsearch?
I thought you were speaking of JSON documents, not JSON files.
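To answer your last question: no, Elasticsearch does not expect one big JSON document. The bulk API body is newline-delimited JSON (NDJSON): an action line followed by the document source on its own line, one pair per document. A minimal sketch of what that body looks like, where the index name `docs` and the fields are placeholders, not anything from your data:

```json
{ "index" : { "_index" : "docs", "_id" : "1" } }
{ "title" : "first document", "body" : "..." }
{ "index" : { "_index" : "docs", "_id" : "2" } }
{ "title" : "second document", "body" : "..." }
```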
Well, lots of options there:

- Build a shell script which creates the bulk request (see the sketch after this list)
- The Logstash file input might help (unsure, though)
- FSCrawler has a JSON option which you can use as well
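For the shell script option, here is a minimal sketch, assuming your files sit in a local `json/` directory, each file holds a single JSON object, and the target index is called `docs` (directory and index names are assumptions; adjust to your setup):

```bash
#!/bin/bash
# Build one bulk body from all *.json files in ./json and post it
# to a local Elasticsearch. "docs" is a placeholder index name.
BULK_FILE=bulk.ndjson
> "$BULK_FILE"

for f in json/*.json; do
  # Action line: tells Elasticsearch to index the next line as a document.
  echo '{ "index" : { "_index" : "docs" } }' >> "$BULK_FILE"
  # Collapse the file to a single line (valid JSON never contains
  # literal newlines inside strings, so this is safe).
  tr -d '\n' < "$f" >> "$BULK_FILE"
  echo >> "$BULK_FILE"
done

# The bulk endpoint needs the NDJSON content type and a trailing newline;
# --data-binary (not -d) preserves the newlines in the body.
curl -s -H "Content-Type: application/x-ndjson" \
     -XPOST "http://localhost:9200/_bulk" \
     --data-binary "@$BULK_FILE"
```

In practice you would not send all 5000 documents in a single request; batch them into chunks of a few hundred to a thousand documents per bulk call to keep request sizes reasonable.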