Push huge JSON file(s) into ES

I'm very new to Elasticsearch and am working with version 6.7. I have a scenario in which I have to push a JSON file to Elasticsearch. This JSON file will have a minimum of 20,000 lines (with arrays, objects, and nested objects). I do not know the structure of the JSON beforehand, so I can't create a mapping in advance; I'll be relying entirely on the dynamic mapping created by Elasticsearch. Moreover, there is a high chance that every JSON file pushed will be entirely different from the previous ones (though some fields might match).

I have two questions:

  1. How do I push such a big JSON file? The data in the JSON can go up to 1 lakh+ (100,000+) lines!
  2. I read about the way Elasticsearch treats nested objects, and I understand that I need to explicitly declare a mapping beforehand for such a field, setting the "type" to "nested". But since I do not know the JSON structure beforehand, how do I create a mapping for it? Mind you, there are multiple nested objects inside nested objects as well (something like 5 levels deep).
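One possible approach to the second question, offered as a sketch rather than a recommendation: a dynamic template can tell Elasticsearch to map every newly seen object field as "nested" without knowing field names up front. The snippet below only builds the index-creation body for 6.x (where the mapping sits under a type name, assumed here to be `_doc`); the function name and the template name `objects_as_nested` are made up for illustration.

```python
import json


def nested_objects_mapping(doc_type="_doc"):
    """Build an index-creation body whose dynamic template maps
    every dynamically detected object field as type "nested".

    doc_type is an assumption: 6.x mappings are keyed by a type name.
    """
    return {
        "mappings": {
            doc_type: {
                "dynamic_templates": [
                    {
                        # Hypothetical template name, any label works.
                        "objects_as_nested": {
                            "match_mapping_type": "object",
                            "mapping": {"type": "nested"},
                        }
                    }
                ]
            }
        }
    }


if __name__ == "__main__":
    # Print the body that would be sent when creating the index.
    print(json.dumps(nested_objects_mapping(), indent=2))
```

Note that this makes every object nested, including single objects that would not need it, and deeply nested, unpredictable documents can run into the index limits `index.mapping.nested_fields.limit` and `index.mapping.total_fields.limit`.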

I've read about the _bulk API, but as I understand it, it needs my entire JSON file to be on one single line, which I then POST using cURL. I'm not sure how I could accomplish that.
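For reference, the bulk API expects newline-delimited JSON: one action line followed by one source line per document, with a trailing newline, rather than the whole file on a single line. A minimal sketch of producing that body is below. It assumes the file holds a top-level array of records that can be indexed as separate documents; the helper names, the index name, and the batch size are hypothetical.

```python
import json


def to_bulk_body(docs, index_name, doc_type="_doc"):
    """Serialize documents into a _bulk request body (NDJSON).

    Each document becomes two lines: an "index" action line and the
    document itself. json.dumps emits no newlines by default, so
    pretty-printed input ends up on one line per document.
    """
    lines = []
    for doc in docs:
        action = {"index": {"_index": index_name, "_type": doc_type}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def chunked(items, size):
    """Yield successive batches so each request stays reasonably small."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Stand-in for json.load(open("data.json")), assumed to be a list.
    records = [{"id": 1, "tags": ["a", "b"]}, {"id": 2, "meta": {"k": "v"}}]
    for batch in chunked(records, 500):
        print(to_bulk_body(batch, "my-index"), end="")
```

Each batch would then be sent with a POST to the `_bulk` endpoint using the `Content-Type: application/x-ndjson` header, for example with cURL's `--data-binary` option so the newlines are preserved.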

I would be really obliged if someone could give me a clear explanation and help me out with this. Or is this not possible at all? If that's the case, please recommend an alternative.

@dadoonet Any suggestions you could think of?

Elasticsearch is not schemaless, so indexing large documents of unknown structure is likely to cause a lot of problems and mapping conflicts. I am not sure Elasticsearch is a good fit unless you can structure and control the data. Indexing very large documents can also cause a range of problems.
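To illustrate the kind of mapping conflict meant here: once dynamic mapping has fixed a field's type from the first document, a later document that uses the same field path with an incompatible type is usually rejected. The sketch below is a rough pre-check on two parsed JSON documents, not a reproduction of Elasticsearch's actual rules (which also involve coercion and date detection); both function names are hypothetical.

```python
def field_types(value, path="", found=None):
    """Collect, for each dotted field path, the set of value types seen.

    Arrays are transparent (as in Elasticsearch), so the elements of a
    list are recorded under the path of the list itself.
    """
    found = {} if found is None else found
    if isinstance(value, dict):
        if path:
            found.setdefault(path, set()).add("object")
        for key, child in value.items():
            field_types(child, f"{path}.{key}" if path else key, found)
    elif isinstance(value, list):
        for item in value:
            field_types(item, path, found)
    elif value is not None:
        # Nulls carry no type information, so they are skipped.
        found.setdefault(path, set()).add(type(value).__name__)
    return found


def conflicting_fields(doc_a, doc_b):
    """Return paths present in both documents with differing types."""
    a, b = field_types(doc_a), field_types(doc_b)
    return sorted(p for p in a.keys() & b.keys() if a[p] != b[p])


if __name__ == "__main__":
    first = {"user": {"id": 1, "tags": ["x"]}, "ts": "2019-05-01"}
    second = {"user": {"id": "abc", "tags": ["y"]}, "extra": True}
    print(conflicting_fields(first, second))
```

Running a check like this over sample files before indexing gives an early indication of how often unrelated files reuse a field name with a different shape.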
