Hi - Are the files created in HDFS as a snapshot of an Elasticsearch index encrypted by default? When I run the 'hadoop fs -text' command against one of the index snapshot files, I get only garbage printed on screen. It'd be great if someone could clarify.
The snapshot process backs up Lucene segments, which are binary files, so it is perfectly natural that you cannot read them as text files.
Thanks Christian. What exact format are they stored in? Also, is there any library to read and process them in a MapReduce program?
Not that I am aware of. The normal way to interact with data in Elasticsearch from Hadoop is through the ES-Hadoop connector, not via backed-up snapshots.
Snapshots are internal to ES and are not meant to be modified or read by outside tools. If you want to interact with the data, you should go through ES, and in the case of Hadoop, ES-Hadoop is the way to go.
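For anyone landing here later, here is a rough sketch of what reading a live index through ES-Hadoop might look like from PySpark, rather than touching the snapshot files. It assumes the elasticsearch-hadoop jar is on the Spark classpath. The host `es-host:9200`, the index `my-index`, and the helper names `es_hadoop_conf` and `read_index` are made up for illustration. The `es.nodes`, `es.resource` and `es.query` settings and the input format class names are the ones I know from the ES-Hadoop documentation, so check them against the version you run.

```python
from typing import Dict, Optional


def es_hadoop_conf(nodes: str, resource: str, query: Optional[str] = None) -> Dict[str, str]:
    """Build the Hadoop configuration dict that ES-Hadoop reads.

    nodes    -- Elasticsearch host(s), e.g. "es-host:9200" (hypothetical host)
    resource -- index to read from, e.g. "my-index" (hypothetical index)
    query    -- optional URI query or query DSL string; when omitted,
                the key is left out and the connector's default applies
    """
    conf = {
        "es.nodes": nodes,
        "es.resource": resource,
    }
    if query is not None:
        conf["es.query"] = query
    return conf


def read_index(sc, conf: Dict[str, str]):
    """Return an RDD of (key, document) pairs read from Elasticsearch.

    `sc` is an existing pyspark SparkContext. The call only works if the
    elasticsearch-hadoop jar is available to Spark (assumption).
    """
    return sc.newAPIHadoopRDD(
        inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
        keyClass="org.apache.hadoop.io.NullWritable",
        valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
        conf=conf,
    )
```

Usage would be something like `rdd = read_index(sc, es_hadoop_conf("es-host:9200", "my-index"))`, after which the documents can be processed like any other RDD. The data comes from the running cluster, not from the snapshot in HDFS.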