I am new to the ES project and I am trying to understand the code. Instead of having ES write to disk, I want it to write somewhere else, for example Amazon S3. I know that this is not ideal, but I have my own reasons for this.
Can someone point me to the code where I can start changing this? Any guidance will be helpful.
Elasticsearch is very I/O intensive, so using something like S3 as backing storage is not a good idea. It may help if you provide some additional details about exactly what you want to do and why.
I want to encrypt the data that Elasticsearch stores. Basically, I want to encrypt using a different key per index. If I use S3 as the filesystem, then I can use S3's encryption-at-rest support.
Coming back to my main question:
Can you point me to the code where ES writes to the filesystem?
S3 does not offer many of the filesystem features that Elasticsearch needs to operate correctly, such as locks or atomic renames. It's just not possible to do what you ask.
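To make those two primitives concrete, here is a minimal sketch using only the JDK's java.nio API. It is not Elasticsearch or Lucene code; the class name, method names and file names are hypothetical and chosen only for illustration. It shows the kind of operations a local filesystem provides and that a plain object store does not: replacing a file with an atomic rename, and taking an exclusive lock on a file.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class FsPrimitivesSketch {

    // Write the content to a temporary file in the same directory, then
    // rename it over the target in one atomic step. Readers see either the
    // old file or the new one, never a partially written file.
    // Note: whether ATOMIC_MOVE replaces an existing target is
    // implementation specific; on POSIX filesystems it does.
    public static void atomicWrite(Path dir, String name, String content) throws IOException {
        Path tmp = Files.createTempFile(dir, name, ".tmp");
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, dir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
    }

    // Try to take an exclusive lock on a lock file. Returns true if the lock
    // was acquired (it is released again before returning), false if another
    // process currently holds it.
    public static boolean tryExclusiveLock(Path lockFile) throws IOException {
        try (FileChannel channel = FileChannel.open(
                lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return false;
            }
            lock.release();
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fs-sketch");
        atomicWrite(dir, "state", "v1");
        atomicWrite(dir, "state", "v2"); // replaces the previous version atomically
        System.out.println(new String(
                Files.readAllBytes(dir.resolve("state")), StandardCharsets.UTF_8));
        System.out.println(tryExclusiveLock(dir.resolve("example.lock")));
    }
}
```

Both operations rely on guarantees made by the local filesystem, which is why there is no simple way to map them onto an object store's API.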
Encryption-at-rest is perfectly possible today, with a proper filesystem on a proper hard disk, using something like dm-crypt.
If you are interested in the enormity of the task you suggest, the three major places that Elasticsearch writes to disk are:
org.elasticsearch.gateway.MetaDataStateFormat and its subclasses
org.apache.lucene.store.Directory and its subclasses