Is anyone using on-premises servers/storage for the Cold and Frozen data tiers? My understanding is that these are the repository options:
• AWS S3
• Google Cloud Storage
• Azure Blob Storage
• Hadoop Distributed File System (HDFS)
• Shared filesystems such as NFS
• Read-only HTTP and HTTPS repositories
Is anyone already using a shared filesystem like NFS to host a Frozen / Cold tier? If yes, how is it going? Did you have any challenges with the implementation? Any performance issues?
Based on your question, I'm assuming what you are really referring to here is searchable snapshots, so I'll be responding as such. If you are in fact referring to something else, feel free to clarify.
The nice thing about cold/frozen with searchable snapshots is that they are built on the basic foundation of snapshots. If you are already using a shared filesystem for your backups, there shouldn't be much difference when using it for cold/frozen.
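To make that concrete, here is a rough sketch of the two requests involved: registering a shared filesystem (`fs`) repository and pointing an ILM policy's `searchable_snapshot` action at it. It only builds the request bodies so you can inspect them before sending. The repository name, policy name, mount path, and phase timings are all made-up placeholders, and the `location` is assumed to sit under a directory listed in `path.repo` on every node.

```python
import json


def fs_repository_request(repo_name, location):
    """Build the request that registers a shared filesystem repository.

    `location` is a hypothetical NFS mount point; it is assumed to be
    covered by `path.repo` in elasticsearch.yml on all nodes.
    """
    body = {"type": "fs", "settings": {"location": location}}
    return ("PUT", f"/_snapshot/{repo_name}", body)


def ilm_policy_request(policy_name, repo_name, cold_after="30d", frozen_after="90d"):
    """Build an ILM policy whose cold and frozen phases mount searchable
    snapshots from the same repository. All timings are placeholders."""
    mount_action = {"searchable_snapshot": {"snapshot_repository": repo_name}}
    body = {
        "policy": {
            "phases": {
                "hot": {"actions": {"rollover": {"max_age": "7d"}}},
                "cold": {"min_age": cold_after, "actions": dict(mount_action)},
                "frozen": {"min_age": frozen_after, "actions": dict(mount_action)},
            }
        }
    }
    return ("PUT", f"/_ilm/policy/{policy_name}", body)


if __name__ == "__main__":
    for method, path, body in (
        fs_repository_request("nfs_repo", "/mnt/es_snapshots"),
        ilm_policy_request("logs_policy", "nfs_repo"),
    ):
        print(method, path)
        print(json.dumps(body, indent=2))
```

You could paste the printed output into Kibana Dev Tools or send it with whatever HTTP client you already use.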
With the cold tier, I wouldn't expect much of a change. The only real difference is that instead of having a local replica, the snapshot in the shared filesystem acts as the replica, and it is only read if the node or shard goes down for whatever reason.
With the frozen tier, I'd say the load on the shared filesystem mainly depends on how large a local node cache you have. The size of the node cache determines how frequently your cluster has to read from the snapshots on the remote filesystem. The other thing to consider is how often old data is searched: the more old data that is searched, the more that needs to be read into the local cache.
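If you want to test the difference by hand before wiring up ILM, the mount API takes a `storage` query parameter: `full_copy` keeps a complete local copy (cold-style), while `shared_cache` only caches the parts that searches touch (frozen-style). Below is a small sketch that builds both variants; the repository, snapshot, and index names are hypothetical. As far as I know, the frozen cache size itself is a node setting, `xpack.searchable_snapshots.shared_cache.size` in elasticsearch.yml, so check the docs for your version before relying on that.

```python
def mount_request(repo_name, snapshot_name, index_name, frozen):
    """Build a searchable snapshot mount request.

    frozen=True  -> storage=shared_cache (partially cached, frozen-style)
    frozen=False -> storage=full_copy    (fully cached locally, cold-style)
    """
    storage = "shared_cache" if frozen else "full_copy"
    path = (
        f"/_snapshot/{repo_name}/{snapshot_name}/_mount"
        f"?storage={storage}&wait_for_completion=true"
    )
    # `index` names the index inside the snapshot that should be mounted.
    return ("POST", path, {"index": index_name})


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(mount_request("nfs_repo", "snap-2024.01.01", "logs-2024.01.01", frozen=False))
    print(mount_request("nfs_repo", "snap-2024.01.01", "logs-2024.01.01", frozen=True))
```

Mounting the same index both ways and running a few typical searches against each should give you a feel for how much the NFS share actually gets hit in the frozen case.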
One caveat with searchable snapshots that I'm not a big fan of: you can only use one snapshot repository for searchable snapshots. So if you normally back up to your shared filesystem and, let's say, also have an additional backup to something like S3, that setup would no longer work. Instead, you'd need to have your snapshots go to the shared filesystem and then have a separate job that backs up the shared filesystem to S3. That requires additional forethought, as you don't want the shared filesystem to be backed up midway through an Elasticsearch snapshot.