Hi,
We have an 8-node on-premises cluster (3 master + 5 data nodes) holding 3 years of data; one of the data nodes acts as the S3-backed cold storage node.
Our S3 storage is running into space issues, and we would like to optimize by moving logs older than 2 years to the Glacier tier within the same S3 bucket.
1) Is there any impact on application performance if the same S3 bucket holds both the cold tier and the Glacier tier?
2) We have committed to clients that data will be searchable and stored for 7 years. Will they still be able to search and retrieve data once it is moved to the Glacier tier?
3) Are there any other optimization ideas for this?
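For context, the "Glacier layer in the same bucket" we have in mind would be an S3 Lifecycle rule along the following lines. This is only a rough boto3 sketch of the idea, not something we have applied; the bucket name and prefix are placeholders.

```python
# Rough sketch only: an S3 Lifecycle rule that transitions objects under a
# prefix to the GLACIER storage class once they are roughly 2 years old.
# "my-cold-logs-bucket" and "logs/" are placeholders for illustration.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-cold-logs-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "logs-older-than-2y-to-glacier",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 730, "StorageClass": "GLACIER"},  # ~2 years
                ],
            }
        ]
    },
)
```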
Yes, snapshot and restore doesn't specifically support Glacier. You could manage this manually, moving the data to Glacier and then back when needed, but that is a DIY solution.
We are not planning to move snapshots. We have older logs that need to be moved from the S3 frozen node to the S3 Glacier tier.
But we need those logs to remain searchable as well.
To my knowledge, AWS S3 doesn't have a space limitation. Are you using an S3 provider other than AWS? Or are you saying that the cold node which stores the cold data is running out of space?
We are looking to move our old indices (documents older than 2 years) from S3 to S3 Glacier. Based on the ILM policies applied, we move data from the hot nodes to the cold node, and our cold node uses S3 standard.
As part of cost optimization, the AWS team has asked whether we can move the very old data from S3 standard to the S3 Glacier tier within the same S3 bucket, since the size runs into terabytes.
We would like to know:
1) If we move our old indices from S3 to S3 Glacier, will the data still be searchable in Kibana for users?
2) Will it impact performance or the application?
3) Are there any other cost optimization ideas for cold storage that don't interrupt user access to the data via Kibana?
We are using the licensed version.
As mentioned, we are using S3 for our frozen node, with users able to search the data in Kibana.
Once we move to S3 Glacier, will it impact search?
We are not looking for searchable snapshots.
Hi @Shalinicts, as has been mentioned above, you can move the indices to Glacier, but they won't be searchable, if that's the answer you're looking for. Other implications and constraints, as stated above, still apply.
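One way to see why: once an object has been transitioned to a Glacier class, it can no longer be read directly. You first have to issue a restore request and wait for it to complete before the bytes are readable again, so a live search cannot hit that data. A hedged boto3 sketch, with placeholder bucket and key names:

```python
# Illustrative only: reading a Glacier-class object requires an explicit
# restore request first; until the restore finishes, GET requests fail.
# Bucket and key names are placeholders.
import boto3

s3 = boto3.client("s3")

s3.restore_object(
    Bucket="my-cold-logs-bucket",
    Key="logs/2021/archive-000123.dat",
    RestoreRequest={
        "Days": 7,                                     # keep the restored copy for 7 days
        "GlacierJobParameters": {"Tier": "Standard"},  # standard retrievals typically take hours
    },
)
```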
Yes Ayush, that was the answer I was looking for: whether the indices would still be searchable from Kibana. In short, users shouldn't feel any difference whether the data is on the frozen node (S3) or in S3 Glacier.
What are the other implications?
Sets the S3 storage class for objects stored in the snapshot repository. Values may be standard, reduced_redundancy, standard_ia, onezone_ia and intelligent_tiering. Defaults to standard. Changing this setting on an existing repository only affects the storage class for newly created objects, resulting in a mixed usage of storage classes. You may use an S3 Lifecycle Policy to adjust the storage class of existing objects in your repository, but you must not transition objects to Glacier classes and you must not expire objects. If you use Glacier storage classes or object expiry then you may permanently lose access to your repository contents. For more information about S3 storage classes, see the AWS Storage Classes Guide.
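So the supported cost lever on the repository side is the storage_class setting itself, using any of the allowed non-Glacier classes. As a rough sketch, assuming a self-managed cluster on localhost and placeholder repository, bucket and credential values, a repository could be registered with a cheaper class like this:

```python
# Minimal sketch: registering an S3 snapshot repository with a non-default
# storage class. Endpoint, credentials, repository and bucket names are
# placeholders; Glacier classes are not valid here, per the docs quoted above.
import requests

resp = requests.put(
    "http://localhost:9200/_snapshot/my_s3_repo",
    json={
        "type": "s3",
        "settings": {
            "bucket": "my-snapshot-bucket",
            "base_path": "snapshots",
            "storage_class": "intelligent_tiering",  # or standard_ia, onezone_ia, ...
        },
    },
    auth=("elastic", "changeme"),  # placeholder credentials
)
resp.raise_for_status()
print(resp.json())
```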
We have 3 master nodes configured with the master role.
5 data nodes:
Data nodes 1, 2, 3 and 4 are set with roles: data_hot, data_content
Data node 5 is set with roles: data_frozen
We are not moving the data to the frozen node as searchable snapshots.
The ILM policy is different for each type of logs; the maximum retention is 7 years on the frozen node.
Data node 5 currently holds 2.5 years of data, and we are considering whether we can have two tiers for it and move data older than 2 years to Glacier, provided that data remains searchable.
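For context, a stripped-down version of the kind of ILM policy in use looks roughly like the sketch below. The policy name, rollover thresholds and phase ages are placeholders, and the real cluster uses a different policy per log type.

```python
# Rough sketch of the shape of the retention policy described above; names,
# rollover thresholds and phase ages are placeholders only.
import requests

policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {"max_primary_shard_size": "50gb", "max_age": "30d"}
                }
            },
            "cold": {
                "min_age": "90d",  # placeholder: when the index leaves the hot nodes
                "actions": {"set_priority": {"priority": 0}},
            },
            "delete": {
                "min_age": "2555d",  # ~7 years, matching the stated retention
                "actions": {"delete": {}},
            },
        }
    }
}

resp = requests.put(
    "http://localhost:9200/_ilm/policy/logs-7y-retention",  # placeholder name
    json=policy,
    auth=("elastic", "changeme"),  # placeholder credentials
)
resp.raise_for_status()
```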
Is data node 5 currently using S3? If you are not using searchable snapshots here, how have you set it up to use S3? Have you mounted an S3 bucket as a volume and pointed path.data to it?