How to search offline data stored in external storage

By default we read live data and keep it in Elasticsearch for 6 months. In some cases we want to search a 6-month window from the last 5 years of (offline) data, which can be kept in external storage. How can we search data from external storage, and what happens to the live data stored in Elasticsearch?

Any data you want to search must be indexed into Elasticsearch. You can store snapshots on external shared storage, but those indices must be restored before they can be queried.

Thanks for your response.
Let's say I am using a data node of 18TB for Elasticsearch, which can hold 6 months of data. As of now, the online data I have covers the latest 6 months, and it is indexed and stored in Elasticsearch.

But if a user wants to search data from external storage, do we need to index it and store it in Elasticsearch? That would mean the old indexed online data has to be moved or deleted.

If the live data is required again, can we get it back from the latest logs using Beats?

You can store old indices on shared external storage using the snapshot and restore API. As long as you have sufficient memory and disk space in the cluster you can then temporarily restore these indices at a later time and query them. This is typically a lot faster than reindexing data.
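
As a rough illustration of the restore step, the sketch below builds the REST request for restoring selected indices from a snapshot. The repository name `archive_repo`, the snapshot name and the index pattern are hypothetical placeholders. The rename options are an assumption, added so that restored indices would not collide with live ones. The code only constructs the request; sending it to a cluster is left out.

```python
import json


def build_restore_request(repository, snapshot, indices, prefix="restored_"):
    """Return (method, path, body) for a snapshot restore call.

    All names passed in are placeholders; adapt them to your own cluster.
    """
    path = f"/_snapshot/{repository}/{snapshot}/_restore"
    body = {
        # Restore only the indices needed for the historical search.
        "indices": indices,
        # Leave cluster-wide settings and templates untouched.
        "include_global_state": False,
        # Prefix restored indices so they do not clash with live ones.
        "rename_pattern": "(.+)",
        "rename_replacement": prefix + "$1",
    }
    return "POST", path, body


method, path, body = build_restore_request(
    "archive_repo", "snapshot_2019_h1", "logs-2019.01.*"
)
print(method, path)
print(json.dumps(body, indent=2))
```

Once the historical search is finished, the restored indices can be deleted again to free the disk space for live data.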

Thanks for your response.

As of now we only have 6 months' worth of storage in the Elasticsearch cluster. I can either keep the online data or retrieve data from external storage.

What would you suggest to improve performance when switching between searching online data and offline data?

If my cluster was in the cloud I would consider temporarily increasing the size of the cluster by adding nodes and then restoring the required indices from snapshots. If my cluster was on prem I would look at spinning up a temporary cluster in the cloud where I could load the snapshots, and then use cross-cluster search if I cannot directly query the new temporary cluster in isolation.
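
For the cross-cluster search option, a minimal sketch of what the requests might look like is given below. The cluster alias `archive`, the seed address and the index patterns are all hypothetical. The code only builds the settings body and the search path; it does not talk to a cluster.

```python
def remote_cluster_settings(alias, seeds):
    """Body for PUT /_cluster/settings that registers a remote cluster."""
    return {"persistent": {"cluster": {"remote": {alias: {"seeds": list(seeds)}}}}}


def cross_cluster_search_path(local_indices, alias, remote_indices):
    """Search path covering local indices plus indices on the remote cluster.

    Remote indices are addressed as "<alias>:<index>".
    """
    targets = list(local_indices) + [f"{alias}:{name}" for name in remote_indices]
    return "/" + ",".join(targets) + "/_search"


# Hypothetical temporary cluster holding the restored snapshots.
settings = remote_cluster_settings("archive", ["archive-node-1:9300"])
path = cross_cluster_search_path(["logs-2024.*"], "archive", ["logs-2019.*"])
print(settings)
print(path)
```

This way a single query issued against the live cluster can also cover the restored historical indices on the temporary cluster.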

But we are not working in the cloud; we are working on VMs.

How do we decide the number of Logstash nodes for a given number of Elasticsearch nodes? Is there a ratio or calculation for this?

For example, I have 1TB of logs per day and I want an online retention of 30 days at any point in time.

In this case, a total of 30TB is required from Elasticsearch, and we can have 4 Elasticsearch nodes, each with 10TB of disk and 64GB RAM, without replicas.

How many Logstash nodes should be planned?
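
The storage arithmetic in this example can be sketched as follows. It assumes that the indexed size on disk equals the raw log size, which is often not the case in practice, and it says nothing about how many Logstash nodes are needed. The function names are illustrative only.

```python
import math


def required_storage_tb(daily_tb, retention_days, replicas=0):
    """Total disk needed: one primary copy plus `replicas` extra copies.

    Assumes indexed size equals raw size (a simplification).
    """
    return daily_tb * retention_days * (1 + replicas)


def min_data_nodes(total_tb, node_disk_tb):
    """Smallest number of data nodes whose combined disk holds total_tb."""
    return math.ceil(total_tb / node_disk_tb)


total = required_storage_tb(daily_tb=1, retention_days=30)  # no replicas
nodes = min_data_nodes(total, node_disk_tb=10)
headroom = 4 * 10 - total  # spare disk with the 4 nodes proposed in the post
print(total, nodes, headroom)
```

With the figures from the post this gives 30TB of required storage and a minimum of 3 nodes at 10TB each, so 4 nodes would leave 10TB of headroom. Adding one replica would double the requirement to 60TB.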
