Data Stream data retrieval and compression

prithvirajsivarajan · June 17, 2021, 1:44pm

I'm working on a data stream and have created a test ILM policy. I have a couple of questions about it.

Is there a large difference in retrieval time between indices in the hot, warm and cold phases? I tried inserting a few documents and used the shrink API to reduce the number of shards to 1 in the warm phase, but I'm not seeing much difference in retrieval time between the hot, warm and cold phases. Is that expected, or will it differ when there is a large amount of data? How much of a time difference can we expect for data retrieval between the phases?
I'm trying to find out whether there is a compression technology for data streams, i.e. the data in the cold phase is not needed for searching anymore. Can we zip that data to free up disk space on the cluster and store more data in the cold phase? Does the size reduce when we move an index to the cold phase itself? Or is the shrink API actually used for reducing the size of the index? (I'm not sure whether reducing the number of shards reduces the size of the index.)

warkolm · June 18, 2021, 2:45am

Retrieving how exactly? Are the phases on the same hosts, or do you have different hardware profiles for each phase?

Elasticsearch compresses by default. You cannot zip the underlying data without losing access to it.

The _shrink API is used to reduce the number of shards; it should also help to minimise the size of the index.
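
To check the effect on size rather than guess, one option is to read the primary store size before and after a shrink. The following is a minimal sketch, not a definitive implementation: the helper names `shrink_request` and `primary_store_bytes` and the index names are hypothetical, and it only builds the request and parses the response, so sending it to a cluster is left to whatever HTTP client you use. It assumes the documented `POST /<index>/_shrink/<target>` endpoint and the `_all.primaries.store.size_in_bytes` field of a `GET /<index>/_stats/store` response.

```python
import json


def shrink_request(source_index, target_index, shards=1):
    """Build the method, path and JSON body for a _shrink call.

    Hypothetical helper: it does not send anything to a cluster.
    """
    if shards < 1:
        raise ValueError("shards must be at least 1")
    path = f"/{source_index}/_shrink/{target_index}"
    body = {"settings": {"index.number_of_shards": shards}}
    return "POST", path, json.dumps(body)


def primary_store_bytes(stats_response):
    """Read the primary store size from a GET /<index>/_stats/store response."""
    return stats_response["_all"]["primaries"]["store"]["size_in_bytes"]


# Index names below are placeholders, not taken from the thread.
method, path, body = shrink_request("my-index", "my-index-shrunk")
print(method, path, body)
```

Calling `primary_store_bytes` on the stats of the source and the shrunken index gives two numbers that can be compared directly.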

prithvirajsivarajan · June 18, 2021, 7:40am

They are on the same host. I just want to know whether data retrieval in the warm and cold phases differs in the time it takes. I tried running a _search query in Postman against an index in the hot and warm phases and couldn't see much difference in getting the data. But for large amounts of data, how large will the time lag be between the phases?
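
A comparison like this can be made a little more systematic than eyeballing Postman by collecting the `took` value (milliseconds, reported in every _search response) over several runs per index. The sketch below assumes the responses have already been fetched and parsed into dictionaries; the helper names `median_took_ms` and `compare_phases` and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
from statistics import median


def median_took_ms(responses):
    """Median of the 'took' field (milliseconds) across parsed _search responses."""
    tooks = [r["took"] for r in responses]
    if not tooks:
        raise ValueError("need at least one search response")
    return median(tooks)


def compare_phases(hot_responses, warm_responses):
    """Summarise the median 'took' for two sets of responses and their difference."""
    hot = median_took_ms(hot_responses)
    warm = median_took_ms(warm_responses)
    return {"hot_ms": hot, "warm_ms": warm, "difference_ms": warm - hot}


# Illustrative values only, not measurements from a real cluster.
hot_runs = [{"took": 3}, {"took": 5}, {"took": 4}]
warm_runs = [{"took": 6}, {"took": 8}, {"took": 7}]
print(compare_phases(hot_runs, warm_runs))
```

Using the median rather than a single request reduces the influence of caching and one-off slow runs.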

prithvirajsivarajan · June 18, 2021, 7:41am

So in the cold phase, we can't reduce its size much further using any compression technology?

warkolm · June 20, 2021, 11:32pm

If they are on the same host, no.

No.

system · July 18, 2021, 11:33pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
