I'm brand new to the concept of data streams, and I'm trying to understand the concepts and limits behind them, so sorry in advance if my question is a dumb one.
Let's say:
I configure a data stream "foo" with a rollover of "max age = 1 hour", so every hour a new backing index is created, starting with foo-000001 today at 1:00 AM (see the configuration sketch at the end of this post).
I have two applications, APP1 and APP2, on two different servers, continuously logging data into Elasticsearch via their respective Filebeats, which collect the apps' log files; each application writes one log entry every second.
At 7:00 AM, the data stream's write index is foo-000007, and both applications continue to send logs every second.
At 7:58 AM, because of a temporary failure in part of my network, APP1's Filebeat is no longer able to reach Elasticsearch (while APP2 continues to write logs to the data stream).
At 8:00 AM, rollover happens and the data stream's new write index becomes foo-000008; APP2 continues to write logs to it while APP1 doesn't.
At 8:05 AM, the network issue ends. APP1's Filebeat starts sending logs to Elasticsearch again, beginning with the logs from 7:58 AM.
=> Given that:
the data stream's write backing index has meanwhile moved to foo-000008
APP2 has already filled the data stream with logs from 8:00 AM to 8:05 AM
data streams are "append-only time series data"
=> will Elasticsearch refuse to store the logs sent by APP1's Filebeat with an @timestamp between 7:58 AM and 8:05 AM? And will I thus lose all of APP1's logs between 7:58 AM and 8:05 AM?
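For reference, here is roughly how I configured the hourly rollover (a minimal sketch; "foo-policy" and "foo-template" are just names I made up):

```
# ILM policy that rolls the write index over every hour
PUT _ilm/policy/foo-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1h"
          }
        }
      }
    }
  }
}

# Index template that creates the "foo" data stream and attaches the policy
PUT _index_template/foo-template
{
  "index_patterns": ["foo"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.lifecycle.name": "foo-policy"
    }
  }
}
```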
No, Elasticsearch will write the documents into whatever the current backing (write) index is. The timestamp of a document is not gated or checked upon writing. In general, documents do end up in a backing index that roughly corresponds to their time, but when there are lags, disruptions, or other delays in documents being written, they will simply be written into the current write index. There is no guarantee (nor any actual requirement) that a document's timestamp matches the backing index's rollover timing. There are also some "smarts" to help Elasticsearch optimize searching: it keeps track of the min/max timestamps in the backing indices.
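You can check this yourself: index a document with an "old" @timestamp into the data stream and look at which backing index it landed in (a minimal sketch; the timestamp and message values are made up, and it assumes your data stream foo already exists):

```
# Indexing into a data stream (defaults to op_type create)
POST foo/_doc
{
  "@timestamp": "2024-05-01T07:58:00Z",
  "message": "late log line from APP1"
}

# _index in the response hits shows the backing index the document was
# actually written to: the current write index, regardless of @timestamp
GET foo/_search?filter_path=hits.hits._index,hits.hits._source
```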
When you search the data, you will most likely be searching the data stream via a Data View with a time filter; again, there is logic to optimize the search, both what is searched and what is returned.
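For example, a time-filtered search against the data stream would look something like this (a sketch with made-up times); Elasticsearch can skip backing indices whose recorded min/max timestamps fall entirely outside the requested range:

```
GET foo/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2024-05-01T07:58:00Z",
        "lte": "2024-05-01T08:05:00Z"
      }
    }
  }
}
```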
With respect to "append-only", the docs say:
Append-only
Data streams are designed for use cases where existing data is rarely, if ever, updated. You cannot send update or deletion requests for existing documents directly to a data stream. Instead, use the update by query and delete by query APIs.
If needed, you can update or delete documents by submitting requests directly to the document’s backing index.
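So if you ever did need to remove documents, you would go through delete by query rather than a direct delete, roughly like this (a sketch only; the field and value are hypothetical):

```
POST foo/_delete_by_query
{
  "query": {
    "match": {
      "service.name": "APP1"
    }
  }
}
```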
OK. The more I think about it, the more it makes sense; otherwise, the same problem would arise for any document arriving a few milliseconds before or after any rollover.