Since upgrading to 8.x and enabling Monitoring > Logs and metrics > Ship to a deployment (via cloud.elastic.co), I have been surprised at the volume of data arriving in the .monitoring-es-8-mb data stream.
On investigation, it seems a lot of duplicate data is being created about index recoveries that happened in the past (details below).
I started by breaking down the documents arriving in that data stream by event.dataset: ~60% are elasticsearch.index.recovery, ~35% are elasticsearch.index, and the rest are a few other small bits.
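For anyone wanting to reproduce that breakdown, here is a rough Python sketch. It only builds a standard terms-aggregation search body and turns the returned buckets into percentages; it does not talk to a cluster. The function names, the one-hour time range and the sample bucket counts are my own assumptions for illustration, not real output.

```python
from typing import Any


def dataset_breakdown_request(time_range: str = "now-1h") -> dict[str, Any]:
    """Build a search body that counts documents per event.dataset."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"range": {"@timestamp": {"gte": time_range}}},
        "aggs": {
            "by_dataset": {"terms": {"field": "event.dataset", "size": 20}}
        },
    }


def bucket_shares(buckets: list[dict[str, Any]]) -> dict[str, float]:
    """Convert terms-aggregation buckets into percentage shares.

    Shares are relative to the returned buckets only; documents counted
    under sum_other_doc_count are not included.
    """
    total = sum(b["doc_count"] for b in buckets)
    if total == 0:
        return {}
    return {b["key"]: 100.0 * b["doc_count"] / total for b in buckets}


# Made-up bucket counts, chosen only to mirror the rough split described above.
sample_buckets = [
    {"key": "elasticsearch.index.recovery", "doc_count": 600},
    {"key": "elasticsearch.index", "doc_count": 350},
    {"key": "elasticsearch.node.stats", "doc_count": 50},
]
shares = bucket_shares(sample_buckets)
```

The body from dataset_breakdown_request() could be sent to the _search endpoint of .monitoring-es-8-mb with whatever client you prefer, and the buckets under aggregations.by_dataset.buckets passed to bucket_shares().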
Digging into the event.dataset=elasticsearch.index.recovery ones, >98% of them have elasticsearch.index.recovery.type=PEER and basically all of them have elasticsearch.index.recovery.stage=DONE, which started to seem odd: why so many docs about things which have happened but none about things in progress?
Filtering for one specific value of elasticsearch.index.recovery.name (in my case, for example, .ds-my.index.name-YYYY.MM.DD-00000N), I found that 2 records were being created for it every 10 seconds, with apparently no new data in them.
In my case there were only 2 unique values each for elasticsearch.index.recovery.start_time.ms and elasticsearch.index.recovery.stop_time.ms, so it looks like the same pair of events is being duplicated every 10 seconds.
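The duplicate check described above can be sketched in Python roughly as follows. It assumes the recovery documents have already been fetched and that each one is a nested _source dict; the helper names, the index name my-index-000001 and the timestamps are all hypothetical, made up to mimic two finished recoveries being re-reported over six collection cycles.

```python
from collections import defaultdict
from typing import Any, Iterable

PREFIX = "elasticsearch.index.recovery"


def get_path(doc: dict[str, Any], dotted: str) -> Any:
    """Walk a nested _source dict using a dotted field name."""
    node: Any = doc
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def summarize_recoveries(docs: Iterable[dict[str, Any]]) -> dict[str, dict[str, int]]:
    """Per recovery name, count total docs and distinct (start, stop) pairs."""
    total: dict[str, int] = defaultdict(int)
    distinct: dict[str, set] = defaultdict(set)
    for doc in docs:
        name = get_path(doc, f"{PREFIX}.name")
        event = (
            get_path(doc, f"{PREFIX}.start_time.ms"),
            get_path(doc, f"{PREFIX}.stop_time.ms"),
        )
        total[name] += 1
        distinct[name].add(event)
    return {
        name: {"docs": total[name], "distinct_events": len(distinct[name])}
        for name in total
    }


def make_doc(name: str, start_ms: int, stop_ms: int) -> dict[str, Any]:
    """Build a minimal fake monitoring document (illustration only)."""
    recovery = {
        "name": name,
        "start_time": {"ms": start_ms},
        "stop_time": {"ms": stop_ms},
    }
    return {"elasticsearch": {"index": {"recovery": recovery}}}


# Hypothetical data: the same two finished recoveries reported in 6 cycles.
events = [(1_000, 2_000), (5_000, 6_000)]
docs = [make_doc("my-index-000001", s, e) for _ in range(6) for s, e in events]
summary = summarize_recoveries(docs)
```

A large gap between docs and distinct_events for a name is the pattern I am describing: many stored documents, very few actual recovery events.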
Why this seeming waste of effort and space (which of course we're paying for)?
Is this expected / a known issue / needs a ticket? I couldn't find anything in my search so far.
Thanks in advance!
P.S. Looking separately at the event.dataset=elasticsearch.index ones (as opposed to the ....recovery ones), there seems to be 1 record per index every 10 seconds. Although this too seems excessive for ancient indices in the cold tier, at least I can see some use in recording potentially changing info (e.g. elasticsearch.index.total.search.query_time_in_millis might change over time even for an older index), whereas the recovery ones mentioned above seem to be 100% identical records of a past event, so I'm not sure those need re-recording every 10 seconds.
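To put rough numbers on the rates mentioned in this post, here is a small back-of-the-envelope calculation in Python. It assumes the 10-second interval I observed stays constant all day and says nothing about document size; the function name is just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def docs_per_day(docs_per_cycle: int, interval_seconds: int = 10) -> int:
    """Documents written per day for one index at a fixed collection interval."""
    cycles_per_day = SECONDS_PER_DAY // interval_seconds
    return docs_per_cycle * cycles_per_day


# 2 recovery records per cycle for one index, as observed above.
recovery_docs = docs_per_day(2)
# 1 elasticsearch.index record per cycle for one index.
index_docs = docs_per_day(1)
```

Under those assumptions that is 17280 recovery documents and 8640 index documents per index per day, before multiplying by the number of indices.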