Reindex data streams "in place"


I have many data streams totaling... a lot... of space and I need to change the mapping of a few existing fields.

What would be a decent strategy for re-indexing several dozen data streams consisting of several hundred backing indices to include these updated mappings?
I know I could just update the index templates and let Elasticsearch apply the new mappings in any new backing indices created by ILM, but the resulting mapping conflicts in Kibana would render that pretty useless.
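For reference, that first part is just a matter of re-PUTting the template so future backing indices pick up the change. A minimal sketch, where the template name and field are placeholders for your own:

```
PUT _index_template/logs-dataset-namespace
{
  "index_patterns": ["logs-dataset-namespace*"],
  "data_stream": {},
  "template": {
    "mappings": {
      "properties": {
        "some_field": { "type": "keyword" }
      }
    }
  }
}
```

Existing backing indices keep their old mappings, which is exactly the conflict problem.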

Currently my best working theory is:

  • Update/create index template(s) with new mappings
  • Update Logstash to append something to data_stream_dataset or data_stream_namespace to create a new set of data streams.
  • Change/create Kibana index patterns, dashboards, etc., to suit the new names.
  • Reindex from old data streams to new data streams.
  • Drop original data streams.
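The reindex step in that plan would be one _reindex call per stream. Note that writing into a data stream requires op_type create, since data streams are append-only. A sketch with assumed names:

```
POST _reindex
{
  "source": { "index": "logs-dataset-namespace" },
  "dest": {
    "index": "logs-dataset-namespace-new",
    "op_type": "create"
  }
}
```

Pointing source at the data stream name reindexes across all of its backing indices in one request.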

The problem with the above is having to change/fix all the dashboards and whatnot. That would suuuuck.

What I think I'd like to do is reindex one-by-one the backing indices right back into the existing data stream, "duplicating" that data, then drop the old backing indices to remove one of the duplicates. Having to use an intermediary new data stream would be fine.

  1. Reindex .ds-logs-dataset-namespace-2022.05.10-000001 -> logs-dataset-namespace-mapping
  2. DELETE .ds-logs-dataset-namespace-2022.05.10-000001
  3. Reindex logs-dataset-namespace-mapping -> logs-dataset-namespace
  4. DELETE logs-dataset-namespace-mapping
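Those four steps could look something like the following in Dev Tools (names taken from the example above; this assumes the old backing index is not the stream's current write index, since the write index can't be deleted out from under a data stream):

```
# 1. Copy the old backing index into an intermediate data stream
POST _reindex
{
  "source": { "index": ".ds-logs-dataset-namespace-2022.05.10-000001" },
  "dest": { "index": "logs-dataset-namespace-mapping", "op_type": "create" }
}

# 2. Drop the old backing index (removes its copy of the data)
DELETE .ds-logs-dataset-namespace-2022.05.10-000001

# 3. Copy back into the original data stream
POST _reindex
{
  "source": { "index": "logs-dataset-namespace-mapping" },
  "dest": { "index": "logs-dataset-namespace", "op_type": "create" }
}

# 4. Drop the intermediate data stream
DELETE _data_stream/logs-dataset-namespace-mapping
```

One caveat: the reindexed documents land in the stream's current write index rather than a backing index matching the original date, so ILM age-based retention would be reset for that data.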

Data streams sit behind aliases, so you can always reindex into new indices and then attach those to the read alias. Or just create an alias that will match your Kibana patterns.
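For example, the new stream could be added to an alias that the existing Kibana pattern matches (the alias and stream names here are illustrative; the alias name must not collide with an existing index or data stream):

```
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "logs-dataset-namespace-new",
        "alias": "logs-dataset-all"
      }
    }
  ]
}
```

That way dashboards keep working without pointing them at the renamed streams directly.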
