Reindexing Data

I have about 300 indices of data, each consuming anywhere from a few hundred MB to a few tens of GB. I'd like to reindex all of the data into rollup indices to reduce the number of indices mounted, which from my understanding would improve performance. In addition, I'd like to modify the index template to remove some unnecessary fields and to map certain fields as IP or boolean types instead of the strings they currently are.

I currently ingest the data from log files using Logstash and utilize ILM policies to control rollover and merging. One day after the files are ingested, they are moved to another folder and compressed into a .zip file for archival purposes.

What would be the best way, in terms of both speed and efficiency, to go about reindexing the data?

  1. Delete the currently indexed data, decompress my archived log files, and have Logstash reingest them.
  2. Create another Logstash pipeline that pulls the indexed data, reprocesses it, and reindexes it into a different location.

Anyone?

I would recommend downloading Curator and using the reindex API.
Take a look and see if it solves your problem.

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/index.html
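
As a rough sketch of what the reindex route could look like (whether called through Curator or directly), the Python below only builds the two JSON request bodies and prints them; it does not talk to a cluster. Every name in it is an assumption made up for illustration: the fields `client_ip`, `is_internal` and `debug_blob`, the index names `logs-2020.*` and `logs-consolidated-000001`, and the template name. It also assumes a version of Elasticsearch that supports composable index templates (`_index_template`); on older versions the legacy `_template` endpoint would be the equivalent. The template has to exist before the destination index is created, otherwise the new mappings will not be applied.

```python
import json


def build_index_template(index_pattern):
    """Body for PUT _index_template/<name>.

    The field names are hypothetical; replace them with the fields
    that should be mapped as ip / boolean instead of strings.
    """
    return {
        "index_patterns": [index_pattern],
        "template": {
            "mappings": {
                "properties": {
                    "client_ip": {"type": "ip"},
                    "is_internal": {"type": "boolean"},
                }
            }
        },
    }


def build_reindex_body(source_pattern, dest_index, drop_fields):
    """Body for POST _reindex.

    The Painless script removes the unwanted fields from each
    document before it is written to the destination index.
    """
    script = "for (def f : params.drop) { ctx._source.remove(f); }"
    return {
        "source": {"index": source_pattern},
        "dest": {"index": dest_index},
        "script": {
            "lang": "painless",
            "source": script,
            "params": {"drop": list(drop_fields)},
        },
    }


if __name__ == "__main__":
    # Step 1: create the template so the destination index gets the new types.
    print("PUT _index_template/logs-consolidated")
    print(json.dumps(build_index_template("logs-consolidated-*"), indent=2))

    # Step 2: copy many small indices into one, dropping a hypothetical field.
    print("POST _reindex?wait_for_completion=false")
    print(json.dumps(
        build_reindex_body("logs-2020.*", "logs-consolidated-000001", ["debug_blob"]),
        indent=2,
    ))
```

The printed bodies can be pasted into Kibana Dev Tools or sent with any HTTP client. Running the reindex with `wait_for_completion=false` returns a task ID instead of blocking, which is probably what you want for indices of this size. It would be worth trying this on a single small index first to confirm that the existing string values are accepted by the `ip` and `boolean` mappings.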
