look into my clusters and pick out those daily indices that are smaller than 5GB
then reindex them to big monthly indices
then delete the daily indices.
My question is: how do I source my indices? I have indices like car-sale-{{daily}} and boat-sale-{{daily}}, and I want to end up with car-sale-January, car-sale-February, boat-sale-January, boat-sale-February and so on.
##### This is my curator script #####
description: "this action will reindex the small daily indices into a monthly index"
action: reindex
options:
  wait_interval: 10
  max_wait: -1
  request_body:
    source:
      index: REINDEX_SELECTION
    dest:
      index: dest_index  # ----> I know this is one of the parts that needs work but not sure how
filters:
- filtertype: period
  period_type: relative
  source: name
  range_from: -1
  range_to: 0
  timestring: '%Y.%m.%d'
  unit: months
- filtertype: space
  disk_space: 5
  threshold_behavior: less_than
This is a difficult edge case because you are only selecting indices smaller than 5GB within the last month. Why not all indices in the last month? I don't understand the use case.
You already have the relative period filter there. Why not use the absolute period filter and just put all of the indices from November into a November index?
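Something along these lines might work as a starting point. It is an untested sketch, not a drop-in config: it assumes one action per index prefix and per month, uses a pattern filter to pick only the car-sale- indices, and drops the space filter so that every daily index from the month is included. The year (2017) and the destination name car-sale-2017.11 are placeholders I made up for illustration. Note that Elasticsearch index names must be lowercase, so a name like car-sale-November would be rejected.

```yaml
actions:
  1:
    description: "Reindex the November car-sale daily indices into one monthly index"
    action: reindex
    options:
      wait_interval: 10
      max_wait: -1
      request_body:
        source:
          # Curator replaces REINDEX_SELECTION with the indices that pass the filters
          index: REINDEX_SELECTION
        dest:
          # Placeholder destination name; must be lowercase
          index: car-sale-2017.11
    filters:
    # Keep only the car-sale daily indices
    - filtertype: pattern
      kind: prefix
      value: car-sale-
    # Keep only indices whose name falls in the chosen month (placeholder year)
    - filtertype: period
      period_type: absolute
      source: name
      timestring: '%Y.%m.%d'
      unit: months
      date_from: '2017-11'
      date_from_format: '%Y-%m'
      date_to: '2017-11'
      date_to_format: '%Y-%m'
```

You would repeat the action with a different prefix and destination for boat-sale-, and once the reindex has been verified, a separate delete_indices action with the same filters could remove the daily indices.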
Why not create monthly indices from the beginning, or maybe try to get larger indices by using the rollover index API based on a combination of age and size?
Thank you guys for the feedback.
To answer Aaron's questions:
We are trying to optimize our cluster. We have a ton of small daily indices (<5GB), while the majority of our daily indices are ~50GB. So I am trying to bundle the small daily indices into monthly indices to reduce the number of shards. And yes, I think absolute will work better in this case.
To answer Christian: we have a pipeline that creates multiple indices based on the source of the logs, but it is probably a good idea. We are trying to reduce the number of shards and get every shard to the target size of 50GB, so I don't know if the rollover API can do that.