Hi Stephen,
I'm returning to setting up indices. We prefer to define our configuration explicitly rather than rely on defaults.
My goal is to follow the recommended best practice for managing different datasets with different retention windows, using indices, ILM, templates, aliases, and data streams.
Toward that end, I'm experimenting with defining index lifecycles and templates in Terraform like so:
resource "elasticstack_elasticsearch_index_lifecycle" "hot_warm_delete_10_30" {
  name = "hot_warm_delete_10_30"

  hot {
    min_age = "1h"
    rollover {
      max_age                = "1d"
      max_primary_shard_size = "30gb"
    }
  }

  warm {
    min_age = "10d"
    readonly {
      enabled = true
    }
    forcemerge {
      max_num_segments = 1
    }
    shrink {
      number_of_shards = 1
    }
  }

  delete {
    min_age = "30d"
    delete {
      delete_searchable_snapshot = true
    }
  }
}
resource "elasticstack_elasticsearch_index_template" "logs_app" {
  name           = "logs_app"
  priority       = 2022
  index_patterns = ["logs-app-*"]

  template {
    settings = jsonencode({
      "lifecycle.name" = elasticstack_elasticsearch_index_lifecycle.hot_warm_delete_10_30.name
    })
  }
}
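If we go the data stream route, my understanding (from the provider docs, so treat this as an untested sketch) is that the template can declare a data_stream block, so that writes matching the pattern create data streams rather than plain indices:

```hcl
# Untested sketch: same template shape, but declaring a data stream.
# The empty data_stream block tells Elasticsearch that indices matching
# index_patterns should be backed by a data stream.
resource "elasticstack_elasticsearch_index_template" "logs_app_ds" {
  name           = "logs_app_ds"
  priority       = 2023
  index_patterns = ["logs-app-*"]

  data_stream {}

  template {
    settings = jsonencode({
      "lifecycle.name" = elasticstack_elasticsearch_index_lifecycle.hot_warm_delete_10_30.name
    })
  }
}
```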
with the idea that I will be able to configure additional lifecycles, like hot_delete_10 and hot_warm_cold_delete_10_30_90, to manage migration and retention for different data sources.
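As a concrete example of one of those additional lifecycles, a hot_delete_10 policy might look like this (a sketch following the same pattern as above; the phase timings are placeholders):

```hcl
# Sketch: hot-only policy that rolls over daily and deletes after 10 days.
resource "elasticstack_elasticsearch_index_lifecycle" "hot_delete_10" {
  name = "hot_delete_10"

  hot {
    min_age = "1h"
    rollover {
      max_age                = "1d"
      max_primary_shard_size = "30gb"
    }
  }

  delete {
    min_age = "10d"
    delete {}
  }
}
```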
However, I'm uncertain how to connect the data I'm writing from filebeat to these template configurations.
From the filebeat logs, it looks like that to set an index such as
index: "logs-%{[kubernetes.container.name]}-%{[kubernetes.labels.app_kubernetes_io/name]}-%{+yyyy.MM.dd}"
(where the kubernetes fields are populated by processors.add_kubernetes_metadata),
I would also have to configure setup.template.name and setup.template.pattern,
which are defined not per input under filebeat.inputs but globally for the filebeat configuration.
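Concretely, what I'm describing would look something like this (the index pattern comes from my current attempt; the setup.template.* values are my guesses at what would have to match it):

```yaml
# Current attempt: dynamic index name built from kubernetes metadata.
output.elasticsearch:
  hosts: ["https://elasticsearch:9200"]
  index: "logs-%{[kubernetes.container.name]}-%{[kubernetes.labels.app_kubernetes_io/name]}-%{+yyyy.MM.dd}"

# Global, not per-input -- this is the coupling I'd like to avoid:
setup.template.name: "logs"       # guess
setup.template.pattern: "logs-*"  # guess

processors:
  - add_kubernetes_metadata: ~
```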
I am hoping to avoid defining data streams explicitly in Terraform: I would like to infer them from kubernetes fields, so that new services can start logging without modifying the elasticstack or filebeat configuration.
Would you be able to describe the right way to do this configuration? It's a little unclear to me how to completely wire filebeat so that data is routed to different data streams conditionally on its source/fields.
Thank you!
Austin