## Environment Details
- Elasticsearch (self-managed cluster on AWS)
- 4 Data Nodes (1 TB each)
- 2 Master Nodes
- Daily ingestion of around 300 GB of primary data (approximately 600 GB with 1 replica)
- Data source is an application that has built-in Elasticsearch integration
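
As a rough back-of-the-envelope check on these numbers, the sketch below estimates how many days of ingestion the raw disk capacity could hold. It assumes 1 TB is 1000 GB and ignores disk watermarks, index overhead, and any deletion of old data; the function name is only illustrative.

```python
# Rough capacity estimate from the environment details above.
# Assumptions (not measured on the cluster): 1 TB = 1000 GB, no disk
# watermarks, no overhead beyond the stated daily volume, nothing deleted.

def days_of_raw_capacity(data_nodes: int, disk_per_node_gb: float,
                         daily_primary_gb: float, replicas: int) -> float:
    """Return how many days of ingestion fit into the raw disk capacity."""
    total_gb = data_nodes * disk_per_node_gb
    # Each replica stores one more full copy of the primary data.
    daily_footprint_gb = daily_primary_gb * (1 + replicas)
    return total_gb / daily_footprint_gb


days = days_of_raw_capacity(data_nodes=4, disk_per_node_gb=1000,
                            daily_primary_gb=300, replicas=1)
print(f"Raw capacity covers about {days:.1f} days of ingestion")
```

With the stated figures this comes to a little under a week of raw capacity, so how old data gets removed matters as much as how it is organized.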

## Current Architecture
Application (with built-in Elasticsearch integration)
↓
Elasticsearch Cluster
↓
Index: greenzone-aws_transactionalevents
- The application pushes data directly into Elasticsearch
- Elasticsearch connection is configured from the application UI
- Index name is defined in the application configuration
- The index name remains static

## Problem Statement
The issue I am facing is as follows:
- The application allows configuring only a static index name, such as greenzone-aws_transactionalevents
- It does not support dynamic index naming, such as date-based patterns
- It does not automatically create or rotate indices based on date

## Impact of Current Setup
Due to this limitation, the following challenges are observed:
- Continuous index growth
  - Around 300 GB of data is ingested daily
  - The same index keeps growing without any rotation
- No date-wise data separation
  - All data is stored in a single index
  - It is difficult to isolate data for a specific day
- Snapshot limitations
  - Snapshots work at index level
  - It is not possible to take snapshots for specific date ranges within the same index
- Restore limitations
  - It is not possible to restore data for a specific date
  - Full index restore is required even for partial data
- Operational challenges
  - Storage management becomes difficult
  - Large index size can impact performance over time
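
To make the snapshot and restore limitation concrete, here is a minimal sketch that only builds the request paths and bodies for the snapshot create and restore APIs, without sending anything. The repository name `my_backup_repo` and the snapshot name are hypothetical. The point it illustrates is that the request selects data by index name and has no date-range parameter.

```python
import json

INDEX = "greenzone-aws_transactionalevents"
REPO = "my_backup_repo"        # hypothetical repository name
SNAPSHOT = "snapshot-example"  # hypothetical snapshot name


def snapshot_request(repo: str, snapshot: str, indices: list[str]):
    """Build method, path, and body for a create-snapshot call."""
    path = f"/_snapshot/{repo}/{snapshot}"
    body = {"indices": ",".join(indices), "include_global_state": False}
    return "PUT", path, body


def restore_request(repo: str, snapshot: str, indices: list[str]):
    """Build method, path, and body for a restore call."""
    path = f"/_snapshot/{repo}/{snapshot}/_restore"
    body = {"indices": ",".join(indices)}
    return "POST", path, body


# With a single static index, the whole index is the smallest unit
# that can be named in either request.
for method, path, body in (snapshot_request(REPO, SNAPSHOT, [INDEX]),
                           restore_request(REPO, SNAPSHOT, [INDEX])):
    print(method, path)
    print(json.dumps(body, indent=2))
```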

## Requirement
I want to achieve the following:
- Date-wise index creation, for example:
  - greenzone-aws_transactionalevents-2026.04.22
  - greenzone-aws_transactionalevents-2026.04.23
- Better data organization
  - Logical separation of data per day
- Granular backup and restore
  - Ability to take snapshots and restore specific date-based data
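
The naming scheme in the requirement can be sketched as follows. The helper names are illustrative only; the sketch simply shows how the daily index names for a date range would be derived from the static base name, which is what would let a snapshot or restore target individual days.

```python
from datetime import date, timedelta

BASE_INDEX = "greenzone-aws_transactionalevents"


def daily_index_name(base: str, day: date) -> str:
    """Return the index name for one day, e.g. <base>-2026.04.22."""
    return f"{base}-{day:%Y.%m.%d}"


def daily_index_names(base: str, first_day: date, last_day: date) -> list[str]:
    """Return the index names for every day from first_day to last_day inclusive."""
    day_count = (last_day - first_day).days + 1
    return [daily_index_name(base, first_day + timedelta(days=i))
            for i in range(day_count)]


names = daily_index_names(BASE_INDEX, date(2026, 4, 22), date(2026, 4, 23))
for name in names:
    print(name)
```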

## Questions
- Does an application with built-in Elasticsearch integration typically support:
  - Dynamic index naming?
  - Date-based index creation or rotation?
- Is there any configuration in Elasticsearch that can automatically split or manage data by date when a static index name is used?
- Can index templates, ingest pipelines, or any other Elasticsearch feature override or influence index naming when data is pushed from an external application?
- What are the best practices for handling high-volume ingestion (around 300 GB per day) when the source system enforces a static index name?
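
To make the ingest pipeline question concrete, the sketch below builds the kind of configuration being asked about: a pipeline using the `date_index_name` processor, attached to the static index through the `index.default_pipeline` setting. This is untested against this application. The pipeline ID is hypothetical, and the sketch assumes each event carries an ISO 8601 `@timestamp` field, which has not been confirmed for this data. Whether it behaves as hoped may also depend on the Elasticsearch version and on how the application writes documents.

```python
import json

STATIC_INDEX = "greenzone-aws_transactionalevents"
PIPELINE_ID = "route-to-daily-index"  # hypothetical pipeline ID
TIMESTAMP_FIELD = "@timestamp"        # assumption: field name not confirmed


def daily_routing_pipeline(prefix: str, timestamp_field: str) -> dict:
    """Build a pipeline body that rewrites the target index per day."""
    return {
        "description": "Route each document to a daily index by its timestamp",
        "processors": [
            {
                "date_index_name": {
                    "field": timestamp_field,
                    "index_name_prefix": f"{prefix}-",
                    "date_rounding": "d",               # round to the day
                    "date_formats": ["ISO8601"],        # assumed input format
                    "index_name_format": "yyyy.MM.dd",  # e.g. 2026.04.22
                }
            }
        ],
    }


def default_pipeline_setting(pipeline_id: str) -> dict:
    """Build the settings body that makes the pipeline the index default."""
    return {"index.default_pipeline": pipeline_id}


pipeline = daily_routing_pipeline(STATIC_INDEX, TIMESTAMP_FIELD)
print(f"PUT /_ingest/pipeline/{PIPELINE_ID}")
print(json.dumps(pipeline, indent=2))
print(f"PUT /{STATIC_INDEX}/_settings")
print(json.dumps(default_pipeline_setting(PIPELINE_ID), indent=2))
```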

## Summary
- The application pushes data to Elasticsearch using a static index name
- Dynamic or date-based index naming is not supported at the application level
- I need guidance on achieving a day-wise index structure for better data management, backup, and restore
Looking for suggestions and best practices to handle this scenario effectively.