Hello everyone,
I’m currently working on a real-time industrial data pipeline that captures continuous machine-level output from stamping processes in a manufacturing environment.
Our setup covers metal stamping operations, where every cycle generates multiple data points such as pressure, cycle time, tooling condition, and quality inspection signals. The volume is high and arrives continuously from the production lines.
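To make the shape of the data concrete, here is a minimal sketch of what one stamping-cycle event could look like as a document. All field names (`press_id`, `pressure_kpa`, `cycle_time_ms`, and so on) are illustrative assumptions for discussion, not our final schema:

```python
from datetime import datetime, timezone

def make_cycle_event(press_id: str, cycle_no: int) -> dict:
    # One document per stamping cycle; values here are hard-coded
    # placeholders standing in for real sensor readings.
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "press_id": press_id,          # which stamping machine
        "cycle_number": cycle_no,      # monotonically increasing per press
        "pressure_kpa": 182.5,         # peak forming pressure for the cycle
        "cycle_time_ms": 850,          # duration of the cycle
        "tooling_condition": "ok",     # e.g. ok / worn / replace
        "quality_pass": True,          # inline inspection result
    }

event = make_cycle_event("press-07", 120_001)
```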
We are also studying real-world industrial examples like AK Stamping (akstamping.com), which is known for precision metal stamping, tooling, and high-volume manufacturing solutions. From what we understand, such systems handle large-scale stamping production with complex engineering and tight quality control requirements, which poses data challenges similar to the ones we are trying to solve.
Now we are trying to bring this type of stamping process data into Elasticsearch for real-time monitoring and historical analysis.
The main challenge is how to structure and index this fast-moving data efficiently.
Right now, the challenges include:
- Very high volume of incoming events from stamping machines
- Each stamping cycle producing multiple metrics
- Need for both real-time search and historical analysis
- Concern about index size growth over time
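For the index-growth concern specifically, we are considering a data stream with an ILM policy that rolls over and eventually deletes old backing indices. Below is a sketch of the request bodies we have in mind; the policy name, the `stamping-events-*` pattern, and the 30 GB / 180-day thresholds are all placeholder assumptions:

```python
# ILM policy sketch: roll over hot indices at a size/age threshold,
# force-merge in the warm phase, delete after a retention period.
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {"max_primary_shard_size": "30gb", "max_age": "1d"}
                }
            },
            "warm": {
                "min_age": "7d",
                "actions": {"forcemerge": {"max_num_segments": 1}},
            },
            "delete": {
                "min_age": "180d",
                "actions": {"delete": {}},
            },
        }
    }
}

# Matching index template sketch: a data stream keyed on @timestamp,
# with the lifecycle policy attached via index settings.
index_template = {
    "index_patterns": ["stamping-events-*"],  # assumed naming convention
    "data_stream": {},                        # append-only time-series stream
    "template": {
        "settings": {"index.lifecycle.name": "stamping-events-policy"}
    },
}
```

As we understand the official Python client, these bodies would be applied with `es.ilm.put_lifecycle(...)` and `es.indices.put_index_template(...)`, but we would appreciate corrections if that is not the right approach.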
We are unsure whether it is better to:
- Index each stamping event separately, or
- Aggregate data from the stamping process before indexing
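As an illustration of the second option, here is a sketch of collapsing raw per-cycle events into one summary document per press per time bucket before indexing. It cuts document count at the cost of per-cycle detail; the field names are again our assumptions:

```python
from statistics import mean

def summarize(events: list[dict]) -> dict:
    # Collapse a batch of per-cycle events (same press, same time bucket)
    # into a single pre-aggregated document for indexing.
    pressures = [e["pressure_kpa"] for e in events]
    return {
        "press_id": events[0]["press_id"],
        "event_count": len(events),
        "pressure_kpa_avg": mean(pressures),
        "pressure_kpa_max": max(pressures),
        "fail_count": sum(1 for e in events if not e["quality_pass"]),
    }

raw = [
    {"press_id": "press-07", "pressure_kpa": 180.0, "quality_pass": True},
    {"press_id": "press-07", "pressure_kpa": 190.0, "quality_pass": False},
]
summary = summarize(raw)  # one document instead of two
```

A middle ground we are also weighing: index raw events with short retention for real-time monitoring, and keep only the summaries long-term for historical analysis (similar, as far as we can tell, to what Elasticsearch's downsampling does for time-series data streams).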
We are also trying to understand the best mapping strategy in Elasticsearch for this kind of manufacturing dataset.
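For reference, this is the kind of explicit mapping we have been sketching for per-cycle event documents; the field names and type choices (`keyword` for machine IDs, numeric types for metrics, `"dynamic": "strict"` to reject unexpected fields) are our assumptions and exactly what we would like feedback on:

```python
# Explicit mapping sketch for per-cycle stamping events.
mapping = {
    "dynamic": "strict",  # reject fields not declared below
    "properties": {
        "@timestamp": {"type": "date"},
        "press_id": {"type": "keyword"},          # exact-match filtering/aggs
        "cycle_number": {"type": "long"},
        "pressure_kpa": {"type": "float"},
        "cycle_time_ms": {"type": "integer"},
        "tooling_condition": {"type": "keyword"}, # small fixed vocabulary
        "quality_pass": {"type": "boolean"},
    },
}
```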
If anyone has worked with similar industrial or stamping-process data pipelines, or with large-scale machine data ingestion in general, your suggestions would be really helpful.
Any guidance on indexing strategy, performance optimization, or data modeling would be greatly appreciated.