I have a database with several configurations, and several configurations can be linked to the same serial_number. To be able to analyze the data correctly, I need to filter so that each serial_number is counted as one configuration.
So can I use an ingest pipeline to set up a condition that says "if there is already a document here with the same serial_number, only ingest the newest one of each"?
Is something like this possible with the ingest processors and the Painless language?
The configurations get stored in the database when a user presses save in the program, and if the user presses save multiple times, the same configuration gets stored again. This can happen if a user is inexperienced or forgets a setting before saving. So it's an incorrect design of the system, I guess. They admit this, but to get accurate statistics from the database they want to filter the redundant data out.
I'm not sure I understand the concept of state changes yet. There are several configurations with the same serial_number, so how would their state change? And would you still do this with ingest pipelines? Which processor?
I guess it is not possible to both keep all documents and also use just one of each serial number to analyze statistics?
Basically you want to use Elasticsearch as a time series datastore,
where each event comes in with a timestamp, a serial number, and whatever else is logged. Then you can graph changes over time, either at an individual serial level or on an aggregated level.
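As a rough sketch of what "changes over time" could look like as a search request, the Python function below builds a `date_histogram` aggregation body. The `@timestamp` field name and the `changes_over_time_query` helper are assumptions for illustration, not something from this thread, and `calendar_interval` is the parameter name in recent Elasticsearch versions (older ones use `interval`).

```python
def changes_over_time_query(serial_number=None, interval="1d"):
    """Build a search body that buckets events by time.

    Assumptions: events carry an `@timestamp` date field and a
    `serial_number` field that supports exact `term` matching.
    """
    if serial_number is None:
        # Aggregated level: look at events from every serial number.
        query = {"match_all": {}}
    else:
        # Individual level: restrict to a single serial number.
        query = {"term": {"serial_number": serial_number}}

    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": query,
        "aggs": {
            "events_over_time": {
                "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": interval,
                }
            }
        },
    }
```

The returned dictionary would be sent as the body of a `_search` request against the index that holds the events.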
If you want to retrieve only the latest event, which contains whatever state is logged, then you can do that easily.
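One way to do this at query time, while still keeping every saved document in the index, is field collapsing: sort by timestamp descending and collapse on the serial number, so only the newest hit per serial comes back. The sketch below assumes an `@timestamp` date field and that `serial_number` is mapped as a `keyword` or numeric field (collapsing needs that); both helper functions are hypothetical. The second function does the same thing in plain Python, just to show the intended result.

```python
def latest_per_serial_query(size=100):
    """Search body returning only the newest document per serial_number.

    Assumptions: `@timestamp` is a date field and `serial_number` is a
    single-valued keyword or numeric field.
    """
    return {
        "size": size,
        "query": {"match_all": {}},
        # Keep one hit per distinct serial_number ...
        "collapse": {"field": "serial_number"},
        # ... and make that hit the most recent one.
        "sort": [{"@timestamp": {"order": "desc"}}],
    }


def keep_latest(docs):
    """In-memory equivalent: keep the newest doc for each serial_number.

    Timestamps are compared as strings, which works for ISO 8601 values
    written in the same format and time zone.
    """
    latest = {}
    for doc in docs:
        serial = doc["serial_number"]
        if serial not in latest or doc["@timestamp"] > latest[serial]["@timestamp"]:
            latest[serial] = doc
    return list(latest.values())


# Made-up sample data: the user pressed save twice for SN-1.
sample = [
    {"serial_number": "SN-1", "@timestamp": "2020-01-01T10:00:00Z", "setting": "a"},
    {"serial_number": "SN-1", "@timestamp": "2020-01-01T10:05:00Z", "setting": "b"},
    {"serial_number": "SN-2", "@timestamp": "2020-01-01T09:00:00Z", "setting": "c"},
]
deduplicated = keep_latest(sample)
```

With this approach nothing has to be dropped at ingest time, so the full history stays available for the time-based views as well.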