We have installed the ELK stack in our environment and use it for monitoring and log viewing. We are now looking for new use cases for it. We run ELK version 7.8, and the Kibana UI exposes a lot of new functionality, such as the Machine Learning features. We explored Kibana's Data Visualizer, where we can simply upload CSV files and explore the data. Based on that exploration we came up with a use case for our platform, because it seems to match our process of generating CSV files that then need to be validated. Our concept looks like this:
- our platform generates a CSV file in daily batches ->
- we load those files into an Elasticsearch index ->
- we calculate/aggregate a summary of the daily batch, e.g. document count, null values for particular fields, some mathematical operations ->
- based on that summary we create validation rules that check whether our data is valid ->
- if the data isn't valid, we notify an external system
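To make the summary in step 3 concrete, this is roughly what we want to compute per daily batch. A minimal local sketch in Python (field names like `amount` are just placeholders for illustration):

```python
import csv
import io

def summarize(csv_text, numeric_field):
    """Daily-batch summary: document count, null count per field,
    and a simple numeric aggregate (here a sum)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {"doc_count": len(rows), "null_counts": {}, "sum": 0.0}
    if rows:
        for field in rows[0]:
            # Count empty/missing values for each column
            summary["null_counts"][field] = sum(
                1 for r in rows if r[field] in ("", None)
            )
    for r in rows:
        if r[numeric_field] not in ("", None):
            summary["sum"] += float(r[numeric_field])
    return summary

# Hypothetical daily batch with one missing "amount" value
batch = "id,amount\n1,10.5\n2,\n3,4.5\n"
print(summarize(batch, "amount"))
# → {'doc_count': 3, 'null_counts': {'id': 0, 'amount': 1}, 'sum': 15.0}
```

The question below is essentially how to express this kind of calculation inside Elasticsearch instead of in an external script.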
At the moment we are a bit stuck on point 3: we don't know the proper way to generate a summary of the documents in an Elasticsearch index. We investigated the Transform feature in Kibana, which lets us aggregate/calculate data from one index and save the results in a destination index, but it seems to support only basic functions like sum, max, and count, while we need more complex operations such as conditional logic. So here is my first question: how can we solve such a case in ELK?
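One route I am considering, in case someone can confirm it: as far as I can tell, a transform pivot also accepts a `scripted_metric` aggregation with Painless scripts, which would allow conditional logic beyond sum/max/count. A sketch of such a transform body, expressed as a Python dict (the index names, the `@timestamp`/`id`/`amount` fields, and the scripts are my assumptions, not a tested configuration):

```python
import json

# Hypothetical transform body: group daily batches by day and count,
# via a scripted_metric, the documents where "amount" is missing.
transform_body = {
    "source": {"index": "daily-batch-*"},
    "dest": {"index": "daily-summary"},
    "pivot": {
        "group_by": {
            "day": {
                "date_histogram": {
                    "field": "@timestamp",
                    "calendar_interval": "1d",
                }
            }
        },
        "aggregations": {
            "doc_count": {"value_count": {"field": "id"}},
            "missing_amount": {
                "scripted_metric": {
                    "init_script": "state.nulls = 0",
                    # Conditional logic: count docs lacking the field
                    "map_script": (
                        "if (doc['amount'].size() == 0) { state.nulls += 1 }"
                    ),
                    "combine_script": "return state.nulls",
                    "reduce_script": (
                        "long total = 0; "
                        "for (s in states) { total += s } "
                        "return total"
                    ),
                }
            },
        },
    },
}

print(json.dumps(transform_body, indent=2))
```

If `scripted_metric` is indeed supported by transforms in 7.8, this would let us keep the whole summary step inside Elasticsearch; I would appreciate confirmation or a better alternative.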
For point 4, the validation rules, we explored Watcher in Kibana, and we think we can use it to set up those rules: based on the calculated summary from point 3, we can set thresholds which, when exceeded, trigger a specific alerting action. We are still not sure this is the correct approach, so I would like to hear from someone whether it makes sense.
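For reference, this is roughly the watch I have in mind: read the latest summary document once a day, compare a field against a threshold, and call the external system via a webhook action. Everything here (schedule, index, field names, threshold, host/path) is a placeholder sketch, not a working watch:

```python
import json

# Hypothetical watch body for the Watcher PUT watch API.
watch_body = {
    "trigger": {"schedule": {"daily": {"at": "06:00"}}},
    "input": {
        "search": {
            "request": {
                "indices": ["daily-summary"],
                "body": {
                    # Fetch only the most recent daily summary
                    "size": 1,
                    "sort": [{"day": {"order": "desc"}}],
                },
            }
        }
    },
    "condition": {
        "compare": {
            # Fire when the latest summary reports too many null values
            "ctx.payload.hits.hits.0._source.missing_amount": {"gt": 100}
        }
    },
    "actions": {
        "notify_external_system": {
            "webhook": {
                "scheme": "https",
                "host": "alerts.example.com",
                "port": 443,
                "path": "/notify",
                "method": "post",
                "body": "Validation failed for the daily batch",
            }
        }
    },
}

print(json.dumps(watch_body, indent=2))
```

Does a threshold-on-summary-index watch like this look like a reasonable way to wire point 4 to the external system, or is there a more idiomatic approach (e.g. the newer Kibana alerting framework)?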