Hi,
I'm currently defining a new model for our indexing, which is used for calculating trends of metrics. Right now I'm indexing all documents in a single index, using a date prefix (e.g. 2019-01-01) combined with some other field values as the document ID, to avoid duplicated data: each document contains pretty much the same data, and what changes is only the metric value and the date field. This works pretty well for calculating metric trends using some basic aggregations over a date histogram.
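To make the deduplication idea concrete, here's a minimal sketch of how such a deterministic document ID could be built. The function and field names (`make_doc_id`, `open_pull_requests`, `repo-a`) are my own assumptions, not your actual schema:

```python
from datetime import date

def make_doc_id(metric_name: str, day: date, *extra_fields: str) -> str:
    """Build a deterministic document ID so that re-indexing the same
    metric for the same day overwrites the document instead of
    creating a duplicate."""
    return "_".join([day.isoformat(), metric_name, *extra_fields])

# Hypothetical usage:
doc_id = make_doc_id("open_pull_requests", date(2019, 1, 1), "repo-a")
# -> "2019-01-01_open_pull_requests_repo-a"
```

Indexing with an explicit `_id` like this means a second write of the same metric/day combination is an update, not a new document.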
Let's say:
Open Pull Requests: 2019-01-01 = 10,
Code Coverage: 2019-01-01 = 90%
Total Open Pull Requests = 10
Total Code Coverage = 90
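For reference, the "basic aggregations in a histogram" I mean look roughly like this (field names `metric_name`, `metric_value`, and `date` are placeholders for our actual mapping, and depending on the Elasticsearch version the parameter may be `interval` or `calendar_interval`):

```json
{
  "size": 0,
  "query": { "term": { "metric_name": "open_pull_requests" } },
  "aggs": {
    "per_day": {
      "date_histogram": { "field": "date", "interval": "day" },
      "aggs": {
        "value": { "sum": { "field": "metric_value" } }
      }
    }
  }
}
```

With one document per metric per day, the per-bucket sum is just that day's value, so the trend comes out right.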
Now we have another use case: we need the most up-to-date metrics in Elasticsearch, to provide a kind of real-time reporting that lets some deployment processes act on the calculated metrics. Since we will need to store many copies of the same metric for the same day, changing only the metric value, our aggregations will pick up the duplicate documents for each metric name per day and calculate a wrong value.
Let's say:
Open Pull Requests: 2019-01-01T10:00:00 = 10,
Code Coverage: 2019-01-01T10:00:00 = 90%
Open Pull Requests: 2019-01-01T12:00:00 = 12,
Code Coverage: 2019-01-01T12:00:00 = 80%
Total Open Pull Requests = 22 (should be 12)
Total Code Coverage = 170 (should be 80)
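The example above can be simulated in a few lines to show the gap between what the per-day sum produces and what the reporting use case actually needs (the documents and field names here are hypothetical, mirroring the numbers above):

```python
from collections import defaultdict

# Hypothetical documents: the same metric indexed twice on the
# same day with different timestamps and values.
docs = [
    {"metric": "open_pull_requests", "ts": "2019-01-01T10:00:00", "value": 10},
    {"metric": "code_coverage",      "ts": "2019-01-01T10:00:00", "value": 90},
    {"metric": "open_pull_requests", "ts": "2019-01-01T12:00:00", "value": 12},
    {"metric": "code_coverage",      "ts": "2019-01-01T12:00:00", "value": 80},
]

# What a plain per-day sum aggregation does: it adds up every
# document, duplicates included, which is wrong here.
summed = defaultdict(int)
for d in docs:
    summed[d["metric"]] += d["value"]

# What the reporting use case needs: only the latest value per
# metric per day (ISO-8601 timestamps compare correctly as strings).
latest = {}
for d in docs:
    m = d["metric"]
    if m not in latest or d["ts"] > latest[m]["ts"]:
        latest[m] = d

print(summed["open_pull_requests"], summed["code_coverage"])  # 22 170
print(latest["open_pull_requests"]["value"],
      latest["code_coverage"]["value"])  # 12 80
```

So the question boils down to getting the "latest per metric per day" behavior on the Elasticsearch side without breaking the existing trend aggregations.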
Does anyone have a good suggestion on what approach I should take?