For several years I've logged data from some devices to CSV files, and used Perl or grep to analyse them in various ways.
I'm trying to do the same thing in Elasticsearch, primarily to benefit from far faster retrieval than grepping tens of gigabytes of data.
I'm polling and recording monotonically increasing counters every 10 seconds. The data looks like this:
|time|service|hostname|received|recovered|lost|
|20:00:00|Bob|host130|45|3|3|
|20:00:10|Bob|host130|167|6|8|
|20:00:20|Bob|host130|289|8|12|
|20:00:30|Bob|host130|412|8|12|
|20:00:40|Bob|host130|516|8|12|
|20:00:50|Bob|host130|678|8|12|
|20:01:00|Bob|host130|711|12|16|
|20:01:10|Bob|host130|734|12|20|
|20:01:20|Bob|host130|789|12|20|
Sometimes the counters reset to zero and the "service" changes, as on host131 here:
|20:00:00|Joan|host131|51|3|3|
|20:00:10|Joan|host131|51|6|8|
|20:00:20|Joan|host131|235|8|12|
|20:00:30|Joan|host131|371|8|12|
|20:00:40|Joan|host131|414|8|12|
|20:00:50|Dave|host131|0|0|0|
|20:01:00|Dave|host131|150|1|6|
|20:01:10|Dave|host131|301|1|6|
|20:01:20|Dave|host131|460|4|13|
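I can graph the delta using a serial diff, but what I'd really like is a data table listing only the rows where a value changes, that is, where recovered or lost differs from the previous poll for that host, or the service changes (received changes on almost every poll, so it doesn't count). For the data above, it would list something like: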
|20:00:00|Bob|host130|45|3|3|
|20:00:00|Joan|host131|51|3|3|
|20:00:10|Bob|host130|167|6|8|
|20:00:10|Joan|host131|51|6|8|
|20:00:20|Bob|host130|289|8|12|
|20:00:20|Joan|host131|235|8|12|
|20:00:50|Dave|host131|0|0|0|
|20:01:00|Bob|host130|711|12|16|
|20:01:00|Dave|host131|150|1|6|
|20:01:10|Bob|host130|734|12|20|
|20:01:20|Dave|host131|460|4|13|
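
To pin down the rule that table implies, here is a minimal Python sketch run over the sample rows outside Elasticsearch. It is an illustration of the desired output, not an Elasticsearch query. It assumes rows are grouped by hostname, that a row counts as "changed" when service, recovered, or lost differs from that host's previous row, and that received is ignored; the names `ROWS` and `changed_rows` are made up for this example.

```python
# Sample rows from the question: (time, service, hostname, received, recovered, lost)
ROWS = [
    ("20:00:00", "Bob", "host130", 45, 3, 3),
    ("20:00:10", "Bob", "host130", 167, 6, 8),
    ("20:00:20", "Bob", "host130", 289, 8, 12),
    ("20:00:30", "Bob", "host130", 412, 8, 12),
    ("20:00:40", "Bob", "host130", 516, 8, 12),
    ("20:00:50", "Bob", "host130", 678, 8, 12),
    ("20:01:00", "Bob", "host130", 711, 12, 16),
    ("20:01:10", "Bob", "host130", 734, 12, 20),
    ("20:01:20", "Bob", "host130", 789, 12, 20),
    ("20:00:00", "Joan", "host131", 51, 3, 3),
    ("20:00:10", "Joan", "host131", 51, 6, 8),
    ("20:00:20", "Joan", "host131", 235, 8, 12),
    ("20:00:30", "Joan", "host131", 371, 8, 12),
    ("20:00:40", "Joan", "host131", 414, 8, 12),
    ("20:00:50", "Dave", "host131", 0, 0, 0),
    ("20:01:00", "Dave", "host131", 150, 1, 6),
    ("20:01:10", "Dave", "host131", 301, 1, 6),
    ("20:01:20", "Dave", "host131", 460, 4, 13),
]


def changed_rows(rows):
    """Yield rows whose (service, recovered, lost) differ from the
    previous row seen for the same hostname. Assumes rows arrive in
    time order; the first row for each host is always emitted."""
    last_seen = {}  # hostname -> (service, recovered, lost)
    for row in rows:
        _time, service, hostname, _received, recovered, lost = row
        fingerprint = (service, recovered, lost)
        if last_seen.get(hostname) != fingerprint:
            yield row
        last_seen[hostname] = fingerprint


# Order by time, then hostname, to match the layout of the table above.
ordered = sorted(ROWS, key=lambda r: (r[0], r[2]))
result = list(changed_rows(ordered))

for row in result:
    print("|" + "|".join(str(field) for field in row) + "|")
```

The comparison state lives in the reader of the data rather than in the poller, so the poller itself can stay stateless.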
My poller does not retain the last value it stored, so it can't compute the delta itself, and ideally I'd like the poller to stay stateless.