Historical data vs Last


(Robert Chartier ) #1

Hi All,

We are trying to model an IoT scenario and find that we may have two needs for most of our sensor indices and I wanted to throw it out there to see what others think and if they have already solved this issue.

Let's take a simple scenario: a WiFi temperature sensor that reports data 10 times every second. For each sensor we have location data, the temperature reading, the name of the sensor, and the UTC datetime of the event.

Our indexing plan is...

Index 1 (Historical): Use _id = null, so that an ID is auto-generated and every reading is kept as its own document.

Index 2 (Last): Use _id = "the name of the sensor itself", to keep track of the most recent value for the sensor.
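
To make the plan concrete, here is a minimal sketch of the dual write as a `_bulk` request body. The index names `sensors-history` and `sensors-last` and the field names are my own placeholders, not anything fixed. On Elasticsearch versions that still require mapping types, each action line would also need a `_type`.

```python
import json

def bulk_lines(reading, historical_index="sensors-history", last_index="sensors-last"):
    """Build NDJSON lines that write one reading to both indices.

    Index and field names are hypothetical placeholders.
    """
    return [
        # No _id given: an ID is auto-generated, so every reading is kept.
        json.dumps({"index": {"_index": historical_index}}),
        json.dumps(reading),
        # _id = sensor name: each new reading overwrites the previous one.
        json.dumps({"index": {"_index": last_index, "_id": reading["name"]}}),
        json.dumps(reading),
    ]

reading = {
    "name": "wifi-temp-01",
    "temperature": 21.5,
    "location": {"lat": 49.28, "lon": -123.12},
    "timestamp": "2016-01-01T00:00:00Z",
}

# The _bulk API expects newline-delimited JSON ending with a newline.
body = "\n".join(bulk_lines(reading)) + "\n"
print(body)
```

The resulting `body` would be sent to the `_bulk` endpoint with whatever HTTP client is in use.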

I understand that for most situations we can easily query the Historical index sorted by version or DateTime, descending, with a size of 1, and get the most recent value.
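
For reference, the "size of 1" query I have in mind looks roughly like this. It is only a sketch: the field names `name` and `timestamp` are assumptions, and the `term` filter assumes the sensor name is indexed as an exact (not analyzed) value.

```python
def latest_reading_query(sensor_name, time_field="timestamp", name_field="name"):
    """Search body that returns only the newest reading for one sensor.

    Field names are hypothetical; name_field must hold exact values.
    """
    return {
        "size": 1,  # only the most recent document
        "query": {"term": {name_field: sensor_name}},
        "sort": [{time_field: {"order": "desc"}}],  # newest first
    }

query = latest_reading_query("wifi-temp-01")
print(query)
```

This works per sensor from code, but it needs one query (or one aggregation bucket) per sensor, which is what makes it awkward to express as a single Kibana filter.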

There are two "issues" that I see:

  1. We want to use Kibana to show the LAST reading on a map, without any historical data. Since we can't express our custom "Last" filter in Kibana, we will not be able to achieve this.

Is there a way to achieve this, easily?

  2. We think querying by _id on the "Last" index will outperform querying the "Historical" index with a size of 1.

Anyone have any concrete evidence either way?

Questions, comments, concerns all welcome.

Cheers,

-Rob


(Mark Walkom) #2

KB currently can't do that. I think there is an FR for it, though.


(Christian Dahlqvist) #3

Storing the current state, e.g. the most recent entry, in a separate index like you are proposing is a very efficient way to get to it from Kibana and something I have seen used in the past. You are basically trading a bit more work at indexing time for a much more efficient query, which is often a good tradeoff. The only thing that concerns me is the high update rate. It may be worthwhile throttling this somehow, especially considering that data is by default only refreshed and made searchable once per second.
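
One possible way to throttle, sketched under my own assumptions (the class name, the one-second interval, and the caller-supplied millisecond clock are all hypothetical): keep only the newest reading per sensor in memory and write the batch to the "last" index once per interval.

```python
class LastValueBuffer:
    """Keeps the newest reading per sensor and releases a batch per interval.

    Hypothetical helper; times are integer milliseconds supplied by the caller.
    """

    def __init__(self, flush_interval_ms=1000):
        self.flush_interval_ms = flush_interval_ms
        self._pending = {}        # sensor name -> newest reading seen
        self._last_flush = None   # time of the last flush, or first reading

    def add(self, reading, now_ms):
        """Record a reading; return a batch to write if the interval elapsed."""
        self._pending[reading["name"]] = reading  # newer value replaces older
        if self._last_flush is None:
            self._last_flush = now_ms
        if now_ms - self._last_flush >= self.flush_interval_ms:
            return self.flush(now_ms)
        return []

    def flush(self, now_ms):
        """Return all pending readings and reset the buffer."""
        batch = list(self._pending.values())
        self._pending.clear()
        self._last_flush = now_ms
        return batch

# Simulate one sensor reporting every 100 ms for one second (11 readings).
buf = LastValueBuffer(flush_interval_ms=1000)
written = []
for i in range(11):
    written.extend(buf.add({"name": "wifi-temp-01", "temperature": 20 + i}, now_ms=i * 100))

print(written)  # [{'name': 'wifi-temp-01', 'temperature': 30}]
```

Here eleven incoming readings turn into a single write to the "last" index, while the historical index can still receive every reading.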


(Robert Chartier ) #4

@christian

The example given is a bit contrived.

We will actually be looking at potentially thousands to millions of devices with about 50-60 sensors on each, of which about 10-20 update as frequently as about one message every 10 seconds.
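
A rough back-of-the-envelope for those numbers, assuming "thousands" means 1,000 and "millions" means 1,000,000 devices, and that each frequently updating sensor sends exactly one message every 10 seconds:

```python
def messages_per_second(devices, active_sensors_per_device, seconds_between_messages=10):
    """Aggregate message rate across all devices (rough estimate)."""
    return devices * active_sensors_per_device / seconds_between_messages

def last_index_docs(devices, sensors_per_device):
    """Document count of the 'last' index: one document per sensor."""
    return devices * sensors_per_device

low_rate = messages_per_second(1_000, 10)        # 1000.0
high_rate = messages_per_second(1_000_000, 20)   # 2000000.0

low_docs = last_index_docs(1_000, 50)            # 50000
high_docs = last_index_docs(1_000_000, 60)       # 60000000

print(low_rate, high_rate, low_docs, high_docs)
```

So the "last" index stays bounded at one document per sensor, while the write rate ranges from about a thousand to a couple of million messages per second under these assumptions.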

@workolm

By FR, do you mean PR? As in a pull request...?

So our initial plan of manually building a "last" index still stands.

Thanks again!

