Advice on storing measurements

I have some sensors that monitor an engine. I want to store the data so I can see trends and correlations between parameters. The aim of this post is to discuss the best way to store the data for that purpose.

Basically, a small document could be:

  • a monitored_object_id (e.g. the engine); there can be a hundred or so different ids
  • a parameter_name (e.g. vibration_axis_at_point_A); there can be thousands of different names
  • a value (double)
  • a timestamp (to the ms)
  • a list of tags

{ monitored_object_id:'engine_1', parameter_name:'VIB_PHASE_1', value:143.2, timestamp:'2016-04-26T12:21:33Z', tags: ['test', 'PW'] }

{ monitored_object_id:'engine_1', parameter_name:'VIB_AMP_1', value:4.2, timestamp:'2016-04-26T12:21:33Z', tags: ['test', 'PW'] }

I could save these simple documents as they are, up to 20 million docs per month (a few samples per second).
This is what I currently do, but I am not sure it is the right approach. It works well for writing to daily indexes, like Logstash does.

But is it the right approach for searching efficiently? Here are the queries I need:

Parameter over time: easy
I want to see the values over time for a given parameter_name. This is easy, even with time buckets and a stats aggregation. The small documents are perfect for this.
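To make this concrete, here is a rough sketch of the search body I have in mind, built as a plain Python dict. The function name and the default interval are just for illustration. It assumes monitored_object_id and parameter_name are indexed as exact values (not analyzed) and that timestamp is mapped as a date. It uses the "interval" key of date_histogram; newer Elasticsearch versions expect "fixed_interval" or "calendar_interval" instead.

```python
import json


def parameter_over_time_query(object_id, parameter_name, interval="1m"):
    """Build a search body: filter one parameter, bucket by time, compute stats."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    # Assumes both fields are indexed as exact values.
                    {"term": {"monitored_object_id": object_id}},
                    {"term": {"parameter_name": parameter_name}},
                ]
            }
        },
        "aggs": {
            "over_time": {
                # "interval" is the older key; newer versions use
                # "fixed_interval" or "calendar_interval".
                "date_histogram": {"field": "timestamp", "interval": interval},
                # min / max / avg / sum / count of "value" per time bucket
                "aggs": {"value_stats": {"stats": {"field": "value"}}},
            }
        },
    }


body = parameter_over_time_query("engine_1", "VIB_AMP_1")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request against the daily indexes.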

P1 vs P2: difficult
Say I have the parameters VIB_AMP_1 and VIB_PHASE_1, and I know that some of their measurements share the same timestamp. I'd like to plot VIB_AMP_1 vs VIB_PHASE_1. This is where it gets difficult.

What would the query look like? How can I do this in Kibana?
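To show what I mean by "P1 vs P2", here is the pairing done client-side in Python on documents that have already been fetched. This is only an illustration of the result I am after, not an Elasticsearch or Kibana solution; the function name pair_by_timestamp is made up for the example, and it only pairs samples whose timestamps match exactly.

```python
from collections import defaultdict


def pair_by_timestamp(docs, x_name, y_name):
    """Return (x, y) value pairs for samples sharing object id and timestamp."""
    by_key = defaultdict(dict)
    for doc in docs:
        key = (doc["monitored_object_id"], doc["timestamp"])
        by_key[key][doc["parameter_name"]] = doc["value"]

    pairs = []
    for key in sorted(by_key):  # sorted by object id, then timestamp string
        values = by_key[key]
        # Keep only the timestamps where both parameters were measured.
        if x_name in values and y_name in values:
            pairs.append((values[x_name], values[y_name]))
    return pairs


docs = [
    {"monitored_object_id": "engine_1", "parameter_name": "VIB_PHASE_1",
     "value": 143.2, "timestamp": "2016-04-26T12:21:33Z", "tags": ["test", "PW"]},
    {"monitored_object_id": "engine_1", "parameter_name": "VIB_AMP_1",
     "value": 4.2, "timestamp": "2016-04-26T12:21:33Z", "tags": ["test", "PW"]},
]

print(pair_by_timestamp(docs, "VIB_AMP_1", "VIB_PHASE_1"))  # [(4.2, 143.2)]
```

Each pair is one point of the scatter plot. Doing this in the client means pulling every document for both parameters, which is exactly what I would like to avoid.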

In fact, I could save documents like this:
{ monitored_object_id:'engine_1', VIB_AMP_1:4.2, VIB_PHASE_1:143.2, timestamp:'2016-04-26T12:21:33Z', tags: ['test', 'PW'] }

But then the documents vary a lot, because at a given timestamp we may not have the same set of parameters. And remember, we could have thousands of different keys! Also, if two measurements are 1 ms apart, they end up in separate documents and we are back to the same problem as before.
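One idea for the 1 ms offset problem would be to round timestamps down to a fixed bucket before building the wide document. The sketch below is only an assumption about how that could be done at indexing time: the function names, the timestamp_ms field and the 10 ms bucket size are all invented for the example, tags are left out for brevity, and it does nothing about the number of distinct keys.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(timestamp):
    """Parse an ISO 8601 UTC timestamp such as '2016-04-26T12:21:33.001Z'."""
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def to_wide_docs(docs, bucket_ms=10):
    """Merge one-sample documents into one wide document per time bucket."""
    wide = {}
    for doc in docs:
        ms = to_epoch_ms(doc["timestamp"])
        bucket_start = ms - ms % bucket_ms  # round down to the bucket start
        key = (doc["monitored_object_id"], bucket_start)
        row = wide.setdefault(key, {
            "monitored_object_id": doc["monitored_object_id"],
            "timestamp_ms": bucket_start,
        })
        # If a parameter occurs twice in one bucket, the last sample wins.
        row[doc["parameter_name"]] = doc["value"]
    return [wide[key] for key in sorted(wide)]


samples = [
    {"monitored_object_id": "engine_1", "parameter_name": "VIB_AMP_1",
     "value": 4.2, "timestamp": "2016-04-26T12:21:33.000Z"},
    {"monitored_object_id": "engine_1", "parameter_name": "VIB_PHASE_1",
     "value": 143.2, "timestamp": "2016-04-26T12:21:33.001Z"},
]

for wide_doc in to_wide_docs(samples):
    print(wide_doc)
```

With a 10 ms bucket the two samples land in the same wide document; with bucket_ms=1 they stay separate. The trade-off is that the original timestamp precision is lost inside a bucket.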

Finally, I could nest the samples inside the document. But then how do I query VIB_AMP_1 vs VIB_PHASE_1?

{ monitored_object_id:'engine_1', samples: [ { name:'VIB_AMP_1', value:4.2 }, { name:'VIB_PHASE_1', value:143.2 } ], timestamp:'2016-04-26T12:21:33Z', tags: ['test', 'PW'] }


I've moved this topic to the Elasticsearch forum, which is a better fit for it.