When to use nested, and when to use has_parent?

I'm working on a use case that I think Elasticsearch 5 would be great for but I'm still a little unsure on a couple of points.

I want to start storing telemetry records in Elasticsearch for search and aggregation operations.

Assumptions:

  1. No record ever becomes "less valuable" than another. We won't be retiring any of this data, just adding to it.

  2. The data is time-series data. I have been thinking of it like data frames, so it's a collection of values associated with a point in time. (A sample frame appears below.) An average set easily contains 50,000 frames. It's not uncommon in my use case to need all 50,000 frames at once, or at least a few values from each. (For example, if I want the path of the vehicle, I want all 50,000 lats and longs.)

Questions:

Do I want to store each "frame" as its own document? So for example:

"body": {
  "telemetry_set_id": 200,
  "frame": {
    "datetime": "2017-04-13 00:26:35",
    "lat": "-37.0000",
    "long": "127.000000",
    "speed": "29.3",
    "battery": "97%"
  }
}

So this would mean to get the whole set of data I'd probably have to do a scroll operation for all documents that have telemetry_set_id:200.
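
For reference, here is a minimal sketch of the search body such a scroll might start from. The index name `telemetry`, the page size, and the choice to return only the coordinates are assumptions for illustration; the field names follow the sample document above.

```python
import json


def scroll_body(telemetry_set_id, fields=None, page_size=1000):
    """Build the initial search body for scrolling over one telemetry set."""
    body = {
        "size": page_size,   # documents returned per scroll page
        "sort": ["_doc"],    # index order, which scroll requests are optimized for
        "query": {"term": {"telemetry_set_id": telemetry_set_id}},
    }
    if fields:
        # Source filtering: return only the listed fields of each frame
        body["_source"] = list(fields)
    return body


body = scroll_body(200, fields=["frame.lat", "frame.long"])
# Would be sent as: POST /telemetry/_search?scroll=1m  (index name is hypothetical)
print(json.dumps(body, indent=2))
```

Each response would carry a `_scroll_id` that is passed to the scroll endpoint to fetch the next page until no hits come back.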

It's an important use case for me to be able to make queries like, "Show me the highest speed for telemetry_set 200" or "Show me all sets of telemetry data that include a coordinate within 5 miles of this X,Y coordinate."
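
To make those two queries concrete, here is a rough sketch of what the request bodies might look like with one document per frame. It rests on assumptions that are not in the sample above: `frame.speed` is mapped as a numeric type rather than a string, and each frame carries a hypothetical `frame.location` field mapped as `geo_point` (a `geo_distance` query cannot use separate lat/long strings directly). The terms aggregation is one possible way to collapse matching frames into distinct set ids.

```python
import json


def max_speed_body(telemetry_set_id):
    """Highest speed within one telemetry set (assumes frame.speed is numeric)."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {"term": {"telemetry_set_id": telemetry_set_id}},
        "aggs": {"max_speed": {"max": {"field": "frame.speed"}}},
    }


def sets_near_body(lat, lon, distance="5mi", max_sets=100):
    """Ids of sets with at least one frame within `distance` of a point.

    Assumes a hypothetical geo_point field named frame.location.
    """
    return {
        "size": 0,
        "query": {
            "geo_distance": {
                "distance": distance,
                "frame.location": {"lat": lat, "lon": lon},
            }
        },
        # One bucket per telemetry set that has a matching frame
        "aggs": {"sets": {"terms": {"field": "telemetry_set_id", "size": max_sets}}},
    }


print(json.dumps(max_speed_body(200), indent=2))
print(json.dumps(sets_near_body(-37.0, 127.0), indent=2))
```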

The other option is to make monstrous documents like this, each containing an array of frame objects:

(Frames would be mapped as nested in this case.)

"id": "telemetry_set_200",
"body": {
  "frames": [
    {
      "datetime": "2017-04-13 00:26:35",
      "lat": "-37.0000",
      "long": "127.000000",
      "speed": "29.3",
      "battery": "97%"
    },
    {
      "datetime": "2017-04-13 00:26:36",
      "lat": "-37.0030",
      "long": "127.000010",
      "speed": "30.1",
      "battery": "96.5%"
    },
    {
      "datetime": "2017-04-13 00:26:37",
      "lat": "-37.0040",
      "long": "127.000020",
      "speed": "30.3",
      "battery": "96%"
    }
  ]
}

The advantage here is that I can get the entire set of telemetry all at once, which I feel is helpful. Will this still permit me to say, "Show me sets of telemetry data that include a lat/long within 5 miles of this specific lat and long?"
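
As a sketch of how that question could be expressed against the nested layout: a `nested` query can wrap a `geo_distance` query on the frames, and each hit is then a whole telemetry-set document. The mapping type name `telemetry_set` and the `location` geo_point field are hypothetical, introduced only for illustration, and the mapping uses the Elasticsearch 5 layout with a type name under `mappings`.

```python
import json

# Elasticsearch 5-style mapping. The type name "telemetry_set" and the
# "location" geo_point field are hypothetical, not part of the sample above.
NESTED_MAPPING = {
    "mappings": {
        "telemetry_set": {
            "properties": {
                "frames": {
                    "type": "nested",
                    "properties": {
                        "datetime": {"type": "date", "format": "yyyy-MM-dd HH:mm:ss"},
                        "location": {"type": "geo_point"},
                        "speed": {"type": "float"},
                    },
                }
            }
        }
    }
}


def sets_with_frame_near(lat, lon, distance="5mi"):
    """Match set documents that contain at least one nested frame near a point."""
    return {
        "query": {
            "nested": {
                "path": "frames",
                "query": {
                    "geo_distance": {
                        "distance": distance,
                        "frames.location": {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


print(json.dumps(sets_with_frame_near(-37.0, 127.0), indent=2))
```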

I could set up has_parent as well. Is this more of a has_parent problem or a "nested" data type problem? I think has_parent is probably the answer here but I wouldn't mind hearing from some more seasoned professionals.
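
For comparison, here is a rough sketch of the parent/child variant. In Elasticsearch 5 the relationship is declared with `_parent` in the child type's mapping. A `has_parent` query returns child documents whose parent matches, while `has_child` returns parents that have a matching child, so "sets with a frame near this point" would be a `has_child` query. The type names `telemetry_set` and `frame` and the `location` field are hypothetical.

```python
import json

# Elasticsearch 5-style parent/child mapping. The type names and the
# "location" geo_point field are hypothetical.
PARENT_CHILD_MAPPING = {
    "mappings": {
        "telemetry_set": {},
        "frame": {
            "_parent": {"type": "telemetry_set"},
            "properties": {
                "datetime": {"type": "date", "format": "yyyy-MM-dd HH:mm:ss"},
                "location": {"type": "geo_point"},
                "speed": {"type": "float"},
            },
        },
    }
}


def sets_with_frame_near(lat, lon, distance="5mi"):
    """has_child: return parent sets that have at least one frame near a point."""
    return {
        "query": {
            "has_child": {
                "type": "frame",
                "query": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


def frames_of_set(set_id):
    """has_parent: return the child frames that belong to one parent set."""
    return {
        "query": {
            "has_parent": {
                "parent_type": "telemetry_set",
                "query": {"ids": {"values": [set_id]}},
            }
        }
    }


print(json.dumps(sets_with_frame_near(-37.0, 127.0), indent=2))
print(json.dumps(frames_of_set("telemetry_set_200"), indent=2))
```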

Thank you in advance, I'm excited to begin working with this new tool.

Josh
