Enrichment - is it the best way?

Hello all,

We've been using Elastic to capture and consolidate IoT data; however, my experience with Elastic is limited to about 6 months.

Consider the scenario:

  • large number of IoT devices with known serial numbers are sending timestamped data
  • a separate index contains the geo_point for each device (key being the unique serial number)
  • data/messages from the devices are enriched with the matching geo_point (again, key being the unique serial number)

This is great for creating maps (especially heatmaps).

However, is this the best way to do it? Isn't enriching each and every message with the geo_point wasteful, since it takes up a lot of space for information that could be looked up from another index? The number of messages from the IoT devices is increasing rapidly (e.g. 6 months ago the average was 1M/week; now it is 5M/week).
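One way to put a rough number on that concern is a back-of-the-envelope estimate. The sketch below is illustrative only: the per-document byte cost of the geo_point is an assumed input that would have to be measured on a real index (it depends on mappings and compression), and the function names are hypothetical.

```python
def weekly_overhead_bytes(messages_per_week: int, bytes_per_geo_point: int) -> int:
    """Extra bytes added per week by storing a geo_point on every message."""
    return messages_per_week * bytes_per_geo_point


def yearly_overhead_gib(
    messages_per_week: int,
    bytes_per_geo_point: int,  # ASSUMPTION: must be measured, not a known constant
    weeks: int = 52,
) -> float:
    """Extra storage per year in GiB, assuming a constant message rate."""
    total_bytes = weekly_overhead_bytes(messages_per_week, bytes_per_geo_point) * weeks
    return total_bytes / 2**30
```

For example, at 5M messages/week and a placeholder cost of 40 bytes per geo_point (a made-up figure, not a measurement), `yearly_overhead_gib(5_000_000, 40)` comes out a little under 10 GiB per year.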

Any suggestions on how to improve or optimize this would be much appreciated.

As Elasticsearch does not support joins, enrichment is the generally recommended approach, unless you have a custom UI where you can merge data from separate queries (basically performing the join at the application layer).
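For illustration, an application-layer join might look like the following minimal Python sketch. It assumes the results of the two queries have already been fetched as lists of dicts; the field names `serial` and `location` and the function name `join_locations` are hypothetical, not taken from the setup described above.

```python
from typing import Any


def join_locations(
    messages: list[dict[str, Any]],
    devices: list[dict[str, Any]],
    key: str = "serial",          # hypothetical name of the serial-number field
    geo_field: str = "location",  # hypothetical name of the geo_point field
) -> list[dict[str, Any]]:
    """Attach each device's geo_point to its messages (a left join on `key`)."""
    # Build the lookup once so each message is resolved with a single dict access.
    locations = {device[key]: device[geo_field] for device in devices}

    joined = []
    for message in messages:
        merged = dict(message)  # shallow copy so the caller's data is not mutated
        # Unknown serial numbers get None rather than raising an error.
        merged[geo_field] = locations.get(message[key])
        joined.append(merged)
    return joined
```

The trade-off is that this work is repeated on every request, whereas enrichment pays the cost once at ingest.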

You could do this in Vega if enrichment is not an option. In Vega you can run 2 separate Elasticsearch queries against 2 separate indices and then join the data using transformations. Then you can render a map using the points. Be aware that Vega has a steep learning curve.

But I would recommend enriching the data once, during ingest, and then building the visualization off that one index. The Vega route would be much less efficient, since it would have to do all that querying and joining every time the map is rendered.
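As a rough sketch of what ingest-time enrichment can look like, assuming it is done with Elasticsearch's enrich processor (the thread does not say which mechanism is in use), the Python helpers below build the request bodies for a `match` enrich policy and for an ingest pipeline that applies it. The helper names and any index, policy, or field names passed in are hypothetical.

```python
def enrich_policy_body(source_index: str, match_field: str, enrich_fields: list[str]) -> dict:
    """Body for PUT /_enrich/policy/<name>: a 'match' policy keyed on one field."""
    return {
        "match": {
            "indices": source_index,         # index holding one doc per device
            "match_field": match_field,      # e.g. the serial number
            "enrich_fields": enrich_fields,  # fields copied onto each message
        }
    }


def enrich_pipeline_body(policy_name: str, field: str, target_field: str) -> dict:
    """Body for PUT /_ingest/pipeline/<name>: one enrich processor."""
    return {
        "processors": [
            {
                "enrich": {
                    "policy_name": policy_name,
                    "field": field,                # field on the incoming message
                    "target_field": target_field,  # where the matched data is written
                    "max_matches": 1,              # serial numbers are unique
                }
            }
        ]
    }
```

The policy has to be executed (`POST /_enrich/policy/<name>/_execute`) before the pipeline can use it, and executed again whenever the device index changes.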

Thank you all for your input, much appreciated.
