Doubts about building a master-detail relationship

Currently, I have an index that collects the logs from a series of control centers. From these logs I obtain technical data such as electrical consumption, on/off status, and other data that tell me the status of each control center.

The mapping was created manually to keep storage and processing costs as low as possible. At the time of creation, no typical master-detail relationship was set up with the control center data. That was left to the program that works with Elasticsearch, since we were only interested in the logs.

         +------------------------+
         |                        |
         |                        |
         |      CommandCenter     |
         |                        |
         |                        |
     +---+------------+-----------+
     |                |
     |                |
+----v-----+     +----v-----+
|          |     |          |
|          |     |          |
|   Log    |     |   Log    |
|          |     |          |
|          |     |          |
+----------+     +----------+     ...

Whenever I read about the subject, I see that all the documentation and help revolve around avoiding treating Elasticsearch as a relational DB. My doubt comes from the need to show, in search results, observability views, and elsewhere, a minimum of information about the control center, such as its name, its location, etc.

Is it possible to have that master-detail relationship? Is it really as expensive as some tutorials or forums say?

Many thanks

When working with Elasticsearch it is generally better to denormalise and, in this case, store some command center data on each log entry, e.g. using an enrich processor. This allows you to use time-based indices, e.g. through rollover, which is considered best practice for this type of data.
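
As a rough illustration of that denormalisation, here is a minimal Python sketch of the two request bodies involved (an enrich policy and an ingest pipeline with an enrich processor), plus a small local function that imitates the result. The index, policy, and field names (command-centers, command-center-policy, command_center_id, name, location) are assumptions made up for this example and are not taken from the original mapping. Nothing is sent to a cluster here.

```python
import json

# Hypothetical names, chosen for illustration only.
POLICY_NAME = "command-center-policy"
SOURCE_INDEX = "command-centers"     # assumed index with one document per control center
MATCH_FIELD = "command_center_id"    # assumed field present on both logs and centers

# Body for: PUT /_enrich/policy/command-center-policy
# The policy must then be executed (POST /_enrich/policy/<name>/_execute)
# before the processor can use it.
enrich_policy = {
    "match": {
        "indices": SOURCE_INDEX,
        "match_field": MATCH_FIELD,
        "enrich_fields": ["name", "location"],
    }
}

# Body for: PUT /_ingest/pipeline/<pipeline name>
ingest_pipeline = {
    "processors": [
        {
            "enrich": {
                "policy_name": POLICY_NAME,
                "field": MATCH_FIELD,
                "target_field": "command_center",
            }
        }
    ]
}


def enrich_locally(log, centers):
    """Imitate the enrich step in plain Python for a single log document.

    `centers` maps a control center id to its descriptive fields.
    Logs without a matching center are returned unchanged (as a copy).
    """
    enriched = dict(log)
    match = centers.get(log.get(MATCH_FIELD))
    if match is not None:
        # The target field carries the match field plus the enrich fields.
        enriched["command_center"] = {MATCH_FIELD: log[MATCH_FIELD], **match}
    return enriched


if __name__ == "__main__":
    centers = {"cc-1": {"name": "North", "location": "Madrid"}}
    log = {"command_center_id": "cc-1", "power_kw": 3.2, "status": "on"}
    print(json.dumps(enrich_policy, indent=2))
    print(json.dumps(ingest_pipeline, indent=2))
    print(json.dumps(enrich_locally(log, centers), indent=2))
```

With this approach each log carries its own copy of the name and location, so searches and dashboards need no join. The trade-off is that logs already indexed keep the old values if a control center is later renamed.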

Setting up a master-detail relationship would probably require you to use a parent-child join mapping. Although this can be useful if the parent changes frequently or is very large, it comes with a number of drawbacks, e.g.:

  • All documents involved in a parent-child relationship must reside in the same index and on the same shard. This means that you cannot use time-based indices, which makes deleting data much more expensive. There are also limits to the number of documents a single shard can hold, so you may run into performance and scalability issues.
  • This type of relationship requires specific query syntax, which is more expensive to execute and uses more resources.
  • Kibana generally does not support reporting on this type of relationship, so you will need to visualise your data some other way.
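
For comparison, a parent-child setup could look roughly like the following Python sketch, which only builds the request bodies. The join field name (relation), the relation names (command_center, log), and the name field being queried are assumptions for illustration. Note how every child log has to be indexed with a routing value equal to its parent id, which is what ties parent and children to one shard.

```python
import json

# Hypothetical field name for the join field.
JOIN_FIELD = "relation"

# Index creation body: one index holds both parents and children.
join_mapping = {
    "mappings": {
        "properties": {
            JOIN_FIELD: {
                "type": "join",
                "relations": {"command_center": "log"},  # parent: child
            }
        }
    }
}


def parent_doc(fields):
    """Build a command center (parent) document."""
    doc = dict(fields)
    doc[JOIN_FIELD] = {"name": "command_center"}
    return doc


def child_log(parent_id, fields):
    """Build a log (child) document.

    Returns (routing, document). The routing value must be the parent id
    so that the child lands on the same shard as its parent.
    """
    doc = dict(fields)
    doc[JOIN_FIELD] = {"name": "log", "parent": parent_id}
    return parent_id, doc


# Search body: find logs whose parent command center matches a name.
# This is the special query syntax referred to above.
has_parent_query = {
    "query": {
        "has_parent": {
            "parent_type": "command_center",
            "query": {"match": {"name": "North"}},
        }
    }
}

if __name__ == "__main__":
    routing, doc = child_log("cc-1", {"power_kw": 3.2, "status": "on"})
    print("routing =", routing)
    print(json.dumps(doc, indent=2))
    print(json.dumps(has_parent_query, indent=2))
```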

@Christian_Dahlqvist Thank you very much. It is what I imagined, and that is why, when I started, I preferred to cover needs of that type in the application that requests the data, leaving Elasticsearch to do what it does best.
Thanks for the technical explanation.
