Enrich one document with fields from another

Imagine I have an Elasticsearch instance with two kinds of documents, author and book, both stored as JSON.
Author data looks like this:

    {
      "document-id": "XYZ",
      "document-type": "author",
      "name": "John Doe",
      "country": "Canada"
    }

while book data looks like this:

    {
      "document-id": "ABC",
      "document-type": "book",
      "authorId": "XYZ",
      "title": "Logstash for Dummies"
    }

Currently, each document type goes into its own index.

I would like to create a denormalized version of the two, so that I can easily search for all books written by Canadian authors.
I need to support updates to the author (and book) data, so that if the author moves to a new country or changes their name, the denormalized copy will also be updated.
I also need to keep all fields from both objects in the denormalized copy (i.e., avoid collisions between the two document-id fields, so that both document-id values are present, even if one has to be renamed).
All of this will be used in Kibana reports, and Kibana, as I understand it, doesn't have great support for nested objects, so the combined document should be flat.
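
To make the target shape concrete, here is a minimal Python sketch of the flat document I have in mind, built from the two sample documents above. The `denormalize` helper and the `author_` / `publisher_` prefixes are names I made up for illustration; they are not part of Elasticsearch or Logstash, and the sketch only merges plain dicts without doing any indexing.

```python
def denormalize(book, author, publisher=None):
    """Merge a book with its author (and optional publisher) into one flat dict.

    Book fields keep their original names. Fields from the joined documents
    get a prefix, so both "document-id" values survive without colliding.
    """
    flat = dict(book)
    for prefix, source in (("author_", author), ("publisher_", publisher)):
        if source is not None:
            for key, value in source.items():
                flat[prefix + key] = value
    return flat


author = {
    "document-id": "XYZ",
    "document-type": "author",
    "name": "John Doe",
    "country": "Canada",
}
book = {
    "document-id": "ABC",
    "document-type": "book",
    "authorId": "XYZ",
    "title": "Logstash for Dummies",
}

combined = denormalize(book, author)

# When the author changes (hypothetical move to another country), every book
# that references the author has to be rebuilt so the copy stays in sync.
author["country"] = "France"
all_books = [book]
refreshed = [
    denormalize(b, author)
    for b in all_books
    if b["authorId"] == author["document-id"]
]
```

With a document shaped like `combined`, "all books written by Canadian authors" becomes a plain filter on `author_country`, with no nested objects involved. What I don't know is which tool should do this merge and the re-merge on update.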

What's the best way to achieve this? I've seen discussions that point towards the aggregate filter or the Elasticsearch output plugin, and I'm unsure which to pursue. Is Logstash even necessary, or is this possible with ingest pipelines?
Do both document types need to be in the same index in order for this to work? And should book be "enriched" with author data, or should they be combined into yet a third document type?
(For the sake of the example, let's also say that there's a third index, publisher, that also needs to be combined into every book document.)

I'm an Elasticsearch novice and a complete newcomer to Logstash, so I'd appreciate any guidance you can provide.

