Enrich processor limitations


I'm working with the enrich processor to add data from my existing logs to the ones that are being ingested. What makes it challenging is that the logs that are the »source« of information are also continuously arriving in my index.

In my current problem, I have »process start« logs that signal the start of some process. After that process starts, many »events« can occur. The process start log has certain information which I need to add to all the »event logs« that are related to it and happen after it. The picture below may make the situation a bit clearer (the info field needs to be added to all event documents).
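To make the intended result concrete, here is a minimal Python sketch of the join I'm after. It runs outside Elasticsearch and only illustrates the logic; the field names (`type`, `process_id`, `info`) and the function `enrich_events` are hypothetical, not taken from my real mapping.

```python
def enrich_events(logs):
    """Copy `info` from each process start log onto later events of the same process.

    Assumes `logs` is a list of dicts in arrival order, with hypothetical
    fields `type`, `process_id` and (on process start logs only) `info`.
    """
    info_by_process = {}  # process_id -> info from the latest process start log
    enriched = []
    for log in logs:
        if log["type"] == "process_start":
            # Remember the info so that later events can pick it up.
            info_by_process[log["process_id"]] = log["info"]
            enriched.append(log)
        else:
            out = dict(log)  # copy, so the input document is not modified
            if log["process_id"] in info_by_process:
                out["info"] = info_by_process[log["process_id"]]
            enriched.append(out)
    return enriched
```

An event that arrives before its process start log is left without `info`, which is the same gap I would hit if the lookup data were stale.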

From my experience with the enrich processor, you can only add fields from documents that are located in the enrich index. The enrich index has to be rebuilt each time a new »process start« log arrives so that it has up-to-date data, which can then be used to enrich ingested documents. There is no mechanism to do that automatically, so I'd have to trigger it externally, and even then it might miss some events that arrive just after the process start log.

Is it possible to enrich logs from a source index that can be changed more dynamically (or is automatically updated)? Is there a mechanism that would allow querying of the normal index and then retrieving the »information source« document from it?

I could see transforms as a possible solution to this problem, but I still haven't figured out how to make it work (if it's even possible). Basically, the idea would be to copy the whole »event« document while adding information from the »process start« log.

Any suggestions are greatly appreciated.

The enrich processor can only query an .enrich-* index; it is not possible to use any other index for it.

The enrich processor works best with static data or data that does not change often. If your source data is updated frequently, you will need to keep recreating the enrich index, either manually or through automation, which is not ideal.
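If you do go the automation route, recreating the enrich index means calling the execute enrich policy API (`POST /_enrich/policy/<policy>/_execute`) on a schedule. Below is a minimal sketch using only the Python standard library. The policy name `process-start-policy`, the URL `http://localhost:9200` and the helper names are assumptions for illustration, and authentication/TLS are left out.

```python
import urllib.parse
import urllib.request


def build_execute_request(base_url, policy_name, wait_for_completion=True):
    """Build the HTTP request that re-executes an enrich policy.

    `base_url` and `policy_name` are placeholders; adjust them to your cluster.
    """
    path = "/_enrich/policy/" + urllib.parse.quote(policy_name, safe="") + "/_execute"
    query = urllib.parse.urlencode(
        {"wait_for_completion": str(wait_for_completion).lower()}
    )
    url = base_url.rstrip("/") + path + "?" + query
    return urllib.request.Request(url, method="POST")


def execute_policy(base_url, policy_name):
    """Send the request and return the HTTP status code (no auth handling here)."""
    request = build_execute_request(base_url, policy_name)
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical values; run this from cron or another scheduler.
    print(execute_policy("http://localhost:9200", "process-start-policy"))
```

Keep in mind that events ingested between two executions are still enriched from the older enrich index, so this narrows the gap but does not close it.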

How are you indexing the data? If you are using Logstash, then you can use the translate filter or the memcached filter for a more dynamic way of enriching.

I understand, the enrich processor might be a dead end then.

We're currently not using Logstash, and I would prefer to avoid adding it just for this.

Are there any other options I could try out? Something that doesn't require adding any new components.

Unfortunately, nothing that I'm aware of. The way to do enrichment in Elasticsearch ingest pipelines is through the enrich processor, which is still limited for some cases.