Bidirectional asynchronous data enrichment

ea1987 · December 17, 2019, 9:41pm

Hi guys,
I'm currently performing data enrichment of two different types of events (type A and type B), which are ingested and stored in Elasticsearch at different times.
Keep in mind that:

Events are correlated by a correlation_id field
The type A to type B correlation ratio is 1 to n
Events are written at different times, event type A first
Some fields of event type A need to be written to type B events and vice versa

The enrichment pattern and scenario details follow:

ENRICHMENT PATTERN SUMMARY
During ingestion of type A and type B events, I write or update a document (_id=correlation_id) in a dedicated enrichment index, holding the desired enrichment data from both events. Later, a scheduled pipeline enriches the type A and type B events with the values from that enrichment document.

2 DIFFERENT EVENTS

Event type A:
- field1 (correlation_id)
- field2
- field3
- field7

Event type B:
- field1 (correlation_id)
- field4
- field5
- field6

3 INDICES

INDEX 1: stores events of type A
INDEX 2: stores events of type B
INDEX 3: stores fields of both events (field2, field4, field5, field7)

DATA INGESTION/ENRICHMENT STEPS

t0 -> Ingestion of Event type A to INDEX 1, ingestion of an event containing field1,field2,field7 to INDEX 3 using field1 value as _id
t1 -> Ingestion of Event type B to INDEX 2, update of event stored in INDEX 3, adding field4, field5 to it
t2 -> Scheduled enrichment (update) of Event type A and Event type B using data of INDEX 3 (queried by field1)
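
To make the steps concrete, here is a minimal Python sketch of the request bodies involved. It assumes the t0/t1 writes to INDEX 3 go through the Update API with doc_as_upsert, and that t2 applies partial updates. The index name and the helper functions are hypothetical, only there for illustration.

```python
# Hypothetical index name; the post only calls it INDEX 3.
INDEX_ENRICH = "enrichment"

A_FIELDS = ("field2", "field7")  # type A fields copied to INDEX 3
B_FIELDS = ("field4", "field5")  # type B fields copied to INDEX 3


def enrichment_upsert(event, fields):
    """Build (index, _id, body) for an Update API call on INDEX 3.

    doc_as_upsert creates the document at t0 and merges into it at t1.
    """
    doc = {name: event[name] for name in fields if name in event}
    doc["field1"] = event["field1"]
    return INDEX_ENRICH, event["field1"], {"doc": doc, "doc_as_upsert": True}


def scheduled_update(enrichment_doc, target_fields):
    """Build the partial-update body applied at t2 to a type A or B event."""
    doc = {n: enrichment_doc[n] for n in target_fields if n in enrichment_doc}
    return {"doc": doc}


event_a = {"field1": "c-1", "field2": "x", "field3": "y", "field7": "z"}
event_b = {"field1": "c-1", "field4": 4, "field5": 5, "field6": 6}

index, doc_id, body_t0 = enrichment_upsert(event_a, A_FIELDS)  # t0
_, _, body_t1 = enrichment_upsert(event_b, B_FIELDS)           # t1

# What INDEX 3 would hold after both upserts are merged.
merged = {**body_t0["doc"], **body_t1["doc"]}

update_for_a = scheduled_update(merged, B_FIELDS)  # t2: A receives B's fields
update_for_b = scheduled_update(merged, A_FIELDS)  # t2: B receives A's fields
```

Note that with the 1 to n ratio, each later type B event with the same field1 would overwrite field4 and field5 in INDEX 3 under this sketch, unless those values are kept as arrays.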

PRO of this solution:

no additional load during data ingestion on t0 and t1

In your experience:

Is this the best asynchronous enrichment solution?
Is this something that could be done using the new enrich processor feature of ES 7.5? It doesn't seem so, because in this scenario data needs to be enriched in both directions (from index 1 to 2 and vice versa).
Do you have any advice?
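
For reference, this is roughly what the enrich processor setup would look like for a single direction (type A fields added to type B events at ingest time), as far as I understand the 7.5 docs. It is only a sketch: the helper functions, the policy name and the target field are made up for illustration. The policy also has to be executed (POST /_enrich/policy/<name>/_execute) before the processor can use it. The opposite direction is the part I don't see how to cover, since type B events don't exist yet when type A is ingested.

```python
def enrich_policy(source_index, enrich_fields):
    """Body for PUT /_enrich/policy/<name>: match on the correlation id."""
    return {
        "match": {
            "indices": source_index,
            "match_field": "field1",
            "enrich_fields": list(enrich_fields),
        }
    }


def enrich_pipeline(policy_name, target_field):
    """Body for PUT /_ingest/pipeline/<name> using the enrich processor."""
    return {
        "processors": [
            {
                "enrich": {
                    "policy_name": policy_name,
                    "field": "field1",
                    "target_field": target_field,
                }
            }
        ]
    }


# Hypothetical names: "index-1" stands for INDEX 1, "type-a-policy" is made up.
policy_body = enrich_policy("index-1", ["field2", "field7"])
pipeline_body = enrich_pipeline("type-a-policy", "type_a")
```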

Thank you in advance,

Andrea

system · January 14, 2020, 9:41pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
