We are planning to store real-time, event-based data from various systems in Elasticsearch (ES). The events arrive in sets, and the events within a set depend on each other. Please find an example below:
Set of events:
1st event {"id":"1", "f1":"123", "f2":"abc"}
2nd event {"id":"2", "f3":"123", "f4":"xyz"} <-- the 2nd event is related to the 1st event by the value "123" (like a foreign key)
3rd event {"id":"3", "f5":"xyz", "f6":"def"} <-- the 3rd event is related to the 2nd event by the value "xyz" (like a foreign key)
Our aim is to use this set of events to build a field (UID) that is shared by every event in the set and unique to that set, like below:
After the 1st event is processed:
{"id":"1", "f1":"123", "f2":"abc", "UID":"123"}
After the 2nd event is processed:
{"id":"1", "f1":"123", "f2":"abc", "UID":"123xyz"}
{"id":"2", "f3":"123", "f4":"xyz", "UID":"123xyz"}
After the 3rd event is processed:
{"id":"1", "f1":"123", "f2":"abc", "UID":"123xyzdef"}
{"id":"2", "f3":"123", "f4":"xyz", "UID":"123xyzdef"}
{"id":"3", "f5":"xyz", "f6":"def", "UID":"123xyzdef"}
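To make the intended update rule concrete, here is a minimal sketch in plain Python with no ES involved. It rests on assumptions inferred only from the example above: each event has one field that joins it to an earlier event and one field whose value is appended to the UID (f2 never appears in a UID, so it is ignored). The names `LINK_RULES`, `Chain`, `process` and `chains_by_value` are hypothetical, and keying the rules by event id is just for this example; a real system would presumably key them by event type or source.

```python
# Hypothetical rule per event: (field joining to an earlier event,
# field whose value is appended to the UID). Inferred from the example only.
LINK_RULES = {
    "1": (None, "f1"),   # first event starts a new set
    "2": ("f3", "f4"),   # f3 matches f1 of event 1
    "3": ("f5", "f6"),   # f5 matches f4 of event 2
}


class Chain:
    """One set of related events and the UID they currently share."""

    def __init__(self):
        self.events = []
        self.uid = ""


def process(event, chains_by_value):
    """Attach the event to its set and rewrite the UID on every member."""
    join_field, contrib_field = LINK_RULES[event["id"]]
    chain = chains_by_value.get(event[join_field]) if join_field else None
    if chain is None:
        chain = Chain()  # no earlier event found: start a new set
    chain.uid += event[contrib_field]
    # Later events can find this set through the value just contributed.
    chains_by_value[event[contrib_field]] = chain
    chain.events.append(event)
    for member in chain.events:
        member["UID"] = chain.uid
    return chain.events


chains_by_value = {}
events = [
    {"id": "1", "f1": "123", "f2": "abc"},
    {"id": "2", "f3": "123", "f4": "xyz"},
    {"id": "3", "f5": "xyz", "f6": "def"},
]
for event in events:
    for member in process(event, chains_by_value):
        print(member)
    print("---")
```

The sketch assumes events arrive in order and keeps all state in memory. It shows that every new event forces a rewrite of all earlier events in the same set, which is the part that becomes costly once the state lives in ES.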
If we store the events in denormalized form or use a parent-child relationship, we will need to query ES before indexing each event in real time, which will be very expensive.
What would be the best way to approach this?