We need to create alarms based on matches where only some of the criteria are available in a single log; the complementing information resides in additional logs that share the same ID.
For instance, take 10 logs that share a common ID. We would like to match one field, which could potentially match several logs with different IDs. Then we would like to iterate over those IDs and check whether additional matches exist. If so, trigger the alarm. A rough sketch of that query pattern is below.
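To make the pattern concrete, here is a minimal Python sketch of the multi-query approach described above. The index and field names (`logs`, `correlation_id`, `field_a`, `field_b`) and the criteria values are placeholders, not from our actual setup, and the elasticsearch-py 8.x client is assumed:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# 1. Find all logs matching the first criterion.
first = es.search(index="logs", query={"match": {"field_a": "error"}}, size=100)
ids = {hit["_source"]["correlation_id"] for hit in first["hits"]["hits"]}

# 2. For every shared ID, run one more query to check for the second criterion.
#    This is one extra query per hit.
for cid in ids:
    second = es.search(
        index="logs",
        query={"bool": {"filter": [
            {"term": {"correlation_id": cid}},
            {"match": {"field_b": "timeout"}},
        ]}},
        size=1,
    )
    if second["hits"]["total"]["value"] > 0:
        print(f"alarm: correlation_id {cid} matches both criteria")
```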
This is not possible with Watcher, and there is a reason for that: you are following a pattern that does not really scale. You have basically reproduced the famous n+1 problem. Your approach looks doable if you have 10 hits and then fire 10 additional queries, but the additional query load will always grow with the number of hits and thus stops scaling.
Have you thought about merging those two documents into one, so that querying becomes very easy/fast/cheap (compared to indexing, which will need an additional step)?
Mark Harwood, an Elasticsearch developer, uses the term 'entity centric indexing' to describe this pattern. You can find a few presentations and talks by googling it that will give you an idea of the approach.
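As a hedged illustration of what that merging could look like (not taken from Mark's talks): keep one "entity" document per shared ID that carries the fields from all related logs, so the alarm condition collapses into a single query. All names below (`logs-entity`, `correlation_id`, `field_a`, `field_b`) are placeholders, and the elasticsearch-py 8.x client is assumed:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# One merged document per shared ID, combining fields from the related logs.
entity_doc = {
    "correlation_id": "abc-123",
    "field_a": "error",    # criterion observed in one log
    "field_b": "timeout",  # criterion observed in another log with the same ID
}
es.index(index="logs-entity", id=entity_doc["correlation_id"], document=entity_doc)

# The alarm condition is now a single bool query, with no follow-up queries.
hits = es.search(
    index="logs-entity",
    query={"bool": {"filter": [
        {"match": {"field_a": "error"}},
        {"match": {"field_b": "timeout"}},
    ]}},
)
print(hits["hits"]["total"]["value"], "entities match both criteria")
```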
Thanks a lot for this. Mark Harwood's Elastic{on} presentation was helpful. This certainly got us headed in the right direction.
We just got done with a four-hour session with Elastic and came up with a scripted update with upserts. While 'entity centric indexing' can be interesting for the use cases Mark presents, I actually think this is a better way to deal with transactional data.
In our case, we could create a document with our unique ID and all relevant metadata via upserts and simply append the transactional data into a nested object via scripts.
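For anyone landing here later, this is roughly what that scripted update with upsert can look like. This is a minimal sketch, not our exact setup: the index name `transactions-entity`, the `events` field, and the document contents are placeholders, and the elasticsearch-py 8.x client is assumed:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

new_event = {"timestamp": "2019-01-01T00:00:05Z", "status": "timeout"}

es.update(
    index="transactions-entity",
    id="abc-123",  # the shared ID becomes the document _id
    # On an existing document, the Painless script appends the new event
    # to the nested array.
    script={
        "source": "ctx._source.events.add(params.event)",
        "lang": "painless",
        "params": {"event": new_event},
    },
    # If no entity document exists yet, the upsert creates it with the
    # unique ID, the metadata, and the first event; the script is skipped.
    upsert={
        "correlation_id": "abc-123",
        "created": "2019-01-01T00:00:00Z",
        "events": [new_event],
    },
)
```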