Centralizing business events is a very common need that could be handled much more easily, and that is the subject of this "Request For Comments". I have tried to keep it as short as possible.
As opposed to a technical event (think log4j), a business event:
- has a potentially complex structure
- should never be lost (think audit purposes)
- could trigger further use cases if stored asap (e.g. Watcher)
There are already a few open source projects that have implemented a log4j appender for Elasticsearch.
Trouble is: they don't comply with the above prerequisites. Furthermore, it would be very useful:
- to have appenders that rely on the Elasticsearch REST API (rather than the Java API, which is based on the Elasticsearch Transport/Node client) in order to minimise coupling between client and Elasticsearch versions (including the Java version!)
- to complement Elasticsearch's resiliency (see OOM resiliency and loss of documents during network partition in https://www.elastic.co/guide/en/elasticsearch/resiliency/current/index.html, and https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-0)
- to have a minimal impact on the user's (client thread) response time
- to safely log events even if Elasticsearch (or the network) is down, or even if the client JVM crashes a nanosecond after the client application received an OK from the logging service
I eventually came to the conclusion that two Elasticsearch log4j2 appenders are needed:
- a "safe" appender that guarantees logging by writing the event to disk (a potential java.io exception is handled by the client application, possibly to roll back previously linked updates); an asynchronous process (Quartz, ...) would then periodically try to index the event in Elasticsearch (see the sketch after this list)
- a "fast" appender that would directly index the event in Elasticsearch (through the REST API as well), for asap indexing
It would then be easy to develop a Business Event Log API (potentially embedded in some Java framework) that would do the following (sketched after the list):
- json-ize the input Java bean (representing the business event)
- potentially enrich the JSON with some other information (node name, method name, ...)
- generate an Elasticsearch-friendly UUID (such as a UUIDv1, Base64 URL-encoded)
- synchronously log the JSON to disk using the safe appender (file name = uuid.json), then asynchronously PUT it over HTTP using the UUID as document id (and remove the file from the file system once indexed OK, or found already indexed)
- asynchronously (in a new Java thread) log the JSON using the fast appender (HTTP PUT using the UUID)
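A minimal sketch of that API, assuming Jackson for the json-izing and two loggers wired to the safe and fast appenders. The logger names, the enrichment fields and the urlSafeId() helper are assumptions, and the random UUID is only a stand-in for a real time-based (UUIDv1) id:

```java
import java.net.InetAddress;
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public final class BusinessEventLogger {

    // Each logger is configured with one of the two appenders (log4j2.xml).
    private static final Logger SAFE = LogManager.getLogger("business.events.safe");
    private static final Logger FAST = LogManager.getLogger("business.events.fast");
    private static final ObjectMapper MAPPER = new ObjectMapper();
    private static final ExecutorService ASYNC = Executors.newSingleThreadExecutor();

    /** Logs a business event; returns its UUID once it is safely on disk. */
    public static String log(Object businessEvent) throws Exception {
        String uuid = urlSafeId();
        // 1. json-ize the bean, 2. enrich it, 3. embed the UUID for the appenders
        ObjectNode doc = MAPPER.convertValue(businessEvent, ObjectNode.class);
        doc.put("uuid", uuid);
        doc.put("node", InetAddress.getLocalHost().getHostName());
        String json = MAPPER.writeValueAsString(doc);
        // 4. synchronous, crash-safe write through the safe appender (may throw)
        SAFE.info(json);
        // 5. fire-and-forget direct indexing through the fast appender
        ASYNC.submit(() -> FAST.info(json));
        return uuid;
    }

    // URL-safe Base64 id, in the spirit of "UUIDv1 Base64 url encoded".
    private static String urlSafeId() {
        UUID u = UUID.randomUUID(); // stand-in: a real impl would use a time-based UUID
        ByteBuffer bb = ByteBuffer.allocate(16);
        bb.putLong(u.getMostSignificantBits()).putLong(u.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bb.array());
    }
}
```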
A double HTTP PUT (with a minimum time interval in between) should guarantee that, even under dramatic circumstances, no event is lost. This would avoid inserting events into a "safer" database before indexing them into Elasticsearch.
The second HTTP PUT should require a very small amount of Elasticsearch resources (no JSON parsing nor Lucene indexing, since a document with that UUID already exists).
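That cheap second PUT can be obtained with the REST API's op_type=create: Elasticsearch refuses a PUT for an existing document id with a 409, based on the UUID alone. A minimal sketch over plain java.net, with hypothetical names, usable by both the periodic job and the fast appender:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public final class EsRestIndexer {

    /**
     * Idempotent indexing: PUT with op_type=create, keyed by the event UUID.
     * Returns true if the event is now (or was already) indexed.
     */
    static boolean putOnce(String baseUrl, String index, String type,
                           String uuid, String json) throws IOException {
        URL url = new URL(baseUrl + "/" + index + "/" + type + "/" + uuid
                + "?op_type=create");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("PUT");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "application/json");
        try (OutputStream out = con.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        int code = con.getResponseCode();
        // 201 = created; 409 = a document with this UUID already exists,
        // i.e. the other appender got there first: cheap rejection, no reindex.
        return code == 201 || code == 409;
    }
}
```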
The UUID has to be passed to both appenders (for instance, as the name of the UUID field in the JSON structure).
The asynchronous part of the "safe" appender has to make a decision upon receiving errors like MapperParsingException: such a mis-mapped event type won't ever be indexed unless the mapping of the regular index changes. A possible solution would be to index these events in an alternate Elasticsearch index (a kind of dead-letter queue): ESIndexName_errors. Indexing there will create new mappings (string types only), and I presume the documents can't be rejected this time.
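A sketch of that decision, reusing the putOnce() helper above. The error-type string, the top-level-only string coercion and all names are assumptions:

```java
import java.io.IOException;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public final class IndexErrorHandler {

    /** Decides what to do with an event that Elasticsearch refused to index. */
    static void handleIndexError(String baseUrl, String index, String uuid,
                                 String json, String errorType) throws IOException {
        if (errorType.contains("MapperParsingException")) {
            // The event will never fit the regular mapping: divert it to the
            // dead-letter index, with every value coerced to a string so the
            // dynamically created mappings are string-only.
            EsRestIndexer.putOnce(baseUrl, index + "_errors", "event",
                    uuid, coerceValuesToStrings(json));
        }
        // Otherwise (node down, rejected execution, ...): keep the uuid.json
        // file on disk and let the periodic job retry later.
    }

    // Top-level fields only, for brevity; nested objects would need recursion.
    private static String coerceValuesToStrings(String json) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        JsonNode root = mapper.readTree(json);
        ObjectNode out = mapper.createObjectNode();
        root.fields().forEachRemaining(
                e -> out.put(e.getKey(), e.getValue().asText()));
        return mapper.writeValueAsString(out);
    }
}
```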