I'm fairly well versed in the ELK stack, but I haven't yet pinned down the capabilities of the other tools that integrate with Elasticsearch to transform its data. I believe 'transform' is the right term here, based on what I've been able to research on my own. Let me give an example of what I'm looking to do.
In our existing ES database, I need to batch-process the data, enriching the original documents with supplementary information. Here's a simple example.
Let's say my log file data, stashed in ES, has two log lines per transaction: one where a message is sent, and (potentially) one where the same message is received. Each send/receive pair shares a key that uniquely identifies the transaction.
What I'm looking to do is post-process the ES database: match send/receive pairs, add the transaction time to the receive record, and add a 'matched' boolean to each document indicating the transaction was successful.
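To make the matching logic concrete, here's a minimal Python sketch of the transform itself. All field names (`txn_id`, `event`, `@timestamp`) and the sample documents are assumptions for illustration; in practice the docs would come back from an ES query (e.g. via the scroll API) and the partial-document updates would be pushed back with the bulk API.

```python
from datetime import datetime

# Hypothetical sample documents, as they might be returned from an ES query.
# Field names here are assumptions, not real mappings.
docs = [
    {"_id": "1", "txn_id": "abc", "event": "send",
     "@timestamp": "2015-06-01T10:00:00.000"},
    {"_id": "2", "txn_id": "abc", "event": "receive",
     "@timestamp": "2015-06-01T10:00:01.250"},
    {"_id": "3", "txn_id": "def", "event": "send",
     "@timestamp": "2015-06-01T10:00:02.000"},  # no matching receive
]

def parse_ts(ts):
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f")

# Index the sends by transaction key, then pair each receive against them.
sends = {d["txn_id"]: d for d in docs if d["event"] == "send"}
updates = []  # partial documents to merge back (e.g. via the bulk update API)

for d in docs:
    if d["event"] != "receive":
        continue
    send = sends.get(d["txn_id"])
    if send is None:
        continue
    elapsed = (parse_ts(d["@timestamp"])
               - parse_ts(send["@timestamp"])).total_seconds()
    updates.append({"_id": send["_id"], "doc": {"matched": True}})
    updates.append({"_id": d["_id"],
                    "doc": {"matched": True, "transaction_time_s": elapsed}})

# Mark any documents that never paired up as unmatched.
matched_ids = {u["_id"] for u in updates}
for d in docs:
    if d["_id"] not in matched_ids:
        updates.append({"_id": d["_id"], "doc": {"matched": False}})
```

The core of the job is just a keyed join plus a timestamp subtraction; the real question is which tool should host that loop and handle reading/writing ES at scale.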
There are other post-processing needs I have, but this is a good example to get me going. My ideal situation would be one where I can script the transformation against queried ES documents, and then manipulate those documents by simply adding JSON content.
So my question is: what is the best approach to accomplish this? Elasticsearch to/from Hadoop? Pig, Hive, etc.? All of these seem applicable, but I'm not sure where to start.
Any guidance on where to dive into further would be great!