I got a requirement whereby I am required to match two fields from two different files and output the result into a csv file.
Prior to this train of thought, i did a simple python script to do the matching logic between 2 excel files and output the result into a csv file, but well, it took roughly 10 hours and its not optimal speed wise.
Moving along, I then thought of trying it via logstash/Elasticsearch and have since been able to create my .conf pipeline files and ingest the data into two indexes.
So my problem starts here, I need to match/compare two fields (the field values are an exact match) from two different indexes and output the result to another index, so like 3 index.
Index A - contains a field called indexA_txn_id
Index B - contains a field called indexB_txn_id
Both fields have the same values.
Once a match is found, take indexA_txn_id and flush it to a new index, Index C.
Any ideas how I can go about this? Have considered stuff like aggregate or even getting rid of the index C idea completely and replacing it with a new field, something like, if indexA_txn_id == indexB_txn_id, add_field ismatch_txn_id = 'matched'.
Any help or advise would be greatly appreciated. If it is impossible to do this at logstash layer, do recommend me an alternative if you have one.