Logstash is not a good fit for this task. You would have to write a ruby filter that keeps track of the values to be added to the second column. You would also need a single worker thread (pipeline.workers: 1) and would have to make sure that pipeline.ordered evaluates to true, so that events reach the filter in their original order.
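As a rough sketch of the settings mentioned above, assuming they are set in logstash.yml (they can also be set per pipeline in pipelines.yml), it could look something like this:

```yaml
# logstash.yml (sketch; adjust to your own setup)
pipeline.workers: 1    # a single worker thread, so events are not processed in parallel
pipeline.ordered: true # preserve event order; the default "auto" also enables ordering when pipeline.workers is 1
```

With the default pipeline.ordered: auto, setting pipeline.workers to 1 should already be enough. Setting it to true explicitly just makes the intent visible.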
Thank you for your answer. It works and I was able to solve it. I used the in and out values to differentiate the start and the end of the aggregation tasks. But now I have another issue:
What if the left column is not a value known in advance? For instance, a continuous number like this: