I'm aggregating streaming connection logs from an nginx server. Usually several similar events arrive with the same id, followed by another batch of similar events with a different id. I'm trying to count the consecutive events that share the same id.
The problem is that there is no start or end event in my case: every time I see a new id, that should be the end of the previous map. I used push_previous_map_as_event and it works fine, but it keeps the previous map alive and never ends it. Is there any way I can set end_of_task => true when push_previous_map_as_event is true?
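To make the goal concrete, here is a minimal Python sketch of the counting I'm after, outside of Logstash. It is only an illustration: the function name count_consecutive and the sample ids are made up for this post, and it works on a finite list rather than a live stream.

```python
from itertools import groupby


def count_consecutive(ids):
    """Return (id, run_length) pairs, one per run of identical consecutive ids."""
    # groupby starts a new group whenever the key changes, which matches
    # "a new id means the previous run has ended".
    return [(key, sum(1 for _ in run)) for key, run in groupby(ids)]


# Hypothetical ids, in arrival order.
sample_ids = ["a", "a", "a", "b", "b", "a"]
print(count_consecutive(sample_ids))  # [('a', 3), ('b', 2), ('a', 1)]
```

Note that the same id can show up again later and is counted as a separate run, which is what I want.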
Thank you for your help! Actually, push_map_as_event_on_timeout works well. The reason I didn't see it working correctly is that I was manually pasting log lines as input, and maybe that isn't really streaming.
I just verified the number using real streaming data and it looks good. Thank you for your help!
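For anyone reading later, this is roughly how I now picture the two flush paths, written as a small Python model. It is not how the aggregate filter is implemented: the class RunCounter, its methods, and the idle-based timeout rule are all assumptions I made up to illustrate the idea that a run is emitted either when a new id arrives or when nothing has arrived for a while.

```python
class RunCounter:
    """Toy model: count consecutive events per id, flushing on id change or idle timeout."""

    def __init__(self, timeout):
        self.timeout = timeout      # idle seconds before the open run is flushed (assumed rule)
        self.current_id = None
        self.count = 0
        self.last_seen = None

    def _flush(self):
        # Emit the open run and reset the state, so no map stays alive.
        emitted = (self.current_id, self.count)
        self.current_id, self.count, self.last_seen = None, 0, None
        return emitted

    def on_event(self, event_id, now):
        """Handle one event; return the runs that ended because of it."""
        out = []
        if self.current_id is not None and event_id != self.current_id:
            out.append(self._flush())   # a new id ends the previous run
        self.current_id = event_id
        self.count += 1
        self.last_seen = now
        return out

    def on_tick(self, now):
        """Periodic check; flush the open run if it has been idle long enough."""
        if self.current_id is not None and now - self.last_seen >= self.timeout:
            return [self._flush()]
        return []


counter = RunCounter(timeout=5)
counter.on_event("a", now=0)
counter.on_event("a", now=1)
print(counter.on_event("b", now=2))  # the "a" run ends when "b" arrives
print(counter.on_tick(now=7))        # the "b" run ends by timeout
```

Without the timeout path, the last run would only be emitted when another id shows up, which is the "map never ends" behaviour I was asking about.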