I am using Logstash to aggregate data out of a log file and store it in Elasticsearch, in order to collect usage data for a piece of software.
We copy the log files into a directory that Logstash watches. We use the file input plugin in read mode and delete the files from the directory afterwards.
For the data aggregation I need to know when a file ends. Is there any way in Logstash to detect that EOF was reached?
For instance, can I automatically generate an EOF event,
or detect that an event is the last one in the file?
Not really. If you generate the document_id yourself, then you could use an aggregate filter to track the most recent record from each file, and then generate an upsert that tags the last record after a timeout.
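A sketch of what that could look like, assuming the events carry a `path` field identifying the source file and an `index` field naming the target index (those names, the timeout value, and the fingerprint-based document_id are assumptions, not something from your question):

```
filter {
  # Build a deterministic document_id so the later upsert can target
  # the same document again.
  fingerprint {
    source => ["message", "path"]
    target => "[@metadata][doc_id]"
    method => "SHA256"
  }
  # Track the most recent record per file. After 30s without new events
  # for a file, the map is pushed as a timeout event.
  aggregate {
    task_id => "%{path}"
    code => "map['last_doc_id'] = event.get('[@metadata][doc_id]')"
    push_map_as_event_on_timeout => true
    timeout => 30
    # Copy the tracked id back into @metadata and mark the record,
    # so the output below upserts a 'last_record' flag onto it.
    timeout_code => "event.set('[@metadata][doc_id]', event.get('last_doc_id')); event.set('last_record', true)"
  }
}
output {
  elasticsearch {
    index => "%{index}"
    document_id => "%{[@metadata][doc_id]}"
    action => "update"
    doc_as_upsert => true
  }
}
```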
However, for the timeout event the index is stored literally as "%{[@metadata][index]}" instead of being substituted. The other events are stored in the correct index.
The event in Elasticsearch then has these two fields:
_index: %{[@metadata][index]}
index: (with the correct value)
So I guess I am addressing something wrong here, but I cannot figure out what.
Does anyone know what is going on there?
When the timeout_code executes, the map has already been converted into an event (in create_previous_map_as_event, the .shift that removes the map happens before the call to create_timeout_event). Try
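In other words, the timeout event is built from the map's fields, and map fields become plain event fields — nothing is copied into @metadata, which is why "%{[@metadata][index]}" stays literal while the event's own index field is correct. If that is the cause, one way to fix it is to set the metadata field from the event inside timeout_code; a hedged sketch, assuming the index name was stored in the map under 'index':

```
aggregate {
  task_id => "%{path}"
  code => "map['index'] = event.get('index')"
  push_map_as_event_on_timeout => true
  timeout => 30
  # The map has already become an event here, so use event.get/event.set
  # rather than map[...] — and rebuild the @metadata field the output needs:
  timeout_code => "event.set('[@metadata][index]', event.get('index'))"
}
```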