Hi,
My requirement is to ship remote files with Beats and write the output to HDFS through Logstash. However, I cannot use webhdfs, so I used the pipe output with hdfs dfs -appendToFile to push the data.
This is my configuration:
input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => ["source", "%{GREEDYDATA}/%{GREEDYDATA:filename}\.xml"]
  }
}

output {
  pipe {
    command => "hdfs dfs -appendToFile - /user/srini/%{filename}.xml"
    message_format => "%{message}"
  }
}
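If I understand the pipe output correctly, it starts one hdfs process per resolved command string and keeps writing each event's message to that process's stdin, so for a single event it behaves roughly like this manual command (the file name "test" here is just an example):

# rough single-event equivalent of what the pipe output does
echo "<event message>" | hdfs dfs -appendToFile - /user/srini/test.xml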
The HDFS files do get created and the data is written to them. However, the files stay open showing 0 bytes until I stop Logstash; I think the pipe is being kept open waiting for more output. How can I roll (close) the file after each event?
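I noticed the pipe output has a ttl setting, which as I understand the docs closes pipes that have been idle for a number of seconds (I haven't confirmed this fixes my case). Would something like the following close the file shortly after the last event arrives?

output {
  pipe {
    command => "hdfs dfs -appendToFile - /user/srini/%{filename}.xml"
    message_format => "%{message}"
    # my assumption: close the pipe after 5 seconds with no new events for this file
    ttl => 5
  }
}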