Webhdfs output plugin does not create new files

Hi,

We have problems writing to HDFS via the webhdfs plugin. It does not create new files, but it will write to a file that already exists. This prevents us from using fields like date, time, host and so on in the path.

Has anyone seen the same problem, and how do I solve it?

Thanks,

Olaf.

Providing your config as well as what version you are running would help.

Hi,

Sorry, the config is pretty vanilla. The version is Logstash 2.1 with the webhdfs plugin pulled from GitHub (it was not included in the "all plugins" version).

The error message was:

webhdfs write caused an exception: {"RemoteException":{"message":"Append failed for file: \/services\/logstash\/test.log, error: No such file or directory (2)","exception":"IOException","javaClassName":"java.io.IOException"}}.
Maybe you should increase retry_interval or reduce number of workers. Retrying... {:level=>:warn, :file=>"logstash/outputs/webhdfs.rb", :line=>"191", :method=>"write_data"}

until I created the file with "hadoop dfs -touchz /services/logstash/test.log". Then it worked.
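
If I understand the WebHDFS REST API correctly, create and append are separate operations, and an append on a file that does not exist fails with exactly this IOException. A rough sketch against our HttpFS port (namenode stands in for our real host):

  # append only succeeds on a file that already exists -- this is the call behind the error above
  curl -i -X POST "http://namenode:14000/webhdfs/v1/services/logstash/test.log?op=APPEND&user.name=hadoop"
  # create is a separate operation and is roughly what "hadoop dfs -touchz" did for me
  curl -i -X PUT "http://namenode:14000/webhdfs/v1/services/logstash/test.log?op=CREATE&user.name=hadoop"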

input {
 log4j {
  mode => server
  host => "0.0.0.0"
  port => "4560"
  type => "log4j"
 }
}

output {
 webhdfs {
  host => "355.305.404.230"
  port => "14000"
  path => "/services/logstash/test.log"
  user => "hadoop"
 }
}
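
What we would eventually like to do is use a dynamic path, roughly like the sketch below (the date and host fields are only an illustration of the goal, not something that works for us yet):

 output {
  webhdfs {
   host => "355.305.404.230"
   port => "14000"
   user => "hadoop"
   # one file per day and per source host via sprintf references
   path => "/services/logstash/%{+YYYY-MM-dd}/%{host}.log"
  }
 }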

Hi,

I face a similar issue. I am using the webhdfs plugin in Logstash to push a copy of the logs to HDFS.
It does create files with dynamic names (dd/mm/yy etc.), but it writes only the first line. All subsequent events that Logstash tries to push result in the same error as stated in this post.

When I use a pre-created file like test.log, it works smoothly. I went through some posts that suggest closing the HDFS file once it is created and reopening it for appending. I guess that is not working properly in the plugin.

Strangely, when I set the dfs.replication factor to 1, it works in all scenarios. I am using the HDP sandbox.
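
For reference, dfs.replication is the property I changed; it is typically set in hdfs-site.xml, roughly like this:

 <property>
  <name>dfs.replication</name>
  <value>1</value>
 </property>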