Logstash 7.7.0 on Windows - java.nio.file.AccessDeniedException

Hello all,

I am hoping that someone may be able to help me with an issue I've recently ran into with Logstash. I am running 7.7.0 on Windows Server 2016 and I have recently began experiencing Logstash crashes. I'll paste the error below.

[2020-05-28T23:36:26,702][ERROR][org.logstash.ackedqueue.io.FileCheckpointIO][syslog-test] Error writing checkpoint: java.nio.file.AccessDeniedException: C:\Program Files\Logstash\data\queue\syslog-test\checkpoint.head.tmp -> C:\Program Files\Logstash\data\queue\syslog-test\checkpoint.head
[2020-05-28T23:36:26,854][ERROR][org.logstash.execution.WorkerLoop][syslog-test] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash.
java.nio.file.AccessDeniedException: C:\Program Files\Logstash\data\queue\syslog-test\checkpoint.head.tmp -> C:\Program Files\Logstash\data\queue\syslog-test\checkpoint.head
	at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) ~[?:?]
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) ~[?:?]
	at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309) ~[?:?]
	at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292) ~[?:?]
	at java.nio.file.Files.move(Files.java:1426) ~[?:?]
	at org.logstash.ackedqueue.io.FileCheckpointIO.write(FileCheckpointIO.java:104) ~[logstash-core.jar:?]
	at org.logstash.ackedqueue.Page.forceCheckpoint(Page.java:225) ~[logstash-core.jar:?]
	at org.logstash.ackedqueue.Page.headPageCheckpoint(Page.java:190) ~[logstash-core.jar:?]
	at org.logstash.ackedqueue.Page.ack(Page.java:161) ~[logstash-core.jar:?]
	at org.logstash.ackedqueue.Queue.ack(Queue.java:656) ~[logstash-core.jar:?]
	at org.logstash.ackedqueue.Batch.close(Batch.java:57) ~[logstash-core.jar:?]
	at org.logstash.ackedqueue.AckedBatch.close(AckedBatch.java:55) ~[logstash-core.jar:?]
	at org.logstash.ackedqueue.AckedReadBatch.close(AckedReadBatch.java:109) ~[logstash-core.jar:?]
	at org.logstash.execution.QueueReadClientBase.closeBatch(QueueReadClientBase.java:130) ~[logstash-core.jar:?]
	at org.logstash.execution.WorkerLoop.run(WorkerLoop.java:87) [logstash-core.jar:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
	at java.lang.reflect.Method.invoke(Method.java:564) ~[?:?]
	at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:441) [jruby-complete-9.2.11.1.jar:?]
	at org.jruby.javasupport.JavaMethod.invokeDirect(JavaMethod.java:305) [jruby-complete-9.2.11.1.jar:?]
	at org.jruby.java.invokers.InstanceMethodInvoker.call(InstanceMethodInvoker.java:32) [jruby-complete-9.2.11.1.jar:?]
	at C_3a_.Program_20_Files.Logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$block$start_workers$5(C:/Program Files/Logstash/logstash-core/lib/logstash/java_pipeline.rb:279) [jruby-complete-9.2.11.1.jar:?]
	at org.jruby.runtime.CompiledIRBlockBody.callDirect(CompiledIRBlockBody.java:138) [jruby-complete-9.2.11.1.jar:?]
	at org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:58) [jruby-complete-9.2.11.1.jar:?]
	at org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:52) [jruby-complete-9.2.11.1.jar:?]
	at org.jruby.runtime.Block.call(Block.java:139) [jruby-complete-9.2.11.1.jar:?]
	at org.jruby.RubyProc.call(RubyProc.java:318) [jruby-complete-9.2.11.1.jar:?]
	at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:105) [jruby-complete-9.2.11.1.jar:?]
	at java.lang.Thread.run(Thread.java:832) [?:?]
[2020-05-28T23:36:27,522][FATAL][logstash.runner          ] An unexpected error occurred! {:error=>java.lang.IllegalStateException: java.nio.file.AccessDeniedException: C:\Program Files\Logstash\data\queue\syslog-test\checkpoint.head.tmp -> C:\Program Files\Logstash\data\queue\syslog-test\checkpoint.head, :backtrace=>["org.logstash.execution.WorkerLoop.run(org/logstash/execution/WorkerLoop.java:103)", "jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)", "jdk.internal.reflect.NativeMethodAccessorImpl.invoke(jdk/internal/reflect/NativeMethodAccessorImpl.java:62)", "jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(jdk/internal/reflect/DelegatingMethodAccessorImpl.java:43)", "java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:564)", "org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(org/jruby/javasupport/JavaMethod.java:441)", "org.jruby.javasupport.JavaMethod.invokeDirect(org/jruby/javasupport/JavaMethod.java:305)", "C_3a_.Program_20_Files.Logstash.logstash_minus_core.lib.logstash.java_pipeline.start_workers(C:/Program Files/Logstash/logstash-core/lib/logstash/java_pipeline.rb:279)", "org.jruby.RubyProc.call(org/jruby/RubyProc.java:318)", "java.lang.Thread.run(java/lang/Thread.java:832)"]}
[2020-05-28T23:36:27,742][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

I currently have two pipelines. 3/3 crashes so far all seem to be related to my syslog pipeline/index. Here is that pipeline:

input {
  tcp {
    port => 514
    type => syslog
  }
  udp {
    port => 514
    type => syslog
  }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  elasticsearch {
    hosts => ["https://someServer:9200"]
    index => "syslog-test"
    user => "${some_user}"
    password => "${some_password}"
	ssl => true
	ssl_certificate_verification => true
	cacert => 'C:/Program Files/Logstash/config/certificates/some-certificate.pem'
  }
}

The error appears to be related to filesystem permissions, but as a test I have given the account running Logstash full control to the C:\Program Files\Logstash\data\queue\syslog-test directory and I am still experiencing crashes. If it helps to know, the crash usually occurs around the 3.5 hour uptime mark.

I searched GitHub and found this issue, but it doesn't appear to be 100% related. Please let me know if I can provide any further information that may help. Thank you much!

I believe that error can also occur when attempting to move a file that is open. That could mean it is a weird race condition where two threads try to do the rotation at the same time.

I am not suggesting it is a solution, but does the crash still occur if you set pipeline.workers to 1?

1 Like

@Badger Only one way to find out! I've set pipeline.workers to 1 on that particular pipeline (will leave the other alone) and we'll see how it goes. I will be sure to report back either way.

Thank you for the suggestion!

@Badger it crashed again, unfortunately. It did seem to run the longest it has. Not sure whether that is a coincidence or not.

Any ideas why the error says to please check your filter configuration? The filter looks fine to me and seems to be parsing correctly. Do you see any potential issues? Maybe I will try to remove the filter altogether and see what that does. After all, my winlogbeat pipeline which has no filter is working flawlessly.

That's a generic error message used for any exception in the pipeline.

1 Like

For anyone else who may be dealing with this - to assist with debugging, you can set queue.checkpoint.writes: 1 which will make the error occur instantly.

Also, of interesting note - I am seeing a lot of explorer.exe file manipulations in the \data\queue folder that are resulting in INVALID PARAMETER or NO MORE FILES.

I'm beginning to think this is a bug rather than a configuration issue. Anyone else have any suggestions?

I've found the problem and I feel rather silly now. It appears that Endgame (Elastic Endpoint Security) was causing the issue. It wasn't generating any alerts so I didn't really think twice about it. Thank you for pointing me in the right direction @Badger. You mentioning the possibility of the checkpoint files being in use was helpful.

I used Process Monitor and saw that Endgame was performing read operations on the files. All is well now! :slight_smile: Thanks again.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.