Spooling Queue

We're attempting to experiment with the new spooling queue, but we're having issues with its configuration.

The most prevalent error when testing the config:
spool.dat can not be locked right now

Current config section:

spool:
  # The file namespace configures the file path and the file creation settings.
  # Once the file exists, the size, page_size, and prealloc settings
  # have no further effect.
  file:
    # Location of the spool file. The default value is ${path.data}/spool.dat.
    path: "${path.data}/spool.dat"

    # File permissions to use if the file is created. The default value is 0600.
    permissions: 0666

    # File size hint. The spool blocks once this limit is reached. The default value is 100 MiB.
    size: 512MiB

    # The file's page size. A file is split into multiple pages of the same size. The default value is 4KiB.
    #page_size: 4KiB

    # If prealloc is set, the required space for the file is reserved using
    # truncate. The default value is true.
    #prealloc: true

  # Spool writer settings.
  # Events are serialized into a write buffer. The write buffer is flushed if:
  # - The buffer limit has been reached.
  # - The configured limit of buffered events is reached.
  # - The flush timeout is triggered.
  write:
    # Sets the write buffer size.
    buffer_size: 10MiB

    # Maximum duration after which events are flushed if the write buffer
    # is not full yet. The default value is 1s.
    flush.timeout: 5s

    # Maximum number of buffered events. The write buffer is flushed once this
    # limit is reached.
    flush.events: 4096

    # Configure the on-disk event encoding. The encoding can be changed
    # between restarts.
    # Valid encodings are: json, ubjson, and cbor.
    codec: json

  #read:
    # Reader flush timeout, waiting for more events to become available, so as
    # to fill a complete batch, as required by the outputs.
    # If flush_timeout is 0, all available events are forwarded to the
    # outputs immediately.
    # The default value is 0s.
    #flush.timeout: 0s

Anyone have any tips or tricks here?

Could you please share the debug logs of Winlogbeat?

This is a known bug we've encountered on Windows. Unfortunately, there is no workaround for it.

We already have a fix for this in go-txfile (the library providing low-level support for spooling to disk). I hope to update the beats repositories with fixes and improvements in the next few days. The APIs changed slightly, so it takes a little time. If you are keen to test spooling, we can help you get it running once the fixes are available.


Thanks @steffens. We're definitely up for some testing... If you want to reach out to me, I'd be happy to run through things with you!

Do you have an idea of when this fix will be released?

Unfortunately, it's still in progress. You can follow the progress in #7859. It's quite a big dependency update, and CI failed.


The fix has just been merged into the development branch. The backport to the 6.4 branch (the fixes might come with 6.4.0 or 6.4.1, I'm not sure) is still in progress in #7911. If you have a Go development environment, you can try to build winlogbeat yourself. See the Go Getting Started guide for setting up an environment. Check out the beats repository to %GOPATH%\src\github.com\elastic\beats, switch to %GOPATH%\src\github.com\elastic\beats\winlogbeat, and run go build to produce a winlogbeat binary.
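Assuming a working Go toolchain on Windows, the steps above look roughly like this (the clone URL and paths follow the standard GOPATH layout; adjust to your setup):

```
:: Fetch the beats source into the GOPATH location Go expects
git clone https://github.com/elastic/beats %GOPATH%\src\github.com\elastic\beats

:: Build winlogbeat from its subdirectory; this produces winlogbeat.exe
cd %GOPATH%\src\github.com\elastic\beats\winlogbeat
go build
```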

Setting up a build environment can be a pain if you are new to Go. You can also wait a little for tomorrow's snapshot build (based on the development branch).

I'm not aware of any changes that might break your config in winlogbeat, but I might have missed one or two changes.

Feel free to ask if you run into any problems.

And please report back :slight_smile:

Out of curiosity, why do you plan to use spooling to disk with Windows event logs? In the winlogbeat/filebeat use case, the source the beat reads from normally acts as a buffer as well (in case of a network outage). For filebeat, spooling makes sense if files rotate and get removed very fast; in that case the spool can act as a buffer.

Keep in mind that the Windows event log API is quite slow. I wouldn't expect any performance benefits out of the box. The only time you might see a change in performance is after a long network outage: then winlogbeat can drive its outputs from the spool while continuing to read from the event log. It's a great way to catch up after long outages. That said, the spool is transactional (similar to databases); we don't want to lose events. There might be a noticeable IO overhead/delay due to fsyncs. Increase the write buffer and/or flush timeout to reduce the number of disk flush operations. The spool uses memory-mapped IO, so don't be shocked by high memory usage :wink:
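To make the tuning advice concrete, here is a minimal sketch of the write settings to adjust; the values below are illustrative, not recommendations:

```yaml
spool:
  write:
    # A larger buffer means fewer, larger disk flushes (fewer fsyncs).
    buffer_size: 32MiB
    # A longer timeout lets events accumulate before a flush is forced,
    # trading a bit of latency for less IO overhead.
    flush.timeout: 10s
```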

Thanks for the update! I'm not very familiar with go, personally, but we may try to set up an environment for testing.

We're having major throughput issues (Winlogbeat Throughput Seems Low) and are looking towards any and all options. Our current connector architecture utilizes an on-disk cache and we thought it was worth experimenting with.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.