Adding state storage backend options to the libbeat framework?

Dustin_Decker · September 19, 2017, 11:45pm

For beats that don't depend on a local system resource, such as a beat for a web API, having some standard options to store state remotely would be a huge win when running on infrastructure without persistent disk storage.

Providing an io.ReadWriter backed by a configurable local, Elasticsearch, etcd, or Consul storage backend as part of the standard beats framework might be a good way to do it. Developers would be free to read and write their state blobs, and users would have more flexibility in their deployments.

tudor · September 20, 2017, 12:22pm

Just to make sure I understand, you are referring to custom Beats, and you want libbeat to never write to local disk, right?

If it could be done in a generic way, we'd be open to having that in libbeat, but I'm not sure how easy it will be. Can you try opening a PoC pull request? It doesn't need to be complete, just enough to demonstrate the concept.

Dustin_Decker · September 20, 2017, 2:21pm

Yes, I'm proposing adding a configurable storage backend for state tracking in libbeat.

So for Filebeat, for example, what is now filebeat.registry_file: registry would instead become:

state:
  storage_driver: local
  storage_opts:
    path: registry

For a different beat, such as one that retrieves Google Suite OAuth authorizations, you could use something like this and run it on Kubernetes without local storage:

state:
  storage_driver: etcd
  storage_opts:
    address: "127.0.0.1:8500"
    token: "3940uf3ij32094j43"
    key: "co.elastic.gsuitebeat.state"

I'd be happy to make a PoC when I find time.

The feature issue is being tracked here: https://github.com/elastic/beats/issues/5375

steffens · October 4, 2017, 10:54am

It's an interesting idea. Currently the state stores for Filebeat and Winlogbeat are custom per beat. We hope to change this and have a common API for storing and updating state in libbeat in the future. As a first iteration we still want to go with local files, but considering k8s, for example, we might indeed opt to support other backends. With network-based backends, though, the chance of things going wrong increases as well.

system · October 10, 2017, 11:46pm

This topic was automatically closed after 21 days. New replies are no longer allowed.