Anyone have an easy way to make Metricbeat use Time Series Data Streams (TSDS)?

jerrac · September 5, 2023, 7:56pm

Long story short, we need to shrink our Elastic Stack's resource usage. As part of that, I've been trying to implement downsampling. But after finally reading the docs carefully enough, I found out that downsampling requires time series data streams.
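For context, downsampling is typically driven from an ILM policy, and the `downsample` action only applies to backing indices of a time series data stream. Below is a minimal sketch in Python of such a policy body. The rollover thresholds, the 7-day warm phase, and the 1-hour interval are illustrative assumptions, not values from this thread.

```python
import json
from typing import Any, Dict


def build_downsample_policy(min_age: str = "7d", fixed_interval: str = "1h") -> Dict[str, Any]:
    """Build an ILM policy body that rolls over in hot and downsamples in warm.

    All thresholds here are placeholder assumptions; tune them to your retention needs.
    """
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {"max_age": "1d", "max_primary_shard_size": "50gb"}
                    }
                },
                "warm": {
                    "min_age": min_age,
                    # Collapse raw samples into one summary document per interval.
                    "actions": {"downsample": {"fixed_interval": fixed_interval}},
                },
            }
        }
    }


if __name__ == "__main__":
    # Request body for PUT _ilm/policy/<name>; the policy name is up to you.
    print(json.dumps(build_downsample_policy(), indent=2))
```

If the policy is attached to a regular (non time series) data stream, the downsample step cannot run, which is the limitation described above.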

I think Elastic Agent sets that up automatically for at least some things, but we're looking to move back to using beats directly (maybe, we might change our minds...). In any case, I'd like to know how to make Metricbeat use TSDS indices.

The docs make it clear you have to set it up just right. So I'm hoping someone can share how they did so...

Also, to double check, Metricbeat doesn't do what I want automatically, right? I exported the default index template and it does not have any time_series_dimension parameters in it. I also could not find any mention of TSDS in the docs for 8.9.
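A check like the one described can be scripted. The sketch below, in Python, walks the `template` section of an exported index template and reports the index mode plus any fields mapped with `time_series_dimension: true`. The function names and the sample template are hypothetical; only the `index.mode` setting and the `time_series_dimension` mapping parameter come from Elasticsearch itself.

```python
from typing import Any, Dict, List, Optional


def index_mode(settings: Dict[str, Any]) -> Optional[str]:
    """Return index.mode, whether settings are flat or nested under "index"."""
    if "index.mode" in settings:
        return settings["index.mode"]
    return settings.get("index", {}).get("mode")


def find_dimension_fields(properties: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return dotted paths of fields mapped with time_series_dimension: true."""
    found: List[str] = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if spec.get("time_series_dimension") is True:
            found.append(path)
        # Object fields keep their children under a nested "properties" key.
        if "properties" in spec:
            found.extend(find_dimension_fields(spec["properties"], f"{path}."))
    return found


if __name__ == "__main__":
    # Hypothetical stand-in for the "template" section of an exported template.
    exported = {
        "settings": {"index": {"number_of_shards": "1"}},
        "mappings": {
            "properties": {
                "host": {"properties": {"name": {"type": "keyword"}}},
            }
        },
    }
    print("index.mode:", index_mode(exported["settings"]))
    print("dimensions:", find_dimension_fields(exported["mappings"]["properties"]))
```

A template with no index mode of `time_series` and an empty dimension list is not set up for TSDS.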

Thanks in advance!

miltonhultgren · September 6, 2023, 12:39pm

As far as I know, this isn't a use case that would be supported (though it may be technically possible). Metricbeat doesn't currently use TSDB*.

One would have to establish all of the dimensions for the fields in the modules being used. Looking at how those dimensions are specified in the matching Integration (where available) would give some hints but I don't think that will be easy to manage nor do I think that's on the roadmap to be done.
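To give a sense of what establishing the dimensions involves, here is a rough sketch in Python that assembles a composable index template body for a time series data stream. The index pattern, the priority, and the choice of `host.name` as a dimension and `system.cpu.total.pct` as a gauge are illustrative assumptions, not a vetted mapping for any Metricbeat module; the helper names are hypothetical too.

```python
import json
from typing import Any, Dict, List


def nest_field(properties: Dict[str, Any], dotted_name: str, spec: Dict[str, Any]) -> None:
    """Insert a dotted field name into a nested mappings "properties" tree."""
    *parents, leaf = dotted_name.split(".")
    node = properties
    for part in parents:
        node = node.setdefault(part, {"properties": {}})["properties"]
    node[leaf] = spec


def build_tsds_template(
    index_pattern: str, dimensions: List[str], metrics: Dict[str, str]
) -> Dict[str, Any]:
    """Build a template body; metrics maps field name to "gauge" or "counter"."""
    properties: Dict[str, Any] = {}
    for field in dimensions:
        # Dimensions identify a time series; keyword is the usual type.
        nest_field(properties, field, {"type": "keyword", "time_series_dimension": True})
    for field, kind in metrics.items():
        field_type = "long" if kind == "counter" else "double"
        nest_field(properties, field, {"type": field_type, "time_series_metric": kind})
    return {
        "index_patterns": [index_pattern],
        "data_stream": {},
        "priority": 500,  # assumption: high enough to win over default templates
        "template": {
            "settings": {
                "index.mode": "time_series",
                # Documents are routed to shards by their dimension values.
                "index.routing_path": list(dimensions),
            },
            "mappings": {"properties": properties},
        },
    }


if __name__ == "__main__":
    body = build_tsds_template(
        "metricbeat-tsds-*",  # hypothetical pattern
        dimensions=["host.name"],
        metrics={"system.cpu.total.pct": "gauge"},
    )
    # Request body for PUT _index_template/<name>.
    print(json.dumps(body, indent=2))
```

The hard part is not the template shape but picking a correct and complete set of dimensions for every metricset in use, since documents that share a timestamp and all dimension values are treated as duplicates.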

Can I ask about what's driving your thoughts around moving back to running Beats directly over using Elastic Agent?

jerrac · September 6, 2023, 3:36pm

Ah. I was afraid of that. :\

As far as beats vs. agent, I've had multiple issues with Agent filling up the /opt filesystem, which would then cause issues with the actual apps we were running on those servers. (Side note: I've actually run into issues due to some piece of the ELK stack not checking disk space multiple times. Had to wipe out an FS snapshot repo due to that, and have had Kibana not allow logging in due to disk space before.)

You also used to not be able to upgrade Agent via Fleet en masse without a subscription, though I think that has changed, since I was able to do so yesterday. Beats are fairly easy to manage with Ansible, which I used to do before Agent existed, so switching back seemed like a good idea.

I also have mixed feelings about managing configuration via the Fleet UI. It's somewhat handy, but having to configure every single integration for every single variation of a policy is painful. (As in, some servers run apache, some don't, but I have to configure the system integration on both the policy with apache, and any policies without apache. I should be able to just configure the system integration once and then have any policies that need it use the defaults I set.)

With Beats, I'd have everything in Ansible. That makes configuring the same thing on multiple instances easy.

Also, why create a new version of the policy on every tiny change and then deploy that to every agent? Why not wait until I've finished adjusting the entire policy and have said to deploy?

I haven't really looked at Agent's standalone mode. But it's always felt like it isn't how Agent is supposed to be used and I'd run into issues because of that.

Then there's the way Agent stores its configuration on each host. If you have to debug an integration and want to go into where the actual configuration lives to see it, well, I'm not actually sure you can. Every time I've tried, I haven't been able to figure out where the configuration for a module/integration actually is. A while back I was trying to debug some issues with the Docker integration and I can't remember if I ever did find the actual config for it...

Anyway, I had moved to Agent because I had hoped we'd be able to actually subscribe down the road and it was obvious Agent was where Elastic was heading. Recent issues here at work mean that isn't going to happen, and even the minimal usage we have on my long term dev stack is eating up more resources than we can afford. So I'm trying to force the ELK Stack to work in ways it isn't designed to. I've been testing using Beats directly to see if that might help with that.

Though, so far, getting the stack to work the way I want has not worked... Everything is built assuming everything is working the "Elastic" way. There isn't much flexibility to change how things work unless you take on massive amounts of configuration. Like what it would take to make TSDS work with Metricbeat. :\

Anyway, some of what I said is probably a bit more negative than it should be. I've just been rather frustrated with some of that for a while and it might have colored my thoughts a bit more than it should.

What I really want is for Elastic to release a stripped down version that is only metrics and logs aggregation. Strip everything else out and make it small enough to run a single node on very few resources. For small dev shops or small community colleges (like where I work), that would be incredibly helpful. That's basically what I've been trying to configure for the past few weeks.

Edit: A more encouraging thing is that the switch to TSDS for metrics is a huge space saver. I hadn't updated my integrations in a while, so I just recently did that along with the 8.9.1 upgrade. Looking at my indices for a couple of metrics data streams, the TSDS versions use roughly 1/5th of the space of the old ones. So that could help us out a lot! Kudos to Elastic for that.

Nima_Rezainia · September 11, 2023, 12:05pm

Thank you for your candid feedback @jerrac. They are all very valid points and I wanted to address them if I may.

We strive to enhance Elastic Agent, both in Fleet managed and Standalone mode. Beats are obviously supported and free to use and address specific use cases.

Regarding some of the points raised here:

If you do try the agent again, please have a look at the upgrade logic via Fleet. This is part of the standard subscription. There's currently a mistake on our website, and we will address this discrepancy. There really is only one licensed feature within Fleet at the moment.
A very valid point about having to configure every single integration, and one that we have heard from other users also. "Reusable Integrations Policies" is an item on our roadmap that would allow the user to create a policy (with some integrations) and reuse (or inherit) that policy in other parent policies. This would allow the operator to have one copy of the integrations configuration and apply it to many other agent policies.
Standalone agent is certainly a viable alternative to Beats. Constructing the configuration file has been a bit onerous, and we are working on simplifying the configuration and finding better ways to provide users with the initial configuration to use with the Standalone agent.
Delaying distribution of the policy until all the changes to it are complete is a good idea. I'll see if we can put that on the roadmap for development.
Fleet managed agents store their configuration on the local agent, but it is encrypted and not human readable. You can issue "elastic-agent inspect" on the local agent to extract the config if needed. You can also use the Fleet APIs to extract policy configurations.

Again thanks for your feedback.

jerrac · September 12, 2023, 3:17pm

Thanks for the reply. I'm glad to hear improving the configuration process is on the roadmap.

I'll also give the inspect command a try. I noticed it the other day, but haven't actually used it yet. And when I've been troubleshooting, it hasn't come up. Though the last major session where I was trying to dig into the config was months ago so...

On my end, I've mostly settled on using Agent for metrics to get the benefits of TSDS, and Filebeat for logs to get the benefit of Docker Swarm hint based auto-discovery. I'll be deploying my current work to test/staging today or tomorrow, then pointing some of our test/dev stuff at it. We'll see if my work actually does what I hope.

system · October 10, 2023, 5:18pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
