Best practice for enriching logs with environment names

I want to enrich logs by adding service information (Environment specifically). "Add a service name to logs" seems like what I want, but none of the options seem ideal.

  • Ideally I would have an agent policy for (say) the web tier, then add the environment at the agent level. It seems I can’t do this with Fleet and Elastic Agent. Our hosts are VMs; it looks like I might have been able to do something if we ran containers, but we don’t.
  • Integrations are included in the agent policy, so I can’t add the field here without expanding the agent policies to (almost) one per host.
  • I don’t think I am using Filebeat; I am using filestream. But this has the same issue in that it is part of an integration, so it would expand the number of agent policies.
  • add_fields processor - This has the same issue as above, unless we can set a variable at the agent level, which we don’t seem to be able to do with Fleet.
  • The logs don’t have the environment name in them.

So it seems that my only options are to have one agent policy per tier per environment, which is basically one per host. Or to do something very clever with an ingest pipeline and map hostnames or filenames to environment names. I haven’t investigated this, but it seems somewhat complex and fragile.

But surely I am missing something. Doesn’t everyone have this problem? Nobody would use observability if this were the case.

I have searched and found some people asking similar questions, but the solutions generally seem to involve not using Fleet. Do I just need to create loads of agent policies?

Hello,

I have done that via agent policies, so every environment gets its own policy. Maybe you can aggregate a bit: if you are doing nothing but collecting logs in the policy, you could merge all services into one policy per environment and set the monitored paths to the log paths of all your apps. The agent will only use the ones it can find. Then you could separate the different services based on the log path, or something like that, in an ingest pipeline.

The second way I can think of is a bit more clunky: you could create a processor in the ingest pipeline that parses the logs to set the environment based on the hostname. Maybe you even have a hostname pattern you can match (e.g. all development server names contain DEV, like APPDEV01 or similar)?
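A minimal sketch of that hostname-matching idea, using `set` processors with Painless `if` conditions. The pipeline name, the `service.environment` target field, and the DEV/PRD hostname markers are all assumptions you would adapt to your own naming scheme:

```json
PUT _ingest/pipeline/set-environment-from-hostname
{
  "description": "Sketch: derive service.environment from a hostname pattern",
  "processors": [
    {
      "set": {
        "field": "service.environment",
        "value": "development",
        "if": "ctx.host?.name != null && ctx.host.name.toLowerCase().contains('dev')"
      }
    },
    {
      "set": {
        "field": "service.environment",
        "value": "production",
        "if": "ctx.host?.name != null && ctx.host.name.toLowerCase().contains('prd')"
      }
    }
  ]
}
```

The `ctx.host?.name` null-safe check keeps the processor from failing on documents that don't carry a hostname.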

Those are the two ways I can think of.

BR

hello,

We added an environment variable to the elastic-agent service (on Linux, Debian 12, the file /etc/sysconfig/elastic-agent contains APP=myapp). Then we picked this variable up in the integration's processors with add_labels.
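A sketch of what that might look like, assuming the elastic-agent service reads the environment file shown in the post and that the agent's env provider exposes the variable as `${env.APP}` (the provider syntax is my assumption; verify it against your agent version):

```yaml
# /etc/sysconfig/elastic-agent  (environment file read by the elastic-agent service):
#   APP=myapp
#
# Processor added to the integration's advanced/processors setting in the agent policy.
# add_labels puts the value under labels.app on every event the integration produces.
- add_labels:
    labels:
      app: ${env.APP}
```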

Mk

Hello and welcome,

In each agent policy you can add a custom field, and you can configure it to get its value from an environment variable, as you can check in the answer to this similar post.

So you could combine an environment variable with the custom field, and if you want to add multiple tags, you can also combine it with an ingest pipeline.

For example, assume you want to add multiple pieces of information; you can do something like this:

Create an environment variable on the host:

HOST_ENV="env1|env2|env3|envN"

Add this as a custom field in your policy, so you will have a field with this value:

{
    "custom_field": "env1|env2|env3|envN"
}

You can then use the split processor on an ingest pipeline to split the multiple values into an array and end up with something like this:

{
    "custom_field": ["env1","env2","env3","envN"]
}

Then you are able to filter based on each one of the values in the array.
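The split step above could be a one-processor pipeline like this sketch (the pipeline name is illustrative; note the `separator` is a regex, so the pipe character has to be escaped):

```json
PUT _ingest/pipeline/split-custom-field
{
  "description": "Sketch: turn the pipe-delimited custom_field into an array",
  "processors": [
    {
      "split": {
        "field": "custom_field",
        "separator": "\\|"
      }
    }
  ]
}
```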


Thanks, this looks like what @michel_kessler was suggesting, but it gives more detail. I also tried what @Shaoranlaos suggested, and that worked as well. I will need to process the filename for the databases which share a host (and therefore an agent), but I do prefer the environment variable to mangling the hostname for our setup.