Elastic Endpoint Integration - Large amount of Log data ingested


I have been trialing Elastic Cloud for EDR tooling on some Linux systems, running Elastic Endpoint in two environments with the hope of scaling up later. After running for two weeks, the volume of logs ingested by this process is extremely large, which makes it look like a very costly solution.

First, I was wondering whether the volume of logs ingested via this data stream should be this large for a span of around two weeks. Secondly, when looking at the logs, should the message field be as generic as just 'endpoint process event' or 'endpoint file event', or should it contain more information? Lastly, are event filters the only way to reduce ingest volume? Is there documentation detailing best practices for reducing log size?
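For reference, this is roughly how I've been checking the size of the Endpoint data streams, via the `_cat/indices` API in Kibana Dev Tools (a sketch; the backing-index names in your cluster will differ):

```
GET _cat/indices/.ds-logs-endpoint.events.*?v&h=index,docs.count,store.size&s=store.size:desc
```

This lists each Endpoint events backing index with its document count and on-disk size, largest first.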

How many hosts are you collecting data from and what kind of hosts are those? Desktops? Servers?

Hi, I'm currently collecting from two different Linux EC2 instances hosted on AWS.

But what are those EC2 instances doing? Also, do those sizes include replicas? I'm not sure.

I do not use the Elastic Endpoint integration, but I use similar tools, and depending on the function of the server they can generate a large volume of logs per day.

For example, if those servers are exposed to the internet and the ACLs/Security Groups are not correctly configured, they may be receiving a lot of connection attempts, which could generate a lot of processes being created and terminated.

Do you have any processes that generate more events than others?

Hi @Seas39 ,

I'm a Product Manager on the Linux Platform related integrations in Elastic Security. To provide you with the best possible assistance, I'd appreciate some additional information:

  • Elastic Stack version
  • Linux distribution and kernel version on those EC2 instances in AWS
  • Most importantly, what are the workloads running on those EC2 instances? (as @leandrojmp already mentioned)
  • Do you remember seeing the Elastic Defend/Endpoint configuration options below while onboarding? If yes, which options were chosen?

On a heavily loaded Linux box, it's common to see a huge volume of ingested data. On average, without any event filters, around 1 GB per day per instance is typical, though it depends entirely on your workload. As a first step to understanding your workload, I'd recommend opening the Discover page under the Analytics section in Kibana, selecting the endpoint process index, and, in Field statistics, looking at the process.name ECS field to identify the top values / big hitters in your environment. Sample screenshot (I'm using a different index in this example) below.
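The same "top values" view can also be pulled directly with a terms aggregation from Kibana Dev Tools (a sketch; I'm assuming the default `logs-endpoint.events.process-*` data stream pattern here):

```
GET logs-endpoint.events.process-*/_search
{
  "size": 0,
  "aggs": {
    "top_processes": {
      "terms": { "field": "process.name", "size": 10 }
    }
  }
}
```

The `size: 0` skips returning individual documents, so the response contains only the ten busiest process names by event count.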

This will actually help you answer the last question (do you have any processes that generate more events than others?) asked by @leandrojmp.

Once the big-hitter process is identified, we need to collect additional metadata from those events to create a granular and efficient filter. Hope this helps.
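As an illustration only (the executable path here is hypothetical), an event filter created under Kibana → Security → Manage → Event Filters for a noisy but trusted process would be a condition roughly like:

```
Field:    process.executable
Operator: is
Value:    /usr/bin/noisy-data-processor
```

Matching events are then dropped on the endpoint before they are ever shipped, which is what actually reduces ingest volume; scoping the filter to the executable path rather than just the process name keeps it harder to abuse.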



Hi thanks for the help,

In answer to the questions you raised: the Elastic Stack deployment version is v8.8.0.
The Linux distribution and kernel version for those EC2 instances are Ubuntu 20.04.6 with kernel Linux 5.15.0-1040-aws.

As for the workloads: the instances are private hosts (not exposed to the internet) running data-processing software. They do not produce vast quantities of system logs, but there is likely a significant amount of network traffic.

As for the configuration settings, I can't remember exactly; I think it was probably Complete EDR.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.