Some Agents Do Not Send Data with Custom Logs (Filestream)

I've created a cronjob that runs a script and writes ndjson output to a file. Then I created an integration policy that reads the log file using the Custom Logs (Filestream) integration and added it to the agent policies.

Some of the agents sent the data, but most of them didn't. I've checked the file permissions and the paths used in the cronjob; all agents use the same content. There are no error entries in the agents' logs, by the way.

Thanks for your help.

This can happen for various reasons. Check the following one by one and let me try to help you.

  1. Does the data exist in the defined path? Please try to add a test file and see if it's ingested.
  2. Is the data older than 72 hours? Check the ignore_older parameter in the integration. If it's set, the data must be newer than that for the beat to harvest it.
  3. Make sure there are no index rejections by checking GET /_stats?filter_path=**.index_failed (see the example below).
  4. Double-check the Elasticsearch logs and the agent logs to see if there is any explanation for the rejection.
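
For point 3, a quick way to run that check from a shell; the host and credentials here are placeholders, so adjust the auth to your cluster:

curl -sk -u elastic:<password> "https://your-es-host:9200/_stats?filter_path=**.index_failed"

Any non-zero index_failed counter means documents are being rejected at index time.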

The most important question, though, is this one: what is the difference between those agents?

  • Yes, the data exists and is in the correct format for all agents.
  • New data is generated by the script hourly, so it's younger than 72 hours. I've checked this as well.
  • ignore_older wasn't set; however, I've set it to 0 just in case.
  • I've checked for index rejections; there are none.

I've checked the agent logs located in /opt/Elastic/Agent/data/elastic-agent-*/logs; some of them have written error logs, some of them haven't.

The crontab:

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
0 * * * * bash /usr/local/bin/script.sh | tee /var/log/mylogs/log.ndjson

Note: I've configured the parser in the integration policy:

- ndjson:
    target: ""
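
For reference, a slightly fuller version of that parsers block; add_error_key is an optional ndjson parser setting I haven't enabled, but it would attach an error.message to any event whose line fails to decode:

- ndjson:
    target: ""
    add_error_key: true   # surface JSON decoding failures on the ingested event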

Sample log:

{"field1":"value1","field2":"101","field3":"somestr","field4":"running","field5":12,"field6":32768,"field7":100}
{"field1":"value1","field2":"103","field3":"somestr","field4":"running","field5":4,"field6":16384,"field7":200}
{"field1":"value1","field2":"119","field3":"somestr","field4":"stopped","field5":4,"field6":8192,"field7":50}
{"field1":"value1","field2":"120","field3":"somestr","field4":"stopped","field5":4,"field6":8192,"field7":50}
{"field1":"value1","field2":"121","field3":"somestr","field4":"stopped","field5":4,"field6":8192,"field7":50}
{"field1":"value1","field2":"122","field3":"somestr","field4":"running","field5":12,"field6":32768,"field7":250}
{"field1":"value1","field2":"125","field3":"somestr","field4":"running","field5":8,"field6":32768,"field7":100}
{"field1":"value1","field2":"129","field3":"somestr","field4":"running","field5":4,"field6":8192,"field7":100}

Sample error logs:

jq 'select(."log.level" == "error" and .message != " ") | .message' elastic-agent*|sort -u
"2025-09-22 07:56:29: debug: Exec.cpp:189 ChildMonitor is pid 3734993 and monitoring pids 3734922 and 3734971"
"2025-09-22 07:56:29: debug: ProcFile.cpp:855 Found 1 cgroups for pid(3734922)"
"2025-09-22 07:56:29: debug: ProcFile.cpp:861 cgroup: id=0 type= path=/system.slice/elastic-agent.service"
"2025-09-22 07:56:29: info: InstallLib.cpp:610 Running [/opt/Elastic/Endpoint/elastic-endpoint] [version --log stdout]"
"2025-09-22 07:56:29: info: InstallLib.cpp:650 Installed endpoint is expected version (version: 8.17.3, compiled: Wed Feb 26 21:00:00 2025, branch: HEAD, commit: e54b5de09796d1b3601f7d5472359c11fafafc67)"
"2025-09-22 07:56:29: info: MainPosix.cpp:389 Verifying existing installation"
"Error dialing EOF"
"Error dialing read tcp xx.xx.xx.xx:38084->xx.xx.xx.xx:9200: read: connection reset by peer"
"Error dialing read tcp xx.xx.xx.xx:38090->xx.xx.xx.xx:9200: read: connection reset by peer"
"Error dialing read tcp xx.xx.xx.xx:49014->xx.xx.xx.xx:9200: read: connection reset by peer"
"Error dialing read tcp xx.xx.xx.xx:50976->xx.xx.xx.xx:9200: read: connection reset by peer"
"Error dialing read tcp xx.xx.xx.xx:50998->xx.xx.xx.xx:9200: read: connection reset by peer"
"Error dialing read tcp xx.xx.xx.xx:51204->xx.xx.xx.xx:9200: read: connection reset by peer"
"Error dialing read tcp xx.xx.xx.xx:56974->xx.xx.xx.xx:9200: read: connection reset by peer"
"Error dialing read tcp xx.xx.xx.xx:58798->xx.xx.xx.xx:9200: read: connection reset by peer"
"Exiting: context canceled"
"failed accept conn info connection: accept unix /opt/Elastic/Agent/.eaci.sock: use of closed network connection"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": EOF"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": EOF"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": read tcp xx.xx.xx.xx:49014->xx.xx.xx.xx:9200: read: connection reset by peer"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": EOF"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": read tcp xx.xx.xx.xx:38090->xx.xx.xx.xx:9200: read: connection reset by peer"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": read tcp xx.xx.xx.xx:58798->xx.xx.xx.xx:9200: read: connection reset by peer"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": EOF"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": read tcp xx.xx.xx.xx:50976->xx.xx.xx.xx:9200: read: connection reset by peer"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": read tcp xx.xx.xx.xx:50998->xx.xx.xx.xx:9200: read: connection reset by peer"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": read tcp xx.xx.xx.xx:51204->xx.xx.xx.xx:9200: read: connection reset by peer"
"Failed to connect to backoff(elasticsearch(https://xx.xx.xx.xx:9200)): Get \"https://xx.xx.xx.xx:9200\": read tcp xx.xx.xx.xx:56974->xx.xx.xx.xx:9200: read: connection reset by peer"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": EOF"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": EOF"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": EOF"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": read tcp xx.xx.xx.xx:38084->xx.xx.xx.xx:9200: read: connection reset by peer"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": EOF"
"failed to perform any bulk index operations: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": EOF"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": EOF"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": EOF"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": read tcp xx.xx.xx.xx:38084->xx.xx.xx.xx:9200: read: connection reset by peer"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": EOF"
"failed to publish events: Post \"https://xx.xx.xx.xx:9200/_bulk?filter_path=errors%2Citems.%2A.error%2Citems.%2A.status\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
"runOsquery exited with error: context canceled"

By the way, I've sent a request to the ES node from the agent using curl, so the connection errors may be temporary.
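
The check was essentially this (credentials and host are placeholders); a 200 response here only proves basic reachability from the agent host to port 9200:

curl -sk -u <user>:<password> "https://xx.xx.xx.xx:9200/" -w '\nHTTP %{http_code}\n'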

Hi @lomar

What version of the agent, Elastic Stack, and integrations are you running?
How did you install?
Do you use self-generated certs?

That does not necessarily mean the agent is connecting correctly

Have you run

./elastic-agent status

Have you run and looked at the output in detail
./elastic-agent inspect
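
One way to check that the filestream input even made it into the running policy (a sketch, assuming the default Linux install path):

sudo /opt/Elastic/Agent/elastic-agent inspect | grep -B 2 -A 8 "type: filestream"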

It is hard to tell from your sampled error messages, but it sure looks like a connectivity issue.


Yes. But other integrations assigned to the agent's policy do send logs, so I don't think it's caused by SSL.

β”Œβ”€ fleet
β”‚  └─ status: (HEALTHY) Connected
└─ elastic-agent
   └─ status: (HEALTHY) Running

Yes, I've checked these sections; they're similar to those of other agents that work well and share the same policy. I'll share it in the next post due to the character limit.

Elastic agent version: 8.17.3
Elasticsearch node versions: 9.1.3
Custom Logs File Stream Integration version: 1.1.0

agent:
  download:
    sourceURI: https://artifacts.elastic.co/downloads/
  logging:
    level: info
  monitoring:
    enabled: true
    logs: true
    metrics: true
    namespace: default
  protection:
    enabled: false
fleet:
  enabled: true
  hosts:
    - <REDACTED_FLEET_SERVER>
  ssl:
    verification_mode: none
  timeout: 10m0s
host:
  os: linux
  osinfo:
    family: debian
    version: 13 (trixie)
inputs:
  - name: Auditd Logs
    type: logfile
    streams:
      - dataset: auditd.log
        paths:
          - /var/log/audit/audit.log*
  - name: System Audit
    type: audit/system
    streams:
      - dataset: system_audit.package
        period: 15m
  - type: logfile
    streams:
      - dataset: system.auth
        paths:
          - /var/log/auth.log*
          - /var/log/secure*
      - dataset: system.syslog
        paths:
          - /var/log/messages*
          - /var/log/syslog*
  - name: Windows Event Logs
    type: winlog
    streams:
      - dataset: system.application
      - dataset: system.security
      - dataset: system.system
  - name: System Metrics
    type: system/metrics
    streams:
      - dataset: system.cpu
      - dataset: system.memory
      - dataset: system.network
      - dataset: system.process
      - dataset: system.uptime
  - name: Journald Logs
    type: journald
    streams:
      - dataset: system.auth
      - dataset: system.syslog
  - name: Endpoint Security
    type: endpoint
    meta:
      package: endpoint
      version: 9.1.0
    policy:
      linux:
        malware:
          mode: detect
      windows:
        malware:
          mode: detect
  - name: Osquery Manager
    type: osquery
  - name: Server VM Inventory
    type: filestream
    streams:
      - dataset: vm.inventory
        paths:
          - /var/log/mylog/mylog.ndjson
outputs:
  default:
    type: elasticsearch
    hosts:
      - <REDACTED_ES_HOST_1>
      - <REDACTED_ES_HOST_2>
      - <REDACTED_ES_HOST_3>
      - <REDACTED_ES_HOST_4>
    ssl:
      ca_trusted_fingerprint: <REDACTED>
revision: 13
runtime:
  arch: amd64

On the agent host, there's a directory called events under the logs directory. Did you check in there? That's where it'll show issues with actually sending the data.

Please look in there, and if you share the logs, please don't use jq or similar tools to parse them; look for errors and share those events in raw form.

Example

/opt/Elastic/Agent/data/elastic-agent-9.2.0-SNAPSHOT-c4b645/logs/events
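
A quick way to see whether the filestream input ever touched your file is to grep those event logs for its path (filename taken from the policy you shared):

sudo grep -r "mylog.ndjson" /opt/Elastic/Agent/data/elastic-agent-*/logs/events/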

Can you stop and start the agent and then share the full latest one of the regular logs (not the events directory)?

A quick glance at the event log just shared: there is no reference at all to the log file you're trying to harvest.

So that leads me to think it's failing to access that path, or something like that.

Assuming you checked basic things like permissions for reading the file...

How many lines are you adding to the file ... just 1? 100s?

If you only add 1 line and do not end it with a newline, it will not be read.
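
For example, appending a test line that is guaranteed to end with a newline (the JSON content is just a placeholder; the path is the one from your policy):

printf '%s\n' '{"host_type":"test","vm_id":"999"}' >> /var/log/mylog/mylog.ndjson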

There is something basic going on; Custom Logs is used widely.

Try another file... try another path

- /var/log/mylog/*

Here are the logs after restarting the agent:

I've already tried this.

{"host_type":"Proxmox","vm_id":"101","vm_name":"vm-1","vm_state":"running","cpu_count":12,"memory_mb":32768,"disk_gb":100}
{"host_type":"Proxmox","vm_id":"103","vm_name":"vm-2","vm_state":"running","cpu_count":4,"memory_mb":16384,"disk_gb":200}
{"host_type":"Proxmox","vm_id":"119","vm_name":"vm-3","vm_state":"stopped","cpu_count":4,"memory_mb":8192,"disk_gb":50}
{"host_type":"Proxmox","vm_id":"120","vm_name":"vm-4","vm_state":"stopped","cpu_count":4,"memory_mb":8192,"disk_gb":50}
{"host_type":"Proxmox","vm_id":"121","vm_name":"vm-5","vm_state":"stopped","cpu_count":4,"memory_mb":8192,"disk_gb":50}
{"host_type":"Proxmox","vm_id":"122","vm_name":"vm-6","vm_state":"running","cpu_count":12,"memory_mb":32768,"disk_gb":250}
{"host_type":"Proxmox","vm_id":"125","vm_name":"vm-7","vm_state":"running","cpu_count":8,"memory_mb":32768,"disk_gb":100}
{"host_type":"Proxmox","vm_id":"129","vm_name":"vm-8","vm_state":"running","cpu_count":4,"memory_mb":8192,"disk_gb":100}

All log files across the agents have the same format and similar output. I'm already using the same cronjob, script, and log path on all of them. I checked all of them and there are no differences.

I've already mentioned this; all permissions are correct. If that weren't true, it wouldn't work on the other agents either.

By the way, the agent runs with root permissions, so file permissions shouldn't be a problem either.
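
For anyone checking the same thing, namei (from util-linux) walks every directory on the path and shows its ownership and permissions in one go:

namei -l /var/log/mylog/mylog.ndjson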

Looked through... I do not see any messages about that filestream collector at all... not sure what to tell you. I would turn up the log level to DEBUG, which it currently is not... do you know where to do that? It's kinda hidden.

Do that then look through all the logs again...

With debug you should see it start the collector for that file...

I would turn off everything else, if you can, to isolate the issue, but that is up to you...

Add some different paths, add some different files, use wildcards.

Remove the integration and put it back.

I notice that you set the data stream to something specific... vm.inventory. Assuming you actually looked in the correct data stream, I would remove it and add it back with all the defaults, etc.
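
With the Custom Logs (Filestream) integration, that dataset should end up in a data stream named roughly logs-vm.inventory-default (assuming the default namespace), so a direct query shows whether anything arrived at all:

GET logs-vm.inventory-default/_search?size=1&sort=@timestamp:desc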

Not sure what to tell you... Custom Logs is used widely; it's something simple/basic at this point...

There sure are a lot of connection errors.... I don't usually see that.
Lots of errors trying to write
Lots of errors trying to connect to fleet

Hi, I couldn't find the debug logs, but I created a new integration policy and tried reading the same log file. I encountered the same issue, but when I ran the following command, the log appeared.


cat vm_inventory.ndjson | tee -a /vm_inventory.ndjson

Of course, doing this created duplicate data. But at least the data is coming through. Before that, I tried running the echo "" | tee /vm_inventory.ndjson command to write logs to the file, but that didn't work.
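
For clarity, the difference between the two commands I compared (same paths as above): plain tee truncates the target file before writing, while tee -a appends to it.

echo "" | tee /vm_inventory.ndjson                       # truncates, leaves only an empty line
cat vm_inventory.ndjson | tee -a /vm_inventory.ndjson    # appends the full contents again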

echo "" | tee /vm_inventory.ndjson that writes nothing to a file in the root directory not sure why you would expect that to add lines to the log file.

So I am confused... is it reading the logs? Are you actually writing to the path the integration is reading?

Your integration is set to

    streams:
      - dataset: vm.inventory
        paths:
          - /var/log/mylog/mylog.ndjson

and your commands are writing to
/vm_inventory.ndjson

I am confused... why not just point it at an actual log file... or just concatenate into /var/log/mylog/mylog.ndjson?
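
i.e., a crontab entry along these lines, appending straight into the path the integration already watches (a sketch; keep whichever directory you actually use, and rotate or trim the file deliberately if it grows too large):

0 * * * * bash /usr/local/bin/script.sh >> /var/log/mylog/mylog.ndjson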