Need to Ingest Small Log Files in Real-Time

Hello everyone,

I am currently working with Elasticsearch v8.17.3 using the Custom Logs (Filestream) integration via Elastic Agent.

I have encountered a problem:

  • Filestream only starts ingesting a file when it is larger than ~1 KB.
  • I found in the documentation that this can be adjusted via prospector.scanner.fingerprint.length (minimum 64 bytes), so I changed it to 64 bytes.

However, I now face two issues:

  1. The change doesn’t seem to work — even when my file is over 64 bytes, ingestion doesn’t start until it reaches around 1 KB. I’m confused about the exact syntax/IDs needed to set this correctly in the Elastic Agent integration policy (Fleet). Could someone provide a working example?
  2. My use case requires real-time ingestion of every new log line (even ~20 bytes), because I use these logs to trigger detection rules. Even 64 bytes is much larger than I need — ideally, ingestion should happen immediately when a new line is added.

Questions:

  • Is there any way in v8.17.3 to bypass the 1 KB file-size threshold so Filestream starts reading immediately?
  • If not, is there a supported workaround that can ingest small logs in real time?
  • Could someone share the exact policy configuration (with correct field names) to change the fingerprint length in Fleet-managed Elastic Agent?

This is urgent for me — my Master’s thesis depends on getting real-time log ingestion working. Any advice or alternative solution would be greatly appreciated.

Thanks in advance!

Hi @Nourgr23,

If you’re using a Fleet-managed Elastic Agent, you don’t need to write the policy manually.

What you need to do is enable the native file identity, which identifies files by their inode and device ID rather than by their content, allowing files of any size to be ingested.

For that to work, you will need the latest version of the integration, 1.2.0. When adding or editing the integration, look for “File identity: Native” and enable it. Also make sure to disable “File identity: Fingerprint”.

Once the new policy is deployed to your Elastic Agent(s), you’ll be able to ingest files of any size.


Thank you so much, @TiagoQueiroz. It works, and it is very helpful! I noticed that when we use the Native mode, some files are re-ingested. Is there a way to avoid that? Thanks in advance.

This can happen if the inode or device ID (or the equivalent in your file system) changes. We introduced the fingerprint file identity to solve this problem: instead of relying on the file system to provide a unique and stable identifier for each file, we generate a fingerprint from the file’s content. The downside is that a file must reach a minimum size before the fingerprint can be generated and ingestion can start.
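To make the trade-off concrete, here is a minimal Python sketch of the two identity strategies. It is only an illustration, not Filestream's actual code: the function names `fingerprint_identity` and `native_identity` are hypothetical, and it assumes the fingerprint is a SHA-256 hash over the first `length` bytes of the file.

```python
import hashlib
import os
import tempfile


def fingerprint_identity(path, offset=0, length=1024):
    """Content-based identity: hash `length` bytes starting at `offset`.

    Returns None when the file does not yet hold enough bytes, which is
    why a small file has no identity (and is not read) in this mode.
    Hypothetical helper; the real implementation may differ.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read(length)
    if len(chunk) < length:
        return None
    return hashlib.sha256(chunk).hexdigest()


def native_identity(path):
    """File-system-based identity: (device ID, inode), available at any size."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "small.log")
        with open(path, "wb") as f:
            f.write(b"login failed user=x\n")  # a single short log line

        # Too small for a 64-byte fingerprint, so no content-based identity yet.
        print("fingerprint:", fingerprint_identity(path, length=64))
        # The native identity exists as soon as the file exists.
        print("native:", native_identity(path))
```

With the fingerprint identity, the short file yields no identity until it grows past `length` bytes. With the native identity, it is identifiable immediately, but only for as long as the file system keeps the same inode and device ID for it.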

Whether re-ingestion can be avoided with the native file identity depends a lot on your setup.

  1. Could you give more details about the files being re-ingested?
  2. When are they re-ingested? Does your log rotation strategy copy files?
  3. Which file system is used?
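As background for the rotation question, the following Python sketch shows why copy-based rotation can matter for an inode-based identity. The helpers `file_id`, `rotate_by_rename` and `rotate_by_copy` are hypothetical and only mimic two common rotation styles. Whether this is what happens in your case depends on your rotation tool and on whether the rotated files match the configured paths.

```python
import os
import shutil
import tempfile


def file_id(path):
    """Return the (device ID, inode) pair that a native identity relies on."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def rotate_by_rename(path):
    """Rename-style rotation: the same file gets a new name."""
    rotated = path + ".1"
    os.rename(path, rotated)
    return rotated


def rotate_by_copy(path):
    """Copy-and-truncate rotation: content moves to a brand-new file."""
    rotated = path + ".1"
    shutil.copyfile(path, rotated)
    with open(path, "w"):
        pass  # truncate the original in place
    return rotated


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        renamed_log = os.path.join(d, "a.log")
        copied_log = os.path.join(d, "b.log")
        for p in (renamed_log, copied_log):
            with open(p, "w") as f:
                f.write("old line\n")

        before = file_id(renamed_log)
        print("rename keeps id:", file_id(rotate_by_rename(renamed_log)) == before)

        before = file_id(copied_log)
        print("copy keeps id:", file_id(rotate_by_copy(copied_log)) == before)
```

On a typical POSIX file system, the renamed file keeps its inode, while the copy is a new file with a new inode that holds old content. An inode-based identity would treat that copy as a file it has never seen.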