Enrich Processor Not Working with Ingest Pipeline + Fleet Data Streams

Hi everyone,

I'm building a small CTI platform where threat intelligence feeds (containing file hashes, IPs, etc.) are indexed into a custom index called tip_index. I want to correlate fields like threat.indicator.file.hash.md5 from this index with incoming logs from my Elastic Agent (which go into data streams like logs-*), using an enrich processor and ingest pipelines.

Goal

Automatically enrich incoming logs with threat intelligence data (e.g., match process.hash.md5 with known malicious md5 from tip_index) and store the match in a field like threat_match or similar — without using the preview ES|QL JOIN.

Steps I've Taken So Far

  1. Created an Enrich Policy:
PUT /_enrich/policy/file_hash_policy
{
  "match": {
    "indices": "tip_index",
    "match_field": "threat.indicator.file.hash.md5",
    "enrich_fields": ["threat.indicator.file.*"]
  }
}
  2. Executed the Policy:
POST /_enrich/policy/file_hash_policy/_execute
  3. Created the Ingest Pipeline:
PUT /_ingest/pipeline/enrich-file-hash
{
  "processors": [
    {
      "enrich": {
        "policy_name": "file_hash_policy",
        "field": "process.hash.md5",
        "target_field": "threat_match",
        "ignore_missing": true
      }
    }
  ]
}
  4. Connected the Pipeline to Log Ingestion:
  • Tried assigning the pipeline using an index template:
PUT /_index_template/logs_enrich_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.default_pipeline": "enrich-file-hash"
    }
  }
}
  • Also tried setting pipeline in Fleet:
    • Fleet > Settings > Edit Output > Advanced YAML:
pipeline: enrich-file-hash
  5. Verified that new logs are still not enriched with the expected threat_match field.

Problem

  • No threat_match field is showing up in my logs-* indices.
  • The pipeline doesn't appear to trigger.
  • The enrich policy executes correctly, and the .enrich-file_hash_policy index exists.
  • I suspect this is related to Fleet-managed data streams, or to the integrations not using the pipeline.

Questions

  1. How can I properly connect an enrich ingest pipeline to logs coming from Fleet-managed integrations (like System or Endpoint)?
  2. Is it possible to enrich documents in data streams like logs-* at ingest time, or do I need to restructure my pipeline?
  3. Can I attach enrich pipelines to Fleet agent logs without custom filebeat.yml?
  4. If I update the tip_index, will it automatically reflect in enrichment after re-executing the policy?

Any help or guidance would be greatly appreciated!

Thanks so much in advance,

Hello and welcome,

There are a few issues with what you are doing.

First:

PUT /_index_template/logs_enrich_template
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.default_pipeline": "enrich-file-hash"
    }
  }
}

You cannot do that; it will break a lot of things. You also cannot change index.default_pipeline for Elastic Agent integrations, because that breaks them.

You need to create an ingest pipeline named logs@custom and add your enrich processor there.
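As a rough sketch of that suggestion, the request below reuses the enrich processor from the question inside a pipeline named logs@custom. The description text is only illustrative, and it assumes the file_hash_policy enrich policy has already been created and executed.

```
# Sketch only: the processor is copied from the question's enrich-file-hash pipeline
PUT /_ingest/pipeline/logs@custom
{
  "description": "Custom hook for logs data streams: enrich file hashes from tip_index",
  "processors": [
    {
      "enrich": {
        "policy_name": "file_hash_policy",
        "field": "process.hash.md5",
        "target_field": "threat_match",
        "ignore_missing": true
      }
    }
  ]
}
```

With this in place there should be no need for the logs_enrich_template index template or the pipeline setting in the Fleet output.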

On whether enrichment at ingest time is possible: in theory it is, but it may be so CPU intensive that it does not work in practice, or that you need expensive dedicated ingest nodes. This depends on many factors, and the only way to know is to test.

With Fleet-managed agents you do not have access to any YAML file to edit; everything has to be done through the UI. As mentioned, to have an enrich processor run for all integrations, you need to add it to the logs@custom ingest pipeline.
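Once a logs@custom pipeline containing the enrich processor exists, one way to check it without waiting for real agent data is the simulate API. The document below is a made-up test document, and the MD5 value is only a placeholder (the hash of an empty input); replace it with a hash that is actually present in tip_index.

```
# Sketch only: the sample document and hash value are placeholders
POST /_ingest/pipeline/logs@custom/_simulate
{
  "docs": [
    {
      "_source": {
        "process": {
          "hash": {
            "md5": "d41d8cd98f00b204e9800998ecf8427e"
          }
        }
      }
    }
  ]
}
```

If the hash is in the enrich index, the simulated document in the response should contain a threat_match field; if it is not, the document should come back unchanged.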

Yes, when you re-execute an enrich policy it creates a new .enrich-* index; it may take some time for the changes to be reflected.
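In practice this means that updates to tip_index are not picked up on their own: the policy has to be executed again after each update. A minimal sketch, using the policy name from the question:

```
# Rebuild the enrich index from the current contents of tip_index
POST /_enrich/policy/file_hash_policy/_execute

# Optional: look at enrich execution and coordinator statistics
GET /_enrich/_stats
```

How often to re-run the execute request depends on how often the feed changes; scheduling it from an external job is one option, though that part is an assumption about your setup rather than something Fleet does for you.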

Unfortunately enrichment during ingestion with Elasticsearch has some problems.

In this case it may be easier to do the enrichment at query time, using the new JOIN feature for ES|QL that you mentioned, or to create an indicator match rule.

Thanks a lot for the detailed explanation — that really helped clarify the limitations around enrich processors and Fleet-managed integrations.

I’ve decided to switch to using an indicator match rule instead, which seems to be a much more reliable and efficient approach for our use case. It's already looking more stable in our testing.

Appreciate your support and insights!