Logstash elastic integration

Hello,

I use Elastic Fleet and have configured a Logstash output for several integrations.
The Logstash pipeline contains the elastic_integration filter, so the Elastic ingest pipelines are also executed by Logstash. I assume this is correct?
The output of the logstash pipeline sends the data to our hot nodes.
However, after inspecting the data I noticed the following field and value in the index settings: "final_pipeline": ".fleet_final_pipeline-1"
This originates from the managed component template '.fleet_agent_id_verification-1'.
My question is: does Logstash also execute this pipeline? I read online that this ingest pipeline was not ported to Logstash, and that it can therefore only be
executed by Elasticsearch nodes with the ingest role, just before indexing...

Best regards
Christophe

When using the elastic_integration filter, the ingest pipelines are executed only by Logstash, not by Elasticsearch anymore.
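A minimal pipeline using the filter looks roughly like this; the hosts and api_key values are placeholders, not your actual settings:

```
input {
  elastic_agent {
    port => 5044
  }
}

filter {
  # Runs the integration's ingest pipelines inside Logstash
  elastic_integration {
    hosts   => ["https://es-host:9200"]  # placeholder
    api_key => "id:secret"               # placeholder
  }
}

output {
  elasticsearch {
    hosts       => ["https://es-host:9200"]  # placeholder
    api_key     => "id:secret"               # placeholder
    data_stream => true
  }
}
```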

Logstash will load the ingest pipeline definition for the integration from Elasticsearch, including any @custom pipeline added to the integration. After processing the data, it adds a piece of metadata telling Elasticsearch not to use any ingest pipelines; basically, it sets this field:

@metadata.target_ingest_pipeline = _none

In this case, Logstash does not execute this pipeline, as it is not part of the integration's pipeline definition.

If you check the same index settings, you will see another setting named index.default_pipeline. Its value is the integration pipeline name with the currently installed version; this is the pipeline that Logstash will load.
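You can inspect both settings with the get index settings API, for example from Kibana Dev Tools (the index pattern here is just an example):

```
GET logs-panw.panos-default/_settings/index.default_pipeline,index.final_pipeline
```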

The index.final_pipeline setting that points to .fleet_final_pipeline-1 is always executed after the default pipeline has finished, and it can only be executed by Elasticsearch nodes with the ingest role.

As an example, for the Palo Alto integration, you have these two settings:

"index.final_pipeline": ".fleet_final_pipeline-1"
"index.default_pipeline": "logs-panw.panos-5.4.0"

The index.default_pipeline is the pipeline that the elastic_integration plugin will load. It will also load all pipelines that are called by this one through the pipeline processor, so Logstash will load all of these pipelines:

logs-panw.panos-5.4.0
logs-panw.panos-5.4.0-audit
logs-panw.panos-5.4.0-authentication
logs-panw.panos-5.4.0-config
logs-panw.panos-5.4.0-correlated_event
logs-panw.panos-5.4.0-decryption
logs-panw.panos-5.4.0-globalprotect
logs-panw.panos-5.4.0-gtp
logs-panw.panos-5.4.0-hipmatch
logs-panw.panos-5.4.0-ip_tag
logs-panw.panos-5.4.0-sctp
logs-panw.panos-5.4.0-system
logs-panw.panos-5.4.0-threat
logs-panw.panos-5.4.0-traffic
logs-panw.panos-5.4.0-tunnel_inspection
logs-panw.panos-5.4.0-userid
logs-panw.panos@custom
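The chaining happens through the pipeline processor inside the top-level routing pipeline. Simplified and purely illustrative (this is not the actual integration definition), the processors look along these lines:

```
{
  "processors": [
    {
      "pipeline": {
        "if": "ctx.panw?.panos?.type == 'traffic'",
        "name": "logs-panw.panos-5.4.0-traffic"
      }
    },
    {
      "pipeline": {
        "name": "logs-panw.panos@custom",
        "ignore_missing_pipeline": true
      }
    }
  ]
}
```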

After it finishes processing the data, Logstash sends the events to Elasticsearch, which sees that it does not need to run the integration's default pipeline. However, since the index has a final_pipeline setting, Elasticsearch will run that pipeline before indexing the data.

Thank you for the clarification. The Logstash servers output the events to Elasticsearch on my hot nodes, and those nodes currently do not have the ingest role configured. I guess that role needs to be added to the hot nodes to prevent them from forwarding the data to yet another node that has the ingest role?

Yes, if you are using integrations you need to have at least one node with the ingest role.

You should configure your hot nodes to have the ingest role as well.
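Assuming your hot nodes declare their roles explicitly in elasticsearch.yml, adding ingest would look like this (adjust the list to the roles your nodes already have):

```
# elasticsearch.yml on the hot nodes
node.roles: [ data_hot, data_content, ingest ]
```

After changing the setting, a restart of each node is required for the new role to take effect.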