I have a custom integration set up with Elastic Agent. At the moment I have a directory with several log files containing quite different data. I can't use the same pipeline for all of them, so I would like to reference another pipeline. I am happy for all of these to share the same data stream, even if some fields will be different.
At the moment, in the custom log integration, I simply chose pipeline: logs-ams-worker.
This processes the worker logs and all is well. I also need to process a different log file with a different pipeline, e.g. logs-ams-node.
Is it possible, in the custom configuration where my pipeline is declared, to have a condition: a distributor, an if statement, or some other method? Something like:
if doc['source'].value == 'logs-ams-worker.log'
else if doc['source'].value == 'logs-ams-node.log'
else if doc['source'].value == 'logs-ams-other.log'
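For reference, the ingest pipeline processor supports exactly this kind of routing: a per-processor if condition that hands the document to another pipeline. A minimal sketch (the pipeline names come from the question above; using log.file.path as the routing field and the contains() checks are my assumptions):

PUT _ingest/pipeline/logs-ams-router
{
  "processors": [
    {
      "pipeline": {
        "name": "logs-ams-worker",
        "if": "ctx.log?.file?.path != null && ctx.log.file.path.contains('worker')"
      }
    },
    {
      "pipeline": {
        "name": "logs-ams-node",
        "if": "ctx.log?.file?.path != null && ctx.log.file.path.contains('node')"
      }
    }
  ]
}

The custom log integration would then point at logs-ams-router, and each document is handed to whichever sub-pipeline its file path matches.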
You know, since the sorting is path based, you could just create 2 (or N) custom Logs integrations with the paths/patterns and then just call the exact pipeline in each...
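A minimal sketch of that, with hypothetical paths (in the Fleet UI the log file path and the custom configuration holding pipeline: are separate fields; they are shown together here for compactness):

# Custom Logs integration 1: worker logs
paths:
  - /var/log/ams/*worker*.log
pipeline: logs-ams-worker

# Custom Logs integration 2: node logs
paths:
  - /var/log/ams/*node*.log
pipeline: logs-ams-node

Each integration collects only its own files and names the exact pipeline, so no conditional logic is needed at all.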
For multiline, which happens before the pipeline, you need to put it in the integration using the multiline syntax (actually the legacy syntax); see here...
Make sure to use the Log input syntax.
Using log input:
multiline.type: pattern
# Matches Java stack trace continuation lines ("  at ...", "  ..." or "Caused by:")
multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
# With negate: false and match: after, lines that match the pattern
# are appended to the non-matching line that precedes them
multiline.negate: false
multiline.match: after
It helps if you actually show the pipeline, not just the error, but I suspect you need null safety checks...
Right above that, in the doc I linked:
Incoming documents often contain object fields. If a processor script attempts to access a field whose parent object does not exist, Elasticsearch returns a NullPointerException. To avoid these exceptions, use null safe operators, such as ?., and write your scripts to be null safe.
For example, ctx.network?.name.equalsIgnoreCase('Guest') is not null safe. ctx.network?.name can return null. Rewrite the script as 'Guest'.equalsIgnoreCase(ctx.network?.name), which is null safe because Guest is always non-null.
If you can’t rewrite a script to be null safe, include an explicit null check.
PUT _ingest/pipeline/my-pipeline
{
  "processors": [
    {
      "drop": {
        "description": "Drop documents that contain 'network.name' of 'Guest'",
        "if": "ctx.network?.name != null && ctx.network.name.contains('Guest')"
      }
    }
  ]
}
Ohh, and this is not correct: ctx means the context of the document in the Painless script in the processor. You prepend it to the field names in the conditional statements...
don't prepend it to the field names in the _simulate documents etc.
if condition scripts run in Painless’s ingest processor context. In if conditions, ctx values are read-only.
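To illustrate with the my-pipeline example above (a sketch; the document content is made up): the if condition reads ctx.network?.name, but the document you feed to _simulate uses the plain field name:

POST _ingest/pipeline/my-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "network": {
          "name": "Guest"
        }
      }
    }
  ]
}

ctx only exists inside the processor's Painless script; it is not part of the document's field names.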
Multiline is being problematic, so I think I will go back to the multiple integrations. In my case that means running six of these custom integrations for one server type, which seems excessive. But life is too short.
Perhaps provide more details... multiline is always a bit more challenging... but to be clear, multiline happens on the collection side, in the Agent, not in the ingest pipeline...