I'm having some issues with normalizing my CEF logs coming over Elastic Agent's CEF integration.
To be precise, they're logs from Silverfort, and there's no way to configure detailed logging options on it to remove the double quotation marks. Here is a part of the raw message:
After being indexed, the fields have quotation marks in the values:
"user": {
"name": "\"user@example.com\""
}
Any help is appreciated. I tried playing around with painless scripts on the CEF integration under Processors, and with a Logstash ruby filter.
So far, no luck.
The key is to do the processing before it hits the Elasticsearch CEF ingest pipeline as the pipeline does not know how to handle it, and I would like to avoid modifying it since it will be overwritten on updates.
[error in field 'rt': value is not a valid timestamp, error in field 'src': value is not a valid IP address]
Does anyone have a quick and dirty loop through all and remove specific character script?
Thanks for the prompt response. AFAIK the processors run after the integration does its work.
In this example, since it is the CEF integration first it would run the decode_cef processor, which would in turn populate all of the fields with quotations in the field values.
The replace processor can only work on one field at a time, so it would not be a valid solution here. Unless the processors run before any of the other beats processors built into an integration.
In my case, using the standard TCP integration with the below processors now fixes such cases.
It would probably be a good idea to open an enhancement request for the CEF integration to do this beforehand, as indexing fields with quotation marks is not really something you should do.
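The thread does not include the processor list that fixed it, so the following is an added illustration and not the poster's actual configuration. It shows the ordering idea described above in the custom TCP integration: a replace processor strips the quotation marks from the raw line in the message field first, and only then does decode_cef parse it, so the rt and src values reach the IP and timestamp checks unquoted. The regex and capture-group replacement are my own choice, and the assumption that values like `src="10.0.0.5"` (constructed example) are the quoted pattern in the Silverfort output is exactly that, an assumption; `${1}` is the Go regexp capture-group syntax that the Beats replace processor uses under the hood.

```yaml
# Added example for the Custom TCP Logs integration "Processors" box (not from the thread).
# Order matters: the replace runs on the raw line before decode_cef parses it.
processors:
  - replace:
      fields:
        - field: message
          # Match key="value" in the CEF extension part and keep only key=value.
          # Assumption: values are wrapped in plain double quotes with no
          # escaped quotes inside them; adjust the pattern if your source differs.
          pattern: '=\"([^\"]*)\"'
          replacement: '=${1}'
      ignore_missing: true
      fail_on_error: false
  - decode_cef:
      field: message
      # ignore_failure keeps the raw event instead of dropping it if a line is not CEF.
      ignore_failure: true
```

Apply this only to a data stream that does not already run decode_cef itself; with the CEF integration the decoding has already happened by the time these processors run, which is the ordering problem discussed above.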
Ah yes, I misread the replace doc and thought you could field select with a wildcard.
Glad you were able to find a solution for your particular cef source.
Typically, minimal processing is done on the agent side, with most processing done in the ingest pipeline. I hadn't looked before, but I did look and you're right: the CEF decoding happens on the agent, so processors run after CEF decoding and before ingest pipelines for this integration.