Parse filename before sending to Elasticsearch

I am trying to parse the filename to extract certain information. Is this possible today, or do I need to send the data to Logstash for further processing?

Thanks,
Rahul

@rahulnathan,
It would be very helpful if you could describe this with an example.

The filename is available in the "source" field of the published event's JSON, and you can apply processors to it, using regular expressions and conditional statements.

If you share a sample, it will be easier to give a proper solution.

Thanks

My filename has the format below.
"application-process-hostname-cluster-region.log.INFO.20190417-190942.1"
I would like to publish process, hostname, cluster and region as additional fields to Elasticsearch.

Thanks,

Thanks, @rahulnathan, for the details.

This is possible with Filebeat, but it is a little tricky and version dependent. Please confirm which version of Filebeat you are using. After that I will give you the exact solution.

Filebeat version is 6.7.1

Thanks in advance for your help.

@rahulnathan, since you are using 6.7.1, you can do this with the "dissect" processor. To meet your requirement, you have to combine "dissect" with the "drop_fields" processor.

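  filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/application-process-hostname-cluster-region.log.INFO.20190417-190942.1
  processors:
    # Split the file path held in "source" on "-" and on the first "." after the fifth part.
    - dissect:
        tokenizer: "%{key1}-%{key2}-%{key3}-%{key4}-%{key5}.%{key6}"
        field: "source"
        # An empty prefix writes the extracted keys at the root of the event.
        target_prefix: ""
    # key1 ("/var/log/application") and key6 (everything after "region.") are not needed.
    - drop_fields:
        when:
          has_fields: ['key1', 'key6']
        fields: ["key1", "key6"]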

The sample published event will be:

{
  "@timestamp": "2019-04-24T04:52:21.749Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "doc",
    "version": "6.4.1"
  },
  "message": "Sample",
  "input": {
    "type": "log"
  },
  "host": {
    "name": "localhost.localdomain"
  },
  "source": "/var/log/application-process-hostname-cluster-region.log.INFO.20190417-190942.1",
  "key2": "process",
  "key3": "hostname",
  "key4": "cluster",
  "key5": "region",
  "offset": 0,
  "prospector": {
    "type": "log"
  },
  "beat": {
    "hostname": "localhost.localdomain",
    "version": "6.4.1",
    "name": "localhost.localdomain"
  }
}

If you want the process part of your log filename to be published under a field named "process", use "process" in place of "key2" in the tokenizer. Rename the other keys (hostname, cluster and region) in the same way.
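
As a minimal sketch of that renaming, assuming the same file path and a Filebeat 6.x event where the path is in "source", the processors section might look like the following. The names key1 and key6 are still only placeholders for the parts that get dropped, and the sketch assumes the event does not already contain root-level fields named process, hostname, cluster or region.

```yaml
processors:
  - dissect:
      # Same pattern as before, with key2..key5 renamed to the desired field names.
      tokenizer: "%{key1}-%{process}-%{hostname}-%{cluster}-%{region}.%{key6}"
      field: "source"
      # Empty prefix: write the extracted keys at the root of the event.
      target_prefix: ""
  - drop_fields:
      # Remove the placeholder parts (path prefix and file suffix).
      fields: ["key1", "key6"]
```

With the sample path, this would be expected to yield "process": "process", "hostname": "hostname", "cluster": "cluster" and "region": "region" in the event.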

@rahulnathan, is it solved? What is the current status of your problem?

@Debashis, unfortunately, this did not work. The fields below are missing from the sent event:

"source": "/var/log/application-process-hostname-cluster-region.log.INFO.20190417-190942.1"
"key2": "process",
"key3": "hostname",
"key4": "cluster",
"key5": "region",
"offset": 0,
I am not seeing any errors in the Filebeat logs, and the message line is being sent to Elasticsearch without the additional fields.
Is there a way to enable more debugging in the logs to help identify the issue?

Thanks,
Rahul

@rahulnathan, I ran this on my side and it works fine: the events are published with those fields. I can't tell what the problem is on your side.

Please enable debug logging, then share your Filebeat console log and your .yml file, formatted with </>. I will recheck it.

 logging.level: debug
 logging.selectors: ["*"]
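
One possible way to check whether the extracted fields are present before the event reaches Elasticsearch is to switch temporarily to the console output. This is only a sketch and was not tested in this thread. It assumes the Elasticsearch output is commented out while the console output is active, since Filebeat accepts only one enabled output at a time.

```yaml
logging.level: debug
logging.selectors: ["*"]

# Temporary: print each event to stdout as formatted JSON.
# Comment out output.elasticsearch while this is enabled.
output.console:
  pretty: true
```

Running Filebeat in the foreground, for example with `filebeat -e -c filebeat.yml` (the config file name here is an assumption), should then show both the debug log and the full events, so it is easy to see whether key2 to key5 were added.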

Thanks.
