Need to get filenames at a particular directory without reading the data inside files


(shruti) #1

Hi,

I am trying to read the file names in a directory without reading the content inside the files.
I am not able to get the desired result using Filebeat.
My Logstash configuration is:

input {
  beats {
    port => 5044
  }
}

filter {
  if [fields][log_type] == "check-filename" {
    grok {
      match => ["source","D:/ELK Demo Logs/STO/%{GREEDYDATA:filename}"]
    }
    mutate {
      remove_field => [ "host" ]
      add_field => ["Promotion_Source", "XMLs"]
    }
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "abc-ind-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}

My Filebeat configuration is:

  - type: log
    enabled: true
    paths:
      - D:\ELK Demo Logs\STO\*.txt
    fields: {log_type: check-filename}

In my output, no filename field is created. Please help if you see any discrepancy in the configuration or any other change that is needed.

Thanks in advance.


#2

Do your events have a _grokparsefailure tag? I would expect source to have backslashes, not forward slashes. Try changing the grok to

match => ["source","^D:\\ELK Demo Logs\\STO\\%{GREEDYDATA:filename}"]

(shruti) #3

Thank you very much, Badger.
The filename is being extracted now.

However, Logstash now emits the filename once for every record in the file.
I only want to get the filename once, no matter how many records are inside the file.

Example: There are 3 records in a file abc.txt.
I am getting the filename 3 times.
I don't need to parse the records in the file, only to get the filename once.

Please suggest an approach for this.


#4

If you want to avoid duplicate events being indexed into Elasticsearch, then this blog post provides suggestions.
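One common way to do that kind of de-duplication, which may or may not match what the blog post describes, is to give every event from the same file the same document ID so that later events overwrite the first one instead of creating new documents. A minimal sketch, assuming the filename field produced by the grok above is present on every event and is a sensible ID:

```
output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "abc-ind-%{+YYYY.MM.dd}"
    # Assumption: use the extracted filename as the document ID, so every
    # line read from the same file maps to a single document in the index.
    document_id => "%{filename}"
  }
}
```

Note that the index name here is daily, so this only collapses duplicates within one day's index. Lines read from the same file on a different day would land in a different index and create a second document. The events are still sent to Elasticsearch once per line; they just overwrite each other.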

You could also do it with an aggregate filter. Use the filename as the task_id and then

  code => 'map["occurs"] ||= 0; map["occurs"] += 1; if map["occurs"] > 1 then event.cancel end'
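
Put together, the aggregate approach could look roughly like the sketch below. The conditional around it is my own addition, there to skip events where grok did not produce a filename field; the rest is the snippet above dropped into a filter block.

```
filter {
  # Only aggregate events that actually have a filename; otherwise the
  # task_id would be the literal string "%{filename}" for all of them.
  if [filename] {
    aggregate {
      # One map per file, keyed by the extracted filename
      task_id => "%{filename}"
      # Count the events seen for this file and cancel all but the first
      code => 'map["occurs"] ||= 0; map["occurs"] += 1; if map["occurs"] > 1 then event.cancel end'
    }
  }
}
```

Two things to keep in mind. The aggregate filter only works reliably with a single pipeline worker (start Logstash with -w 1 or set pipeline.workers: 1), since the maps are shared state. Also, a map is discarded once its timeout expires (1800 seconds by default, as far as I know), so a line that arrives from the same file after that would be let through again unless you raise the timeout option.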