Filebeat sends the same file content again and again

Hi,

I have a Filebeat agent running on one machine with the following configuration:

filebeat:
  prospectors:
    -
      paths:
        - /var/logs/apache/*.log
      input_type: log


  registry_file: /var/lib/filebeat/registry

output:
  logstash:
    hosts: ["localhost:5044"]

shipper:

logging:
  files:
    rotateeverybytes: 10485760 # = 10MB

I run a command to send the output from Logstash to Elasticsearch. My logstash.conf looks like this:

input {
   beats {
     port => 5044
   }
}

filter {
  grok {
    match => {
      "message" => '%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}'
    }
  }

  date {
    match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
    locale => en
  }

  geoip {
    source => "clientip"
  }

}

output {
  stdout {
    codec => plain {
      charset => "ISO-8859-1"
    }
  }
  elasticsearch {
    hosts => "localhost:9200"
  }
} 

and then

./logstash -f logstash.conf

It sends the output to Elasticsearch, but when I add a new file to the /var/log/apache/ folder that I mentioned in filebeat.yml, it resends the data that was already sent to those indices. How can I stop Filebeat from sending a file's data again if it has already been sent?

Please help.

Thanks
gaurav

Include the ignore_older property in your filebeat.yml config. For example, if you set ignore_older: 5s, Filebeat won't pick up a file that has not been modified in the past 5 seconds.

Not sure I understand how you are adding a new file. If the original file is just renamed, Filebeat should normally notice that and not send the file again. Can you give more details about the rotation strategy?

I am adding a new file by creating it in the same folder, so that when Filebeat sees the new file it sends its content to ES.

So you simply add a new file and the contents of the other files are sent again? That sounds quite strange. Can you come up with step by step instructions on how to reproduce it?

As you can see in my filebeat.yml above, I specified /var/log/apaches/*.log.

Suppose that initially there are no files in that folder, and I start Filebeat and Logstash.

When I add a new file, say 'apache.log', under /var/log/apaches/, it matches the pattern in my filebeat.yml, so its data is sent to ES.

If I then create another new file, 'test.log', in the same folder, it also matches /var/log/apache, and both the earlier file (apache.log) and test.log are sent to ES.
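
The steps above can be sketched as a small script, in case it helps anyone reproduce this. It is only an illustration: the helper names (add_log_file, reproduce), the sample access-log line, and the pause length are assumptions and are not taken from my real setup. Point log_dir at the folder your prospector watches.

```python
import os
import tempfile
import time

# Hypothetical sample line, written to resemble an Apache combined-log entry.
APACHE_LINE = (
    '127.0.0.1 - - [10/Oct/2015:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 "-" "curl/7.43.0"\n'
)


def add_log_file(log_dir, name, lines=3):
    """Create a new log file containing a few sample lines and return its path."""
    path = os.path.join(log_dir, name)
    with open(path, "w") as handle:
        handle.write(APACHE_LINE * lines)
    return path


def reproduce(log_dir, pause_seconds=15):
    """Create apache.log, wait, then create test.log in the same folder."""
    first = add_log_file(log_dir, "apache.log")
    # Give the shipper time to pick up and send the first file.
    time.sleep(pause_seconds)
    second = add_log_file(log_dir, "test.log")
    return [first, second]


if __name__ == "__main__":
    # Uses a throwaway directory here; replace it with the watched folder.
    demo_dir = tempfile.mkdtemp()
    print(reproduce(demo_dir, pause_seconds=0))
```

After running it against the watched folder, counting the documents in ES shows whether the lines from apache.log arrived once or twice.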

Can you explain a little bit more? I think this might be the solution.

Include the ignore_older property in your config file as shown below and specify the time in seconds. For example, below I have set 10s. Filebeat will read the file, and once the file has gone 10 seconds without modification, Filebeat moves on to the next file and won't read that file again. Please refer to the link below for a detailed description.

filebeat:
  prospectors:
    -
      paths:
        - /var/logs/apache/*.log
      input_type: log
      ignore_older: 10s
  registry_file: /var/lib/filebeat/registry
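
To make the rule concrete, here is a rough sketch of the kind of check that ignore_older: 10s describes: compare a file's last-modified time with the current time and skip the file if it is older than the cutoff. This is not Filebeat's actual code; the function name is_ignored is made up for illustration.

```python
import os
import tempfile
import time


def is_ignored(path, ignore_older_seconds, now=None):
    """Return True if the file was last modified longer ago than the cutoff."""
    if now is None:
        now = time.time()
    age_seconds = now - os.path.getmtime(path)
    return age_seconds > ignore_older_seconds


if __name__ == "__main__":
    # Create a throwaway log file to try the check on.
    with tempfile.NamedTemporaryFile(suffix=".log", delete=False) as handle:
        handle.write(b"sample line\n")
        path = handle.name

    # Freshly written file: younger than the 10 second cutoff.
    print(is_ignored(path, 10))

    # Pretend the file was last modified a minute ago.
    a_minute_ago = time.time() - 60
    os.utime(path, (a_minute_ago, a_minute_ago))
    print(is_ignored(path, 10))

    os.remove(path)
```

A file that was just written is not ignored, while the same file with a modification time one minute in the past is.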