Data loss without my consent

Hello,
I have a Logstash script that takes files as input and sends them to a corresponding Elasticsearch index. This script runs every night at 1am as a scheduled task, and it worked in the beginning. But now, every time I check, data is only coming in for the current day. The scheduled tasks run correctly, based on the exit code. But today (17.1.2023) I checked and I can only see data that came in on 17.1.2023 at 1am. If I checked again tomorrow, today's data would be gone and all I could see would be data from tomorrow at 1am.

The system has enough storage, so why is the data not stored?

input {
  file {
	path => [ "C:/logfile.log" ]
 	add_field => { "testvalue" => "Shipped with logstash :)" }
	start_position => "beginning"
  }
}

filter{
	csv{
		separator => ","
		skip_empty_columns => "true"
		skip_header => "true"
		columns => ["ID","Date","Time","Description","IP","Host Name","MAC","Username","TransactionID","QResult","Probationtime","CorrelationID","Dhcid","VendorClass(Hex)","VendorClass(ASCII)","UserClass(Hex)","UserClass(ASCII)","RelayAgentInformation","DnsRegError"]
	}
	if "DNS" in [Description] {
    	    drop {}
  }
}

output {
  elasticsearch {
    hosts => "https://elastic:9200"
    cacert => "ca.crt"
    user => '?'
    password => 
    index => "logstash-%{+yyyy.MM.dd}-_filtered"
  }
}

Where is the data lost? In the file, in Logstash (LS), or in Elasticsearch (ES)?

  • File logfile.log
    Have you checked whether C:/logfile.log exists and is refreshed? Is it appended to, or is it overwritten every day?
    Has its date-modified attribute changed?

  • Logstash
    Have you checked sincedb_path? You can force LS to read the file again with: sincedb_path => "NUL"
    Since no mode is specified, you are using tail mode. Check the documentation for which mode suits your case.
    Could your condition (if "DNS" in [Description]) be true for every event on some days, so that everything gets dropped?
    If logfile.log is not too big, you can also save the processed data to a file before it is sent to ES, by adding a file output:

output {
  file { path => "C:/tempdir/processed-%{+YYYY.MM.dd-HH.mm.ss}.txt" }

  elasticsearch {
    hosts => "https://elastic:9200"
    cacert => "ca.crt"
    user => '?'
    password => 
    index => "logstash-%{+yyyy.MM.dd}-_filtered"
  }

}

Have you enabled debug logging (log.level: debug) for a day or two?
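
The suggestions above combined in the file input would look roughly like this (a sketch based on the original input block; "NUL" is the Windows equivalent of /dev/null, so Logstash forgets its read position and re-reads the whole file on every restart):

input {
  file {
    path => [ "C:/logfile.log" ]
    add_field => { "testvalue" => "Shipped with logstash :)" }
    start_position => "beginning"
    # Windows: "NUL" discards the sincedb, so the whole file is re-read each run.
    sincedb_path => "NUL"
    # Alternatively, read mode processes the file as complete content instead of tailing it:
    # mode => "read"
  }
}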

  • Kibana/ES
    If the data is in Kibana, do you have the correct time filter set, or are you perhaps viewing only a dashboard?
    Have you searched for the data "manually" with DSL queries in Dev Tools?
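
A quick manual check in Dev Tools could look like this (assuming the daily index from your output config exists for the day in question; match_all ignores any Kibana time filter):

GET logstash-2023.01.17-_filtered/_search
{
  "size": 5,
  "query": { "match_all": {} }
}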

Hello, the files are generated automatically and therefore exist either way. They are never empty. I will try to set the sincedb_path accordingly, thank you for that.

I have found this error in elasticsearch.log:

[2023-01-20T00:34:55,555][INFO ][o.e.x.i.IndexLifecycleRunner] [elasticsearch] policy [logstash-policy] for index [logstash-2023.01.19_filtered] on an error step due to a transient error, moving back to the failed step [check-rollover-ready] for execution. retry attempt [2]
[2023-01-20T00:44:55,557][ERROR][o.e.x.i.IndexLifecycleRunner] [elasticsearcg] policy [logstash-policy] for index [logstash-2023.01.19-_filtered] failed on step [{"phase":"hot","action":"rollover","name":"check-rollover-ready"}]. Moving to ERROR step
java.lang.IllegalArgumentException: index.lifecycle.rollover_alias [logstash] does not point to index [logstash-2023.01.19-_filtered]
        at org.elasticsearch.xpack.core.ilm.WaitForRolloverReadyStep.evaluateCondition(WaitForRolloverReadyStep.java:156) [x-pack-core-7.16.3.jar:7.16.3]
        at org.elasticsearch.xpack.ilm.IndexLifecycleRunner.runPeriodicStep(IndexLifecycleRunner.java:226) [x-pack-ilm-7.16.3.jar:7.16.3]
        at org.elasticsearch.xpack.ilm.IndexLifecycleService.triggerPolicies(IndexLifecycleService.java:408) [x-pack-ilm-7.16.3.jar:7.16.3]
        at org.elasticsearch.xpack.ilm.IndexLifecycleService.triggered(IndexLifecycleService.java:339) [x-pack-ilm-7.16.3.jar:7.16.3]
        at org.elasticsearch.xpack.core.scheduler.SchedulerEngine.notifyListeners(SchedulerEngine.java:186) [x-pack-core-7.16.3.jar:7.16.3]
        at org.elasticsearch.xpack.core.scheduler.SchedulerEngine$ActiveSchedule.run(SchedulerEngine.java:220) [x-pack-core-7.16.3.jar:7.16.3]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:833) [?:?]
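
This error explains at least part of the picture: the ILM policy [logstash-policy] expects the rollover alias logstash to point at the index it manages, but the daily indices logstash-yyyy.MM.dd-_filtered are created directly by Logstash and the alias never points at them, so the check-rollover-ready step fails. Whether this is what makes older data disappear depends on what logstash-policy does in its later phases (for example a delete phase). Since the pipeline already creates one index per day, one option (a sketch, not the only possible fix) is to detach these writes from ILM in the Logstash output:

output {
  elasticsearch {
    hosts => "https://elastic:9200"
    cacert => "ca.crt"
    user => '?'
    password => 
    index => "logstash-%{+yyyy.MM.dd}-_filtered"
    # The pipeline already creates a new index each day, so ILM rollover is not needed;
    # this stops Logstash from configuring ILM for these indices.
    ilm_enabled => false
  }
}

Note that if an index template still attaches logstash-policy via index.lifecycle.name, that template would also need adjusting. Alternatively, keep ILM and let it manage rollover by setting ilm_rollover_alias (and dropping the date from index), so writes go through the alias the policy expects.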