Filebeat: Is there any way to delete log files after Filebeat has harvested the data to Logstash?


I am using a Windows machine. I have multiple JSON files containing application analytics info. I read those JSON files with Filebeat and push them through Logstash ---> Elasticsearch ---> Kibana.
The content of these JSON files grows quickly, so I need a mechanism to delete the files as soon as Filebeat has harvested them, while new JSON files are regenerated at the same time. Is there any way to achieve this in Filebeat?

There is no direct way to do this in Filebeat. One thing you could do is write a small script that checks the registry file. As soon as the size of a file equals its offset in the Filebeat registry file, you can delete it (assuming Filebeat is always a bit behind in reading). You would have a small delay, but I would assume this is acceptable?
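
A minimal sketch of such a script, in Python for illustration. It assumes the registry is a single JSON file holding an array of per-file entries with `source` and `offset` fields, which is the layout the PowerShell script later in this thread reads; other Filebeat versions may store state differently, so check yours first. The function names and the `dry_run` flag are made up for this example.

```python
import json
import os
from typing import List


def fully_harvested(registry_path: str) -> List[str]:
    """Return paths whose size on disk equals the offset stored in the registry."""
    with open(registry_path, encoding="utf-8") as fh:
        # Assumption: the registry is a JSON array of per-file state objects.
        entries = json.load(fh)

    done = []
    for entry in entries:
        source = entry.get("source")
        offset = entry.get("offset")
        if source is None or offset is None or not os.path.isfile(source):
            continue  # incomplete entry, or the file is already gone
        if os.path.getsize(source) == offset:
            done.append(source)
    return done


def delete_harvested(registry_path: str, dry_run: bool = True) -> List[str]:
    """Delete fully harvested files; with dry_run=True only report them."""
    handled = []
    for path in fully_harvested(registry_path):
        if not dry_run:
            try:
                os.remove(path)
            except OSError:
                continue  # e.g. file still locked; retry on the next run
        handled.append(path)
    return handled
```

Calling `delete_harvested(path_to_registry)` only lists the candidates; pass `dry_run=False` to actually remove them. Note that a file can still grow between the size check and the deletion, so this is only safe once the application has stopped writing to that file.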


Appreciate the response. As suggested, I am planning to do the same on a single instance.

It would be great if I could get the clarifications below too.

  1. Is there any chance of data loss in any use case? I will have 3-4 log files.
  2. My application will be hosted on multiple instances and each instance will have its own logs, which implies creating this job on each instance. Is there any centralized way to do the same for a large-scale application hosted on multiple instances/servers or in the cloud?
  3. Can we expect the Elastic team to build some mechanism to support this in the future? This would be a plus for the ES community.
  4. Do you suggest having a queue mechanism (Kafka/RabbitMQ) to prevent data loss?

Thanks and Regards,
Cheers!!

  1. The offset is only written to the registry if the data was successfully published. So if you read the offset from the registry, you should be on the safe side. I am not sure how the 3-4 log files play into this?
  2. As the registry and the state are local to each instance, this would have to be local. I assume you could deploy the script the same way you deploy Filebeat.
  3. There is an open GitHub issue for this. My current personal take is that log deletion is not a responsibility of a log shipper, but that is just my personal view :)
  4. On the full data loss discussion, I think the more moving parts you have, the more likely you are to lose data. But the devil is always in the details. Do you need LS in between, or could you ship to ES directly?

Great! I appreciate the response. Yes, I will need Logstash in between to communicate with ES.
My scenario has multiple instances, each with multiple logs (separate exception, event and track files), transmitting data to either one Logstash or several different Logstash instances. From Logstash, the data will go to a common ES/Kibana.
I am thinking of putting Kafka in between to build a queue system and help prevent data loss.
Kindly suggest. Any input on the design would be great.

Thanks & Regards,
Prabhat Ranjan.

Also, FYI, below is my sample PowerShell script to delete the files on a Windows server.
Function Remove-AnalyticsLog {
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory=$true)][string]$registryfilelocation
    )
    Process {
        Write-Host ("Reading FileBeat registry: " + $registryfilelocation)
        # Read the whole registry file as one string so it can be parsed as JSON.
        $text = Get-Content $registryfilelocation -Raw
        try {
            $filebeatRegister = ConvertFrom-Json $text -ErrorAction Stop
            $validJson = $true
        }
        catch {
            $validJson = $false
        }

        if ($validJson) {
            Write-Host "FileBeat registry is valid JSON string"
            # Each registry entry carries the file path (source) and the harvested offset.
            foreach ($line in $filebeatRegister) {
                if ([System.IO.File]::Exists($line.source)) {
                    $offset = $line.offset
                    $size = (Get-Item $line.source).Length
                    Write-Host ($line.source + " size on disk:--" + $size + " :-offset value on filebeat registry:--" + $offset)
                    if ($offset -eq $size) {
                        # Fully harvested. Write code to delete the file here,
                        # e.g. Remove-Item -LiteralPath $line.source
                        Write-Host ($line.source + " Can be deleted")
                    }
                    else {
                        Write-Host ($line.source + " File not processed, offset: " + $offset)
                    }
                }
                else {
                    Write-Host ($line.source + " Not exists")
                }
            }
        }
        else {
            Write-Host "FileBeat registry is not a valid JSON string"
        }
    }
}
