[sincedb creation]


(Varun Kumar) #1

I have the following config file for Logstash:
input {
  file {
    path => "D:/Log//"
  }

  file {
    path => "D:/Log/*/MIP/*"
  }

  file {
    path => "D:/Log/*/NHGS/*"
  }
}

filter {
  mutate {
    gsub => ["path", "D:/Log/", ""]
  }
}

output {
  file {
    path => "Z:/Logs/%{host}/%{path}"
    message_format => "%{message}"
    flush_interval => 0
  }
}

Issue: I want to know when the sincedb file is created. In my case, the sincedb file is only created when I press Ctrl+C in the Windows terminal after the message "Logstash startup completed" appears. I want the sincedb file to be created before I press Ctrl+C.


(Magnus Bäck) #2

The sincedb file is updated every sincedb_write_interval seconds (default 15), and I suspect that this holds true for new files too, i.e. it's created 15 seconds after the startup. You can use Process Monitor to check this assumption. At least with recent versions of Logstash and the file input plugin it'll be written when Logstash shuts down.


(Varun Kumar) #3

Hello,
Thanks for your reply. How can I change when the sincedb file is written? I need it written immediately because I am writing a script that uses the sincedb file as input.


(Varun Kumar) #4

I have changed sincedb_write_interval to 0 and things appear to be working. Is there anything else I need to change?


(Magnus Bäck) #5

What do you want to accomplish? If all you want to do is increase the write frequency of the sincedb files you're done.


(Varun Kumar) #6

Hello Magnus,
Thanks for your reply. I am writing a script (in C#) that reads the size (offset) recorded for a file in sincedb and compares it with the file's actual size, as reported by the ls command. The reason for writing this script is that I want to detect when Logstash has finished parsing a file. Do you know a better way to do this?


(Varun Kumar) #7

I know there is a way to tell from the Kibana GUI, but right now I want to do this with Logstash alone.


(Magnus Bäck) #8

To monitor Logstash's file reading progress the currently best way is indeed to read the sincedb file.


(Varun Kumar) #9

Dear Magnus,
I have one more question about the sincedb file. I have changed sincedb_write_interval to 0. Can you tell me when the sincedb file is updated with the inode number and offset? Will the recorded offset be the final offset of the file, or can it be written while Logstash is still parsing the file?


(Magnus Bäck) #10

I'm not sure I get your question right, but the sincedb file is always written after Logstash has read the file up to that offset and sent the messages into Logstash's (very small) internal queue. Hence, when the offset is equal to the file size Logstash is done with the file.
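
A minimal sketch of that check, written in Python for brevity (the same logic ports directly to a C# script). It assumes each sincedb line is whitespace-separated, with a file identifier first and the byte offset as the fourth field; that layout is an assumption, so check it against your own sincedb file. The names `parse_sincedb` and `is_done` are hypothetical helpers, and the sample content is made up.

```python
def parse_sincedb(text):
    """Map file identifier -> recorded byte offset.

    Assumed layout per line: <identifier> <major> <minor> <offset> [...]
    """
    offsets = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        offsets[fields[0]] = int(fields[3])
    return offsets


def is_done(recorded_offset, file_size):
    """True when the recorded offset has reached the current file size."""
    return recorded_offset >= file_size


# Made-up sincedb content with two tracked files.
sample = "1234567 0 2 4096\n7654321 0 2 13841203\n"
offsets = parse_sincedb(sample)

print(is_done(offsets["1234567"], 8192))       # file is 8192 bytes, 4096 recorded
print(is_done(offsets["7654321"], 13841203))   # recorded offset equals file size
```

In a real script the file size would come from the file system (for example `os.path.getsize`), and you would need some way to match each identifier to a path.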


(Varun Kumar) #11

Hello Magnus,
So, according to your answer, whenever a file's entry is present in sincedb, the offset recorded for that file must be the end of the original file as parsed by Logstash?


(Magnus Bäck) #12

Yes, the sincedb file records how far Logstash has read. At any given moment there is of course the possibility that Logstash has read slightly further than what has been recorded in the sincedb file, but there's nothing to do about that.
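
Because the recorded offset can lag slightly behind what has actually been read, a monitoring script would normally poll rather than check once. A rough Python sketch, assuming the caller supplies two hypothetical callables: `get_offset`, returning the offset currently recorded in sincedb, and `get_size`, returning the current size of the log file in bytes. The function name `wait_until_read` is also made up for illustration.

```python
import time


def wait_until_read(get_offset, get_size, timeout=60.0, interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the recorded offset reaches the file size or time runs out.

    Returns True if the offset caught up, False if the timeout expired first.
    """
    deadline = clock() + timeout
    while True:
        if get_offset() >= get_size():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated run: the recorded offset catches up on the third poll.
recorded = iter([100, 250, 400])
result = wait_until_read(lambda: next(recorded), lambda: 400,
                         timeout=5.0, interval=0.0)
print(result)
```

Note that a True result only means Logstash had caught up at that moment; if the application is still appending to the file, the size will grow again.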


(Varun Kumar) #13

Hello Magnus,
I am facing a performance issue. I am generating one big log file and comparing how long it takes to create with Logstash running and without Logstash running. I am seeing a huge difference in how long the log file takes to create. Can you suggest what the reason could be and what could be done to improve performance? My current config file is:
input {
  file {
    path => "D:/Log//"
    start_position => "beginning"
    sincedb_write_interval => 0
  }

  file {
    path => "D:/Log/*/MIP/*"
    start_position => "beginning"
    sincedb_write_interval => 0
  }

  file {
    path => "D:/Log/*/NHGS/*"
    start_position => "beginning"
    sincedb_write_interval => 0
  }
}

filter {
  mutate {
    gsub => ["path", "D:/Log/", ""]
  }
}

output {
  stdout {
    codec => line {
      format => "%{path}"
    }
  }
}

The timings for creating the log files are as follows:
Without Logstash running: 5005788.029 ms
With Logstash running: 5607995.1309 ms

I am creating two log files of 13.2 MB each.


(Magnus Bäck) #14

What program is creating the log file? How are you measuring this? And why would it take over 5000 seconds to generate 26 MB worth of log files? That's ~6 kB/s.
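
For reference, the arithmetic behind that estimate as a short Python sketch. It assumes both reported timings are in milliseconds and that 1 MB means 10^6 bytes; under those assumptions the write rate comes out a little above 5 kB/s, the same order of magnitude as the rough figure above.

```python
total_bytes = 2 * 13.2 * 1_000_000   # two log files of 13.2 MB each
without_ms = 5005788.029             # reported time without Logstash
with_ms = 5607995.1309               # reported time with Logstash (unit assumed to be ms)

throughput = total_bytes / (without_ms / 1000)   # bytes per second
slowdown = with_ms / without_ms - 1              # relative increase in run time

print(f"{throughput / 1000:.1f} kB/s without Logstash")
print(f"{slowdown:.1%} longer with Logstash running")
```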


(Varun Kumar) #15

Hello Magnus,
Thanks for your reply. Below is the sample code I use to generate the log files.

I start a stopwatch at the beginning of this function and stop it at the end. I am using NLog to generate the logs.
private static void TestLoggerHierarchy()
{
    for (var i = 0; i < 1; i++)
    {
        string jobId = System.Guid.NewGuid().ToString();
        NLog.GlobalDiagnosticsContext.Set("jobId", jobId);

        ILogger mainLogger = LogManager.GetLogger("mainLogger");
        ILogger mipLogger = LogManager.GetLogger("mipLogger");
        ILogger nghsLogger = LogManager.GetLogger("neighbourhoodLogger");
        ILogger batchProcessorLogger = LogManager.GetLogger("batchProcessorLog");

        batchProcessorLogger.Info("Job Execution started");

        mainLogger.Info("Engine Started");
        mipLogger.Info("MIP Started");
        batchProcessorLogger.Info("Job Execution MIP started");

        for (var line = 0; line < (1000000 + (i * 100)); line++)
        {
            mipLogger.Info("MIP : " + line);
        }

        mipLogger.Info("MIP Finished");
        batchProcessorLogger.Info("Job Execution MIP finished");

        var fileTarget = LogManager.Configuration.FindTargetByName<HtmlTarget>("mipLoggertarget");

        String[] fileComponents = fileTarget.ActualLogFileName.Split(new string[] { "/" },
            StringSplitOptions.None);

        mainLogger.Info("MIP has finished. The logs are located at: <a href = ./MIP/" +
                        fileComponents.Last() + ">MIP log file </a>");

        batchProcessorLogger.Info("Job Execution NHGS started");
        nghsLogger.Info("Neighbourhood started");

        for (var line = 0; line < (1000000 + (i * 100)); line++)
        {
            nghsLogger.Info("NHGH : " + line);
        }

        nghsLogger.Info("Neighbourhood finished");
        batchProcessorLogger.Info("Job Execution NHGS finished");

        fileTarget = LogManager.Configuration.FindTargetByName<HtmlTarget>("neighbourhoodTarget");
        fileComponents = fileTarget.ActualLogFileName.Split(new string[] { "/" },
            StringSplitOptions.None);

        mainLogger.Info("Neighbourhood has finished. The logs are located at: <a href = ./NHGS/" +
                        fileComponents.Last() + ">NGHS log file </a>");

        // Thread.Sleep(100);
    }
}

(Varun Kumar) #16

How does it matter how I am generating the log? I want to make sure that the timing is no different with Logstash running than without it.


(Magnus Bäck) #17

How does it matter how I am generating the log?

Because if you're generating log files at 6 kB/s with or without Logstash something is incredibly wrong and I don't think the measurements are of any value.

Are you worried that the application's log calls will suffer from Logstash's reading of the same file at the same time? That sounds extremely farfetched, but you should be able to measure the time required for each I/O call with e.g. Process Monitor.


(Varun Kumar) #18

Hello Magnus,
Thanks for your valuable comment. What you are thinking is correct. Leaving aside the slow logging as a secondary issue, I am really worried about how much Logstash will slow the application down when log generation and log collection happen simultaneously, and the results so far are not good. Is there any way to improve the results, such as a config setting?


(Magnus Bäck) #19

I am really worried about how much Logstash will slow the application down when log generation and log collection happen simultaneously, and the results so far are not good.

I don't see any reason why reading a log file would slow down the writing to the same file that's taking place at the same time, unless the system is heavily loaded and doesn't have CPU and RAM headroom for both processes. The I/O load from the Logstash process should be very very small since it's reading the same data that was just written so the read request won't ever have to hit the disk.

So, focus on the system load that Logstash adds. I reckon it's a couple of percent CPU and maybe a few hundred MB RAM, obviously depending on how much work it has to do. If you have that headroom I wouldn't worry about it.


(Varun Kumar) #20

Hello Magnus,
Thanks for your support. There is no other program running on my computer, and its configuration is below:
RAM: 16 GB
Processor: Intel Core i7-4900MQ CPU @ 2.80 GHz (2.79 GHz)
So you can see it is among the best available.
The log file is being generated on D:, which is not an SSD. Could you please elaborate on your points?