Logstash output differs between runs


(sahere rahimi) #1

Hi all,
I am using the JDBC input plugin to load data from a database. My Logstash config file is as follows:

input {
  jdbc {
    jdbc_driver_library => "sqljdbc42.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://localhost:3333;databaseName=Total"
    jdbc_user => "****"
    jdbc_password => "****"
    schedule => "38 11-23/1 * * *"
    connection_retry_attempts => "15"
    connection_retry_attempts_wait_time => "60"
    last_run_metadata_path => "E:\logstashlog\.my_last_run"
    statement => "
      DECLARE @DDate1 CHAR(10)
      SELECT @DDate1 = REPLACE(MAX(FDate), '/', '') FROM Total.dbo.Total_Control
      SELECT [BaseDate_Accept_Setel]
            ,[Financial_Date]
            ,[Cycle_No]
      FROM dbo.vw_Total
      WHERE (BaseDate_Accept_Setel <= @DDate1) AND (BaseDate_Accept_Setel > :sql_last_value)
    "
    use_column_value => "true"
    tracking_column => "basedate_accept_setel"
  }
}

filter {
  mutate {
    add_field => {
      "basedate" => "%{basedate_accept_setel}"
    }
    convert => {
      "basedate" => "string"
    }
  }
  ruby {
    code => "
      event.set('Myear', event.get('basedate')[0..3])
      event.set('Mmonth', event.get('basedate')[4..5])
      event.set('Mday', event.get('basedate')[6..7])
    "
  }
  ruby {
    code => "
      event.set('disp_CmdCycleNo', nil)
    "
  }
  mutate {
    add_field => {
      "P_date" => "%{Myear}/%{Mmonth}/%{Mday}"
    }
  }
  jdbc_streaming {
    jdbc_driver_library => "sqljdbc42.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://localhost:3333;databaseName=Total"
    jdbc_user => "****"
    jdbc_password => "****"
    parameters => { "mine" => "P_date" }
    statement => "
      DECLARE @ddate2 CHAR(10)
      SET @ddate2 = :mine
      SELECT Total.dbo.s2md(@ddate2) AS gdate
    "
    target => "gdate"
  }
  mutate { replace => { "gdate" => "%{[gdate][0][gdate]}" } }
  mutate { gsub => [ "gdate", "/", "-" ] }
  mutate { gsub => [ "P_date", "/", "-" ] }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "disp_%{P_date}"
    query_template => "matching-request.json"
    fields => {
      "CmdCycleNo" => "disp_CmdCycleNo"
    }
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "total_disp_%{P_date}"
  }
  stdout { codec => rubydebug }
}
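The ruby and mutate filters above just slice a YYYYMMDD string into year, month, and day, rebuild it with separators, and swap "/" for "-". A minimal plain-Ruby sketch of that transformation, using a hypothetical basedate value (inside Logstash these run via event.get/event.set):

```ruby
# Sketch of the ruby + mutate steps applied to a YYYYMMDD string.
# The value "20190315" is a hypothetical example.
basedate = "20190315"

myear  = basedate[0..3]   # "2019"
mmonth = basedate[4..5]   # "03"
mday   = basedate[6..7]   # "15"

# mutate add_field "P_date" => "%{Myear}/%{Mmonth}/%{Mday}"
p_date = "#{myear}/#{mmonth}/#{mday}"   # "2019/03/15"

# mutate gsub [ "P_date", "/", "-" ]
p_date_dashed = p_date.gsub("/", "-")   # "2019-03-15"

puts p_date_dashed
```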

Notably, the last value in ".my_last_run" is "20190313". When I start Logstash, two values of "BaseDate_Accept_Setel" satisfy the statement's condition: "20190314" and "20190315".
When I checked the Kibana console, both indices were created ("total_disp_20190314" and "total_disp_20190315"), and there is no error in the Logstash log. However, when I examined these indices in the Kibana console and checked the number of records, everything for "total_disp_20190314" is OK, but "total_disp_20190315" has fewer records than expected. I then counted the JSON records for "20190315" in the Logstash log, and that count is correct. What is the problem? Why does Kibana show the wrong record count for the index?

In addition, in the management section I compared the fields of the two indices, and there are some differences:

"total_disp_20190314" has 103 fields and "total_disp_20190315" has 106. The extra fields in "total_disp_20190315" are "tags", "tags.keyword", and "gdate.keyword"; none of these exist in "total_disp_20190314", although the "gdate" field itself exists in both indices. Also, in the ".my_last_run" file I expected to see "--- 20190315", but it contains "--- 20190314". Could you please advise me on this? Thanks.

Also, when I create an index pattern in Kibana, for "total_disp_20190314" the field "gdate" appears under "Time Filter field name" and can be selected, but for "total_disp_20190315" the field "gdate" does not appear.


#2
input {
  jdbc {
    jdbc_driver_library => "sqljdbc42.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://25.36.859.333:2588;instanceName=fggrttt;databasename=bbbb"
    jdbc_user => ""
    jdbc_password => ""
    # Every day, at 06:00, 14:00, and 22:00
    schedule => "0 6,14,22 * * *"
    statement => "SELECT [field1], [field2] FROM [TABLE] WHERE date > :sql_last_value"
    use_column_value => true
    tracking_column => "date"
    tracking_column_type => "timestamp"
    clean_run => false
    last_run_metadata_path => "/path_to/.logstash_jdbc_last_run"
  }
}

I'm not sure I understood your need, but with this configuration, at each execution I load the data whose date is greater than sql_last_value.
sql_last_value is the last loaded date, stored in the .logstash_jdbc_last_run file.
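The last-run metadata file is a small YAML document, which is why the thread above sees contents like "--- 20190314". A sketch of how that value can be read back in plain Ruby, assuming a numeric tracking column (with tracking_column_type => "timestamp", a YAML timestamp would be stored instead):

```ruby
require "yaml"

# Hypothetical contents of the last-run metadata file
# (e.g. .logstash_jdbc_last_run or .my_last_run).
metadata = "--- 20190314\n"

# The next scheduled query substitutes this value for :sql_last_value.
sql_last_value = YAML.load(metadata)

puts sql_last_value         # 20190314
puts sql_last_value.class   # Integer
```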


(sahere rahimi) #3

Many thanks for your reply. The issue has been solved.