Hello Elastic forum,
We have noticed a 3-hour time difference in our results when comparing the output of our PostgreSQL database, our Kibana reports, and various requests to the _search endpoint with Query DSL. For example, see below:
1.1. with PostgreSQL this query below:
select "EmployeeId",
to_char(date_trunc('day',"AccessTime"),'dd-Mon-yyyy') as trunc_access_time_day,
min("AccessTime") as min_access_time_day,
max("AccessTime") as max_access_time_day
from public."AccessLogs" f
where "EmployeeId" = 886
group by "EmployeeId", to_char(date_trunc('day',"AccessTime"),'dd-Mon-yyyy')
order by min("AccessTime") desc;
returns this row below for 09.09.2020:
886 09-Sep-2020 2020-09-09 06:47:18.958518 2020-09-09 08:41:43.799184
1.2. and for the same employee, this Elastic DSL query returns this below:
And a further clarification on this: we are using a Logstash data ingest pipeline with a simple JDBC input and an elasticsearch output, shown below:
output {
  elasticsearch {
    # protocol => http
    index => "access_logs"
    document_type => "access_log"
    # document_id => "%{uid}"
    # hosts => ["http://127.0.0.1:9200"]
    hosts => ["http://127.0.0.1:9200"]
  }
}
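As a side note (an assumption on my part, since the Elasticsearch version is not stated in the thread): from Elasticsearch 7.x onward mapping types are deprecated, so the `document_type` option would normally be dropped from the elasticsearch output, e.g.:

```
output {
  elasticsearch {
    hosts => ["http://127.0.0.1:9200"]
    index => "access_logs"
  }
}
```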
Hello Yassine, thanks for your reply; however, this new query does not fix the error. Have a look at the output below:
{
  "key_as_string" : "2020-09-09T00:00:00.000Z",
  "key" : 1599609600000,
  "doc_count" : 225,
  "max_accesstime_day" : {
    "value" : 1.599630103799E12,
    "value_as_string" : "09-Sep-2020 5:41:43"
  },
  "min_accesstime_day" : {
    "value" : 1.599623238958E12,
    "value_as_string" : "09-Sep-2020 3:47:18"
  }
}
So, as I reported last week, these results are exactly 3 hours apart:
886 09-Sep-2020 2020-09-09 06:47:18.958518 2020-09-09 08:41:43.799184
06:47 - 03:47 == 3 hours, and
08:41 - 05:41 == 3 hours.
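The two epoch-millisecond "value" fields in the aggregation response can be decoded directly, and they turn out to be the very same instants as the PostgreSQL timestamps: Elasticsearch stores dates in UTC, and in September Europe/Bucharest is on EEST (UTC+3), which is exactly the 3-hour gap. A minimal Python sketch, using the values copied from the response above:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# "value" fields from the min/max sub-aggregations above, in epoch milliseconds
for ms in (1599623238958, 1599630103799):
    utc = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
    local = utc.astimezone(ZoneInfo("Europe/Bucharest"))
    print(utc.strftime("%H:%M:%S"), "UTC =", local.strftime("%H:%M:%S"), "Europe/Bucharest")
# → 03:47:18 UTC = 06:47:18 Europe/Bucharest
# → 05:41:43 UTC = 08:41:43 Europe/Bucharest
```

In other words, nothing is lost or shifted in the stored data; the DSL response is simply rendered in UTC.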
Also, you said that my DSL query is doing an OR filter (should) for two employee ids, 886 and 894: if you look closely, one block refers to employeeid and the second block refers to employeetagid, which is not the same thing.
3.2. It turns out that employeeId == 886 refers to the same employee as employeetagid == 894, so this is correct as well.
I also managed to create another data ingest pipeline with Logstash today, by using the date filter plugin, and I got it working correctly:
filter {
  date {
    match => [ "AccessTime", "yyyy-MM-dd HH:mm:ss" ]
    # set the time zone used:
    timezone => "Europe/Bucharest"
  }
}
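For what it's worth, the date filter's timezone option only tells Logstash that the incoming AccessTime strings are local Bucharest times without an offset; the event timestamp it then stores is the corresponding UTC instant, since Elasticsearch always keeps dates in UTC. The equivalent conversion, sketched in Python:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# What the date filter effectively does with one AccessTime string:
naive = datetime.strptime("2020-09-09 06:47:18", "%Y-%m-%d %H:%M:%S")
local = naive.replace(tzinfo=ZoneInfo("Europe/Bucharest"))  # interpret as local time
utc = local.astimezone(ZoneInfo("UTC"))                     # what gets stored
print(utc.isoformat())  # → 2020-09-09T03:47:18+00:00
```

So the filter is working as designed; it changes how the input is interpreted, not how raw DSL responses are displayed.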
But this new data ingest pipeline, using the above date filter, also shows exactly the same problem in Kibana, i.e. the results are again exactly 3 hours apart:
2. So, as I reported last week, these results are exactly 3 hours apart:
886 09-Sep-2020 2020-09-09 06:47:18.958518 2020-09-09 08:41:43.799184
06:47 - 03:47 == 3 hours, and
08:41 - 05:41 == 3 hours.
So Yassine, or whoever else is looking at this post, please help us fix this problem: it is very important for us that our Kibana reports match what we see in our PostgreSQL database!
thanks in advance, Bogdan
Hello everyone, I did an interesting test with my colleague Adrian today on this issue, and it seems that:
if we continue to query the console using our original DSL query, we still get erroneous results, i.e. the results are 3 hours too early, whereas:
if we use Kibana visualizations, then we can see these timestamps correctly:
I hope this makes this issue a little easier for you to resolve! Thanks, Bogdan
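That observation is consistent with Kibana converting UTC timestamps to the browser's timezone for display, while a raw _search request returns everything in UTC unless told otherwise. A hedged sketch of such a request (the field name AccessTime, the term filter on employeeid, and the aggregation names are my guesses based on the response shown earlier; on Elasticsearch versions before 7.2 the parameter is `interval` rather than `calendar_interval`), passing `time_zone` so that daily buckets follow Europe/Bucharest days:

```json
{
  "size": 0,
  "query": { "term": { "employeeid": 886 } },
  "aggs": {
    "per_day": {
      "date_histogram": {
        "field": "AccessTime",
        "calendar_interval": "day",
        "time_zone": "Europe/Bucharest"
      },
      "aggs": {
        "min_accesstime_day": { "min": { "field": "AccessTime" } },
        "max_accesstime_day": { "max": { "field": "AccessTime" } }
      }
    }
  }
}
```

Note that `time_zone` shifts the bucket boundaries and key_as_string; the min/max values themselves are still UTC epoch milliseconds, so they need a client-side conversion (or a timezone-aware display layer such as Kibana) to show Bucharest wall-clock times.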
Hello David, thanks for your reply!
I actually did define the timezone when I imported our data; see my post from November 6 below:
I also managed to create another data ingest pipeline with Logstash today, by using the date filter plugin, and I got it working correctly:
filter {
  date {
    match => [ "AccessTime", "yyyy-MM-dd HH:mm:ss" ]
    # set the time zone used:
    timezone => "Europe/Bucharest"
  }
}
Is this what you mean by defining the timezone when importing the data, i.e. using the Logstash date filter plugin, or are you referring to something else?
greetings from Bucharest, Bogdan
So you can see that the above values of 5:41 am and 3:47 am are off by exactly 3 hours from the output of the PostgreSQL query below:
select "EmployeeId",
to_char(date_trunc('day',"AccessTime"),'dd-Mon-yyyy') as trunc_access_time_day,
min("AccessTime") as min_access_time_day,
max("AccessTime") as max_access_time_day
from public."AccessLogs" f
where "EmployeeId" = 886
group by "EmployeeId", to_char(date_trunc('day',"AccessTime"),'dd-Mon-yyyy')
order by min("AccessTime") desc;
--> which returns this on development:
Hi David, I have only used Elasticsearch for 2.5 months now, and so far I have never had to dig documents or mappings out of ES. Can you please tell me how to get these out of Elasticsearch, so we can proceed? Thanks!
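In case it helps while waiting for David's answer: the usual way is the _mapping and _search REST endpoints, e.g. with curl against the host from the Logstash output config (assuming the cluster is reachable at 127.0.0.1:9200 and the index is access_logs, as in the config posted earlier):

```shell
# Field mappings for the index (shows how AccessTime is typed and formatted):
curl -s 'http://127.0.0.1:9200/access_logs/_mapping?pretty'

# One sample stored document, to see the raw indexed AccessTime value:
curl -s 'http://127.0.0.1:9200/access_logs/_search?size=1&pretty'
```

If the stored value ends in Z (e.g. 2020-09-09T03:47:18.958Z), the document is in UTC as expected and the 3-hour gap is purely a display-time concern.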