3-hour time difference between Kibana reports and PostgreSQL database

Hello Elastic forum,
We have noticed a 3-hour time difference when comparing the output of our PostgreSQL database with our Kibana reports, as well as with various requests to the _search endpoint using Query DSL. For example:
1.1. With PostgreSQL, this query:
select "EmployeeId",
to_char(date_trunc('day',"AccessTime"),'dd-Mon-yyyy') as trunc_access_time_day,
min("AccessTime") as min_access_time_day,
max("AccessTime") as max_access_time_day
from public."AccessLogs" f
where "EmployeeId" = 886
group by "EmployeeId", to_char(date_trunc('day',"AccessTime"),'dd-Mon-yyyy')
order by min("AccessTime") desc;
returns this row for 09-Sep-2020:
886 09-Sep-2020 2020-09-09 06:47:18.958518 2020-09-09 08:41:43.799184
1.2. And for the same employee, this Elastic DSL query:

POST /access_logs/_search
{
  "query" : {
    "bool" : {
      "should" : [
        { "term" : { "employeeid" : "886" } },
        { "term" : { "employeetagid" : "894" } }
      ]
    }
  },
  "aggs" : {
    "accesstime_day" : {
      "date_histogram" : {
        "field" : "accesstime",
        "fixed_interval" : "1d"
      },
      "aggs" : {
        "min_accesstime_day" : { "min" : { "field" : "accesstime", "format" : "dd-MMM-yyyy H:m:s" } },
        "max_accesstime_day" : { "max" : { "field" : "accesstime", "format" : "dd-MMM-yyyy H:m:s" } }
      }
    }
  }
}
returns this block:
{
  "key_as_string" : "2020-09-09T00:00:00.000Z",
  "key" : 1599609600000,
  "doc_count" : 2346,
  "max_accesstime_day" : {
    "value" : 1.599630103799E12,
    "value_as_string" : "09-Sep-2020 5:41:43"
  },
  "min_accesstime_day" : {
    "value" : 1.599623238958E12,
    "value_as_string" : "09-Sep-2020 3:47:18"
  }
}
1.3. Therefore, these results are exactly 3 hours apart:
886 09-Sep-2020 2020-09-09 06:47:18.958518 2020-09-09 08:41:43.799184
and 06:47 - 03:47 == 3 hours,
and 08:41 - 05:41 == 3 hours.
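For reference, decoding the epoch values in the DSL response shows they are the same instants as the PostgreSQL rows; 3 hours is exactly the Europe/Bucharest offset in September, when EEST = UTC+3. A quick check in psql, using the min value from the response above:

select to_timestamp(1599623238.958) at time zone 'UTC'              as utc_time,
       to_timestamp(1599623238.958) at time zone 'Europe/Bucharest' as bucharest_time;
-- utc_time:       2020-09-09 03:47:18.958
-- bucharest_time: 2020-09-09 06:47:18.958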

And a further clarification on this: we are using a Logstash data ingest pipeline with a simple JDBC input and the elasticsearch output below:
output {
  elasticsearch {
    #protocol => http
    index => "access_logs"
    document_type => "access_log"
    #document_id => "%{uid}"
    hosts => ["http://127.0.0.1:9200"]
  }
}
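If the AccessTime column is stored without time zone information, the JDBC input treats the values as UTC unless told otherwise. A sketch of the input side with the plugin's jdbc_default_timezone option set, assuming the database values are Bucharest local time (connection settings below are placeholders, not our real ones):

input {
  jdbc {
    # placeholders -- not the real connection settings:
    jdbc_connection_string => "jdbc:postgresql://127.0.0.1:5432/ourdb"
    jdbc_user => "logstash"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => 'SELECT * FROM public."AccessLogs"'
    # tell the plugin the database timestamps are Bucharest local time,
    # so they get converted to UTC on ingest:
    jdbc_default_timezone => "Europe/Bucharest"
  }
}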

Your DSL query is doing an OR filter (should) for two employee IDs: 886 and 894!

{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "employeeid": "886"
          }
        },
        {
          "term": {
            "employeetagid": "894"
          }
        }
      ]
    }
  },
  "aggs": {
    "accesstime_day": {
      "date_histogram": {
        "field": "accesstime",
        "fixed_interval": "1d"
      },
      "aggs": {
        "min_accesstime_day": {
          "min": {
            "field": "accesstime",
            "format": "dd-MMM-yyyy H:m:s"
          }
        },
        "max_accesstime_day": {
          "max": {
            "field": "accesstime",
            "format": "dd-MMM-yyyy H:m:s"
          }
        }
      }
    }
  }
}

Try this query:

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "employeeid": "886"
          }
        }
      ]
    }
  },
  "aggs": {
    "accesstime_day": {
      "date_histogram": {
        "field": "accesstime",
        "fixed_interval": "1d"
      },
      "aggs": {
        "min_accesstime_day": {
          "min": {
            "field": "accesstime",
            "format": "dd-MMM-yyyy H:m:s"
          }
        },
        "max_accesstime_day": {
          "max": {
            "field": "accesstime",
            "format": "dd-MMM-yyyy H:m:s"
          }
        }
      }
    }
  }
}

Hello Yassine, thanks for your reply; however, this new query does not fix the error. Have a look at the output below:
{
  "key_as_string" : "2020-09-09T00:00:00.000Z",
  "key" : 1599609600000,
  "doc_count" : 225,
  "max_accesstime_day" : {
    "value" : 1.599630103799E12,
    "value_as_string" : "09-Sep-2020 5:41:43"
  },
  "min_accesstime_day" : {
    "value" : 1.599623238958E12,
    "value_as_string" : "09-Sep-2020 3:47:18"
  }
}

  1. As I reported last week, these results are exactly 3 hours apart:
    886 09-Sep-2020 2020-09-09 06:47:18.958518 2020-09-09 08:41:43.799184
    and 06:47 - 03:47 == 3 hours,
    and 08:41 - 05:41 == 3 hours.
  2. Also, you said that my DSL query is doing an OR filter (should) for two employee IDs, 886 and 894: if you look closely, one block refers to employeeid and the second block refers to employeetagid, which is not the same thing.
  3. It turns out that employeeid == 886 refers to the same employee as employeetagid == 894, so the original query is correct as well.

I also managed to create another Logstash data ingest pipeline today, using the date filter plugin, and got it running:
filter {
  date {
    match => [ "AccessTime", "yyyy-MM-dd HH:mm:ss" ]
    # set the time zone used:
    timezone => "Europe/Bucharest"
  }
}
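One caveat worth noting with this filter: by default the date plugin writes its parsed result to @timestamp, so as written above it may not touch the accesstime field the aggregations run on at all. A sketch with an explicit target, assuming AccessTime arrives as a string (if the JDBC input already hands it over as a parsed timestamp, the match will simply not apply):

filter {
  date {
    match => [ "AccessTime", "yyyy-MM-dd HH:mm:ss" ]
    # the source values are Bucharest local time:
    timezone => "Europe/Bucharest"
    # without an explicit target, the parsed value only goes to @timestamp:
    target => "accesstime"
  }
}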

But this new data ingest pipeline, using the above date filter, also shows exactly the same problem in Kibana, i.e. the results are again exactly 3 hours apart, just as I reported last week:
886 09-Sep-2020 2020-09-09 06:47:18.958518 2020-09-09 08:41:43.799184
and 06:47 - 03:47 == 3 hours,
and 08:41 - 05:41 == 3 hours.
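If the goal is to see the daily buckets in Bucharest time straight from the console, the date_histogram aggregation accepts a time_zone parameter; a sketch, assuming Elasticsearch 7.x (where calendar_interval is available, and unlike fixed_interval it follows DST changes):

POST /access_logs/_search
{
  "size": 0,
  "query": { "term": { "employeeid": "886" } },
  "aggs": {
    "accesstime_day": {
      "date_histogram": {
        "field": "accesstime",
        "calendar_interval": "day",
        "time_zone": "Europe/Bucharest"
      },
      "aggs": {
        "min_accesstime_day": { "min": { "field": "accesstime", "format": "dd-MMM-yyyy HH:mm:ss" } },
        "max_accesstime_day": { "max": { "field": "accesstime", "format": "dd-MMM-yyyy HH:mm:ss" } }
      }
    }
  }
}

As far as I know this shifts the bucket boundaries and key_as_string to local time, while value_as_string on the min/max sub-aggregations is still rendered in UTC; the underlying epoch values are the same instants as the PostgreSQL rows either way.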

So Yassine, or whoever else is looking at this post, please help us fix this problem: it is very important for us that our Kibana reports match what we see in our PostgreSQL database!
Thanks in advance, Bogdan

Hello everyone, I have done an interesting test with my colleague Adrian today on this issue, and it seems that:

  1. if we continue to query the console using our original DSL query, we still get erroneous results, i.e. the results are 3 hours too early, whereas
  2. if we use Kibana visualizations, we can see these timestamps correctly.

I hope this makes this issue a little easier for you to resolve! Thanks, Bogdan

Hello everyone, this issue has been open for almost a week now: can someone please help us resolve it ASAP?
Thanks & regards, Bogdan

Probably because you did not define the timezone when you imported your data, so Elasticsearch considers it as UTC?

AFAIK Kibana uses the browser timezone to display values (via the dateFormat:tz advanced setting, which defaults to "Browser").

Hello David, thanks for your reply!
I actually did define the timezone when I imported our data, see my post from November 6, quoted below:
I also managed to create another Logstash data ingest pipeline today, using the date filter plugin, and got it running:
filter {
  date {
    match => [ "AccessTime", "yyyy-MM-dd HH:mm:ss" ]
    # set the time zone used:
    timezone => "Europe/Bucharest"
  }
}

Is this what you mean by defining the timezone when importing the data, i.e. using the date filter plugin in Logstash, or are you referring to something else?
greetings from Bucharest, Bogdan

Could you share a document that has been imported? And the mapping?

Hi David, thanks for your reply: sure, I can share the mapping below:
POST /access_logs/_search
{
  "query" : {
    "bool" : {
      "should" : [
        { "term" : { "employeeid" : "886" } },
        { "term" : { "employeetagid" : "894" } }
      ]
    }
  },
  "aggs" : {
    "accesstime_day" : {
      "date_histogram" : {
        "field" : "accesstime",
        "fixed_interval" : "1d"
      },
      "aggs" : {
        "min_accesstime_day" : { "min" : { "field" : "accesstime", "format" : "dd-MMM-yyyy H:m:s" } },
        "max_accesstime_day" : { "max" : { "field" : "accesstime", "format" : "dd-MMM-yyyy H:m:s" } }
      }
    }
  }
}

and you can find the output block below too:
{
  "key_as_string" : "2020-09-09T00:00:00.000Z",
  "key" : 1599609600000,
  "doc_count" : 1077,
  "max_accesstime_day" : {
    "value" : 1.599630103799E12,
    "value_as_string" : "09-Sep-2020 5:41:43"
  },
  "min_accesstime_day" : {
    "value" : 1.599623238958E12,
    "value_as_string" : "09-Sep-2020 3:47:18"
  }
}

So you can see that the above values of 5:41 am and 3:47 am are off by exactly 3 hours from the PostgreSQL query and output below:
select "EmployeeId",
to_char(date_trunc('day',"AccessTime"),'dd-Mon-yyyy') as trunc_access_time_day,
min("AccessTime") as min_access_time_day,
max("AccessTime") as max_access_time_day
from public."AccessLogs" f
where "EmployeeId" = 886
group by "EmployeeId", to_char(date_trunc('day',"AccessTime"),'dd-Mon-yyyy')
order by min("AccessTime") desc;
--> which returns this on development:

886 11-Nov-2020 2020-11-11 08:08:31.63731 2020-11-11 11:49:32.711689
886 10-Nov-2020 2020-11-10 08:08:31.63731 2020-11-10 11:49:32.711689
886 09-Sep-2020 2020-09-09 06:47:18.958518 2020-09-09 08:41:43.799184
886 08-Sep-2020 2020-09-08 06:51:56.094963 2020-09-08 12:05:26.24529
886 07-Sep-2020 2020-09-07 06:55:52.827203 2020-09-07 10:32:14.576888
886 06-Sep-2020 2020-09-06 08:23:27.511487 2020-09-06 11:47:18.294693
886 05-Sep-2020 2020-09-05 07:44:35.602995 2020-09-05 12:13:50.369206
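One thing worth confirming on the PostgreSQL side is whether AccessTime is a timestamp without time zone column; if it is, the values carry no zone information, and whatever ingests them has to be told they are Bucharest local time. A quick check:

select column_name, data_type
from information_schema.columns
where table_schema = 'public'
  and table_name = 'AccessLogs'
  and column_name = 'AccessTime';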

I asked for one of the documents, not the result of an aggregation.
Could you provide this please?

I also asked for the mapping.
Could you provide it please?

Hi David, I have only used Elasticsearch for 2.5 months now, and so far I have never had to dig documents or mappings out of ES: can you please tell me how to get these out of Elasticsearch, so we can proceed? Thanks!

Run:

GET /access_logs/_search
{
  "size": 1,
  "query" : {
    "bool" : {
      "should" : [
        { "term" : { "employeeid" : "886" } },
        { "term" : { "employeetagid" : "894" } }
      ]
    }
  }
}

And

GET /access_logs/_mapping
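For reference, the thing to look at in the _mapping output is how the accesstime field is mapped; for the date aggregations above to work, it should be a date field, along these lines (illustrative only, not necessarily the actual mapping):

{
  "access_logs" : {
    "mappings" : {
      "properties" : {
        "accesstime" : { "type" : "date" }
      }
    }
  }
}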