Wrong events order if time difference is less than 1ms

alexandrpaliy · September 23, 2021, 9:41am

Hi.

I use version 7.13.4 of all stack (ELK + filebeats).
Filebeat grabs events (auth.log entries in this case) on remote host, then passes them to logstash to filter/modify data, then events go to Elasticsearch. Talking about Logstash, I use aggregate plugin pretty extensively to populate events with required data, so I heavily rely on correct events order in same sequence.

Problem:
If two events come during 1 millisecond period, there is a chance they will end up in wrong order, which prevents aggregate plugin to populate data correctly.

Input data example:

2021-09-23T10:26:19.980107+03:00 my-test-server-hostname pluto[1284082]: "xauth-psk"[18373] 184.105.176.47 #23495: responding to Main Mode from unknown peer 184.105.176.47 on port 54209
2021-09-23T10:26:33.856101+03:00 my-test-server-hostname pluto[1284082]: "xauth-psk"[18374] 184.105.176.47 #23495: PAM: #23495: completed for user 'testusername' with status SUCCESSS
2021-09-23T10:26:35.022162+03:00 my-test-server-hostname pluto[1284082]: "xauth-psk"[18374] 184.105.176.47 #23495: the peer proposed: 0.0.0.0/0:0/0 -> 192.168.43.123/32:0/0
2021-09-23T10:26:35.022643+03:00 my-test-server-hostname pluto[1284082]: "xauth-psk"[18374] 184.105.176.47 #23498: responding to Quick Mode proposal {msgid:bb095b40}
2021-09-23T10:26:35.023138+03:00 my-test-server-hostname pluto[1284082]: "xauth-psk"[18374] 184.105.176.47 #23498:   them: 184.105.176.47[192.168.1.101,+MC+XC+S=C]===192.168.43.123/32
2021-09-23T10:27:05.045830+03:00 my-test-server-hostname pluto[1284082]: "xauth-psk"[18374] 184.105.176.47 #23495: deleting state (STATE_MODE_CFG_R1) aged 45.066s and sending notification
2021-09-23T10:27:39.036857+03:00 my-test-server-hostname pluto[1284082]: "xauth-psk"[18374] 184.105.176.47 #23498: deleting state (STATE_QUICK_R1) aged 64.013s and sending notification
2021-09-23T10:27:39.036955+03:00 my-test-server-hostname pluto[1284082]: "xauth-psk"[18374] 184.105.176.47 #23498: ESP traffic information: in=0B out=0B XAUTHuser=testusername

But if I check same events in kibana:

I have tried to do my research. I have

pipeline.workers: 1
pipeline.ordered: true

in my logstash pipeline configuration from the very beginning of using aggregate. Also, I know that logstash does not support time precision more than 1ms, so, at least for kibana and Elasticsearch, I'm using some kind of workaround:

grok {
    match => { "message" => '^%{SYSLOGTIMESTAMP:timestamp_string} %{DATA:hostname} ... ' }
#   I get "raw timestamp text" as `timestamp_string` field ...
}

date {
    timezone => "UTC"
    match => ["timestamp_string", "ISO8601"]
#   ... then I generate logstash own `@timestamp` field based on `timestamp_string` ...
}

mutate {
    copy => { "timestamp_string" => "timestamp" }
#   ... and then I copy same string to `timestamp` field
}

(and then, in Elasticsearch index mapping, I set "timestamp" to "date_nanos" format, so, when saving to elastic index, it will be parsed/stored correctly + in kibana index pattern I use "timestamp" as a time field)

As far as I understand, this problem exists because I use logstash date filter plugin - I guess, without date plugin (i.e. without rewriting events' "@timestamp" field) logstash could possibly save "natural events order". But, without date plugin, I risk to get events heavily out-of-order in case of some logstash/filebeat/network temporary problems.

Right now, at this moment, I'm experimenting and have disabled date plugin at all - I want to test, will it work correctly indeed.
But, even if it'll end up working correctly, I'm still curious is it possible to make it work with date plugin (so logstash will populate it's "@timestamp" with a value from event, but will still respect the order of events). In other words, is there any workaround for logstash to support time precision higher than 1ms by itself.

alexandrpaliy · September 23, 2021, 2:10pm

And... Looks like it works
What's interesting, is that order of events in Kibana is partially wrong (no matter if I try sorting them by my "timestamp" field, or not), but Logstash, on the other hand, is working correctly so far (I don't see any cases of my problem with aggregate plugin since the beginning of the experiment).

Anyway, I'm still interested in this:

system · October 21, 2021, 2:11pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Logstash events out of order Logstash	3	2084	April 29, 2020
How to parse logs in same order as in log files Kibana	5	2124	July 6, 2018
Logstash @timestamp precision causing misordering of data Logstash	5	1773	March 17, 2017
Timestamp from logfile lacks millisec Logstash	5	1810	July 6, 2017
Dateparsefailure while replacing @timestamp Logstash	19	3462	July 12, 2017

Wrong events order if time difference is less than 1ms

Related topics