Documents order in Elasticsearch after processing with filebeat/logstash

njkl · August 5, 2017, 1:57pm

Hi,

I have the following setup to gather data: filebeat -> logstash -> elasticsearch

As input I have files with the following plain text format :

2017-07-25 10:36:07,988 21 User 1 Ligne 0
2017-07-25 10:36:07,988 21 User 1 Ligne 1
2017-07-25 10:36:07,988 21 none 1 Ligne 2

Once the file is harvested by filebeat, it goes through logstash, where I map the @timestamp field. But when I query Elasticsearch, I get a different order than the one in the source log file (log 2, log 0, log 1 instead of log 0, log 1, log 2).

The timestamp field is not precise enough: I will always get more than one document with the exact same timestamp.

How can I solve this issue? I would like to be able to get the documents from Elasticsearch in the same order as they are in the log file.

Regards

magnusbaeck · August 5, 2017, 3:28pm

Perhaps you can use the file offset as a secondary sort key? I believe Filebeat records it in the offset field.
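
To illustrate the idea outside of Elasticsearch, here is a minimal Python sketch of a two-level sort: timestamp first, with the offset used only to break ties. The documents and byte offsets below are made up for illustration; only the field names @timestamp and offset come from this thread.

```python
# Hypothetical documents, as they might come back in arbitrary order.
# The offset values are invented for illustration.
docs = [
    {"@timestamp": "2017-07-25T10:36:07.988Z", "offset": 84, "message": "Ligne 2"},
    {"@timestamp": "2017-07-25T10:36:07.988Z", "offset": 0, "message": "Ligne 0"},
    {"@timestamp": "2017-07-25T10:36:07.988Z", "offset": 42, "message": "Ligne 1"},
]


def sort_key(doc):
    # Primary key: the timestamp. Secondary key: the byte offset in the
    # source file, which only matters when timestamps are identical.
    return (doc["@timestamp"], doc["offset"])


ordered = sorted(docs, key=sort_key)
messages = [d["message"] for d in ordered]
print(messages)
```

The same effect in a search request would come from listing both fields in the sort clause, timestamp first.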

njkl · August 6, 2017, 12:08pm

You are right: if I use both @timestamp and the offset, the result seems to be good. But I have one more question. Do you know how I can merge @timestamp and offset into a new field and use it as a single sorting criterion? (I suppose that is better when it comes to reindexing and performance.)

update :
I tried this

add_field => { "timestamp_sort" => "%{@timestamp}+%{offset}" }

But as a result I got this:

"timestamp_sort": "2017-08-06T08:36:07.988Z+53"

magnusbaeck · August 6, 2017, 6:49pm

You'd have to use a ruby filter to do that, but I don't think just adding the file offset to the timestamp expressed in milliseconds (if that's what you meant) is a good idea, since entries right after a rotation could very well sort prior to entries just before the rotation. You should figure out something else that doesn't have that problem.
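
A small Python sketch of the rotation problem, with made-up numbers: the last line of the old file sits at a large offset, while the first line of the new file is written slightly later at offset 0. The function name summed_key and all values are hypothetical.

```python
def summed_key(timestamp_ms, offset):
    # The approach being warned against: a single number built by
    # adding the file offset to the timestamp in milliseconds.
    return timestamp_ms + offset


# Hypothetical last line before rotation, far into the old file.
before_rotation = summed_key(timestamp_ms=1_500_000_000_000, offset=50_000)

# Hypothetical first line of the fresh file, written 10 ms later at offset 0.
after_rotation = summed_key(timestamp_ms=1_500_000_000_010, offset=0)

# The later entry ends up with the smaller key, so it would sort first.
wrong_order = after_rotation < before_rotation
print(before_rotation, after_rotation, wrong_order)
```

Any offset larger than the time gap in milliseconds is enough to trigger this, which is why the offset only works as a tie-breaker within one file, not as an addend.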

njkl · August 22, 2017, 5:10pm

Hi Magnus,

Thank you for your feedback. So far, I was able to figure out the following :

Declare timestamp_sort as a date in the mapping of the index in Elasticsearch:

"timestamp_sort" : {
  "type" : "date",
  "format" : "yyyy-MM-dd HH:mm:ss,SSSSSSSSSSSSSSSS"
}

In logstash, simply concatenate the timestamp and the offset (concatenating, not adding):

add_field => { "timestamp_sort" => "%{@timestamp}%{offset}" }

When pulling data from ELK, the sorting will be based on the new field (asc) + file name (desc).

This seems to be working; at least I haven't seen any issue so far.

Regards

system · September 19, 2017, 5:10pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
