Elasticsearch Missing Messages on Query

(James Crowley) #1

I am having an issue with Elasticsearch when querying. My Python application queries Elasticsearch about every 10 seconds, looking for new messages that it has not yet processed. Once in a while the application crashes because Elasticsearch skips/omits messages in the query results. Here is my most recent crash. I can't post the full logs here due to the length limit, so here is a pastebin link: https://pastebin.com/b6AkMeGU.

2017-09-27 10:51:26.361 DEBUG: "GET /_search HTTP/1.1" 200 52753
2017-09-27 10:51:26.430 INFO:GET [status:200 request:5.464s]
2017-09-27 10:51:26.430 DEBUG:> {"size": 100, "query": {"bool": {"must": {"range": {"@timestamp": {"gte": "2017-09-27T14:50:41.656Z"}}}, "filter": {"exists": {"field": "Message_Type"}}, "must_not": [[{"term": {"_id": "AV7D0YSHptardQ4PUQCb"}}]]}}, "sort": {"@timestamp": {"order": "asc"}, "Fractional_Sec": {"order": "asc"}}}
2017-09-27 10:51:26.432 DEBUG:Query successfully pulled
2017-09-27 10:51:26.432 DEBUG:2017-09-27T14:51:19.589Z 0.589002 AV7D0hi2ptardQ4PUQCc
2017-09-27 10:51:26.433 DEBUG:2017-09-27T14:51:23.989Z 0.989352 AV7D0i7rptardQ4PUQCe

The last two lines of the log are the messages my application got from Elasticsearch. It crashes because there is supposed to be a third message in this pull, right between the first and the second. I have printed the @timestamp field, my Fractional_Sec field, and the message ID.

Just seconds later, I issued the same query and got the missing message.

2017-09-27 10:52:21.567 DEBUG: "GET /_search HTTP/1.1" 200 60152
2017-09-27 10:52:21.646 INFO:GET [status:200 request:5.540s]
2017-09-27 10:52:21.646 DEBUG:> {"size": 100, "query": {"bool": {"must": {"range": {"@timestamp": {"gte": "2017-09-27T14:50:41.656Z"}}}, "filter": {"exists": {"field": "Message_Type"}}, "must_not": [[{"term": {"_id": "AV7D0YSHptardQ4PUQCb"}}]]}}, "sort": {"@timestamp": {"order": "asc"}, "Fractional_Sec": {"order": "asc"}}}
2017-09-27 10:52:21.648 DEBUG:2017-09-27T14:51:19.589Z 0.589002 AV7D0hi2ptardQ4PUQCc
2017-09-27 10:52:21.648 DEBUG:2017-09-27T14:51:21.102Z 0.102055 AV7D0iN0ptardQ4PUQCd
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:23.989Z 0.989352 AV7D0i7rptardQ4PUQCe
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:24.354Z 0.354648 AV7D0i7sptardQ4PUQCg
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:26.899Z 0.899377 AV7D0jVCptardQ4PUQCi
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:27.252Z 0.252254 AV7D0jajptardQ4PUQCl
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:32.958Z 0.958178 AV7D0kztptardQ4PUQCn
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:11.023Z 0.02344 AV7D0uGhptardQ4PUQCo
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:13.055Z 0.055302 AV7D0umOptardQ4PUQCp
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:13.385Z 0.385317 AV7D0urYptardQ4PUQCr
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:13.707Z 0.707511 AV7D0uwaptardQ4PUQCt
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:14.030Z 0.030358 AV7D0u1dptardQ4PUQCw

Notice that AV7D0iN0ptardQ4PUQCd should have been in the first pull but was not. I have been troubleshooting this on my end for over a week, and I do not think I am doing anything wrong. If anyone can help, that would be great.
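For reference, the query body in the DEBUG lines above can be reconstructed in Python roughly as follows. This is a sketch: the helper name and parameters are mine, not from the application; the sort is written as a list (the standard form), whereas the log shows it as a single object.

```python
def build_poll_query(since_ts, seen_ids, size=100):
    """Build the search body the application appears to send each poll cycle.

    since_ts: ISO-8601 timestamp string, e.g. "2017-09-27T14:50:41.656Z"
    seen_ids: IDs already processed, excluded via must_not term clauses
              (the DEBUG line shows these wrapped in an extra list level,
              [[...]]; a flat list is the conventional form)
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Only messages at or after the cursor timestamp...
                "must": {"range": {"@timestamp": {"gte": since_ts}}},
                # ...that actually have a Message_Type field...
                "filter": {"exists": {"field": "Message_Type"}},
                # ...and that have not already been processed.
                "must_not": [{"term": {"_id": i}} for i in seen_ids],
            }
        },
        # Oldest first, tie-broken by the sub-second fraction.
        "sort": [
            {"@timestamp": {"order": "asc"}},
            {"Fractional_Sec": {"order": "asc"}},
        ],
    }
```

With the official client this would be issued as something like `es.search(body=build_poll_query(cursor, seen_ids))`.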

(James Crowley) #2

I used a diff checker on the results that came back from Elasticsearch. The only differences are the number of results (2 vs. 12), the missing message, AV7D0iN0ptardQ4PUQCd, and all the additional messages in the second result. The queries I issued are exactly the same; the only thing that changed was time. Could Elasticsearch be processing the messages out of order?

(James Crowley) #3

I looked at my Elasticsearch setup. I wanted to make sure shards were not somehow interfering with the messages. I am currently running on only one shard. I am starting to wonder whether it has to do with the order in which Logstash sends the messages to Elasticsearch. Is it possible for Logstash to process messages out of order? If so, how do I enforce the order?

(James Crowley) #4

I am kind of stuck here. Can anyone help?

(Mark Walkom) #5

Elasticsearch and Logstash have no concept of order in this context, they just process things as they get them.
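This is also consistent with Elasticsearch search being near-real-time: documents only become searchable after a refresh (1 second by default), so a document indexed just before a query can be missing from that query and appear in the next one. One way to tolerate this on the client side, sketched below with a hypothetical helper of my own naming, is to keep the `gte` window overlapping previous polls and de-duplicate by `_id`, rather than assuming the last hit seen is a safe cursor.

```python
def merge_new_hits(hits, seen_ids):
    """Return only hits not yet processed, in the order received.

    Because documents can become searchable later than their @timestamp
    suggests, the caller should re-query an overlapping time window and
    use this to skip messages it has already handled.

    hits: list of hit dicts (each with an "_id" key), as found in
          response["hits"]["hits"]
    seen_ids: set of already-processed IDs, updated in place
    """
    fresh = []
    for hit in hits:
        if hit["_id"] not in seen_ids:
            seen_ids.add(hit["_id"])
            fresh.append(hit)
    return fresh
```

The trade-off is bounded memory for `seen_ids` versus correctness; in practice the set can be pruned to IDs younger than the query window.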

(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.