I am having an issue with Elasticsearch when querying. My Python application queries Elasticsearch about every 10 seconds, looking for new messages that it has not yet processed. Once in a while the application crashes because Elasticsearch skips/misses messages in the query results. The full logs are too long to post here due to the length limit, so here is a pastebin link: https://pastebin.com/b6AkMeGU.
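For context, the polling code boils down to something like this (a simplified sketch using the elasticsearch-py client; the variable names and the bookkeeping around the last processed message are illustrative, but the query body mirrors the one you can see in the DEBUG lines below):

```python
import time
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://192.168.1.233:9200"])

# Bookkeeping for the last message that was processed
# (illustrative values, taken from the logged query below).
last_timestamp = "2017-09-27T14:50:41.656Z"
last_id = "AV7D0YSHptardQ4PUQCb"

while True:
    body = {
        "size": 100,
        "query": {
            "bool": {
                "must": {"range": {"@timestamp": {"gte": last_timestamp}}},
                "filter": {"exists": {"field": "Message_Type"}},
                "must_not": [{"term": {"_id": last_id}}],
            }
        },
        "sort": [
            {"@timestamp": {"order": "asc"}},
            {"Fractional_Sec": {"order": "asc"}},
        ],
    }

    result = es.search(body=body)

    for hit in result["hits"]["hits"]:
        src = hit["_source"]
        # These are the DEBUG lines you see in the logs below.
        print(src["@timestamp"], src["Fractional_Sec"], hit["_id"])
        # ... process the message, then update last_timestamp / last_id ...

    time.sleep(10)
```

Here is the relevant excerpt from my most recent crash: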
2017-09-27 10:51:26.361 DEBUG:http://192.168.1.233:9200 "GET /_search HTTP/1.1" 200 52753
2017-09-27 10:51:26.430 INFO:GET http://192.168.1.233:9200/_search [status:200 request:5.464s]
2017-09-27 10:51:26.430 DEBUG:> {"size": 100, "query": {"bool": {"must": {"range": {"@timestamp": {"gte": "2017-09-27T14:50:41.656Z"}}}, "filter": {"exists": {"field": "Message_Type"}}, "must_not": [[{"term": {"_id": "AV7D0YSHptardQ4PUQCb"}}]]}}, "sort": {"@timestamp": {"order": "asc"}, "Fractional_Sec": {"order": "asc"}}}
2017-09-27 10:51:26.432 DEBUG:Query successfully pulled
2017-09-27 10:51:26.432 DEBUG:2017-09-27T14:51:19.589Z 0.589002 AV7D0hi2ptardQ4PUQCc
2017-09-27 10:51:26.433 DEBUG:2017-09-27T14:51:23.989Z 0.989352 AV7D0i7rptardQ4PUQCe
The last two lines of the log are the messages my application got back from Elasticsearch; for each hit I print the @timestamp field, my Fractional_Sec field, and the message _id. The application crashes because there is supposed to be a third message in this pull, right between the first and the second.
Just seconds later, I issue the same query, and this time the missing message is returned:
2017-09-27 10:52:21.567 DEBUG:http://192.168.1.233:9200 "GET /_search HTTP/1.1" 200 60152
2017-09-27 10:52:21.646 INFO:GET http://192.168.1.233:9200/_search [status:200 request:5.540s]
2017-09-27 10:52:21.646 DEBUG:> {"size": 100, "query": {"bool": {"must": {"range": {"@timestamp": {"gte": "2017-09-27T14:50:41.656Z"}}}, "filter": {"exists": {"field": "Message_Type"}}, "must_not": [[{"term": {"_id": "AV7D0YSHptardQ4PUQCb"}}]]}}, "sort": {"@timestamp": {"order": "asc"}, "Fractional_Sec": {"order": "asc"}}}
2017-09-27 10:52:21.648 DEBUG:2017-09-27T14:51:19.589Z 0.589002 AV7D0hi2ptardQ4PUQCc
2017-09-27 10:52:21.648 DEBUG:2017-09-27T14:51:21.102Z 0.102055 AV7D0iN0ptardQ4PUQCd
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:23.989Z 0.989352 AV7D0i7rptardQ4PUQCe
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:24.354Z 0.354648 AV7D0i7sptardQ4PUQCg
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:26.899Z 0.899377 AV7D0jVCptardQ4PUQCi
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:27.252Z 0.252254 AV7D0jajptardQ4PUQCl
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:51:32.958Z 0.958178 AV7D0kztptardQ4PUQCn
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:11.023Z 0.02344 AV7D0uGhptardQ4PUQCo
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:13.055Z 0.055302 AV7D0umOptardQ4PUQCp
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:13.385Z 0.385317 AV7D0urYptardQ4PUQCr
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:13.707Z 0.707511 AV7D0uwaptardQ4PUQCt
2017-09-27 10:52:21.649 DEBUG:2017-09-27T14:52:14.030Z 0.030358 AV7D0u1dptardQ4PUQCw
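To make the gap concrete, here is a quick comparison of the _ids the two pulls returned for the overlapping time window (values taken straight from the log lines above):

```python
# _ids returned in the first pull (10:51:26) for the window up to 14:51:24
first_pull = {"AV7D0hi2ptardQ4PUQCc", "AV7D0i7rptardQ4PUQCe"}

# _ids returned in the second pull (10:52:21) for the same window
second_pull = {"AV7D0hi2ptardQ4PUQCc", "AV7D0iN0ptardQ4PUQCd", "AV7D0i7rptardQ4PUQCe"}

print(second_pull - first_pull)  # -> {'AV7D0iN0ptardQ4PUQCd'}
```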
As you can see, AV7D0iN0ptardQ4PUQCd should have been in the first pull, but it is not. I have been troubleshooting this on my end for over a week and I do not think I am doing anything wrong. If anyone can help, that would be great.