Does Elasticsearch expect the date field to be in increasing order?
I am sending documents from Python using elasticsearch-py. The documents are sent in bursts (a batch of documents at random intervals, on average about 100 documents per second), and I'm using the index API to tell Elasticsearch to index each one of them (I'm using index and not create because I want an automatically generated ID).
These documents have a date field (in epoch_millis). The point is: the dates are increasing within a burst, but they aren't across different bursts.
For example, I may send a burst of 10 documents now, whose dates are 1000000000011, 1000000000012, ..., 1000000000020 (imagine these are increasing epoch_millis values that make sense). Then, 1 second later, I send another burst whose dates are 1000000000001, 1000000000002, etc. In other words, Elasticsearch receives documents whose dates are earlier than the dates of documents it has already received and indexed.
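To make the scenario concrete, here is a minimal sketch of the sending pattern (not my actual code). The helper names `make_burst`, `send_burst`, and `to_utc`, the index name, and the `date`/`message` field names are all hypothetical. The `document=` keyword assumes a recent elasticsearch-py; older versions take `body=` instead.

```python
from datetime import datetime, timedelta, timezone


def make_burst(start_ms, count):
    """Build one burst of documents whose dates increase by 1 ms each."""
    return [{"date": start_ms + i, "message": f"doc {i}"} for i in range(count)]


def send_burst(client, index_name, docs):
    """Index each document on its own; no id is passed, so one is auto-generated."""
    for doc in docs:
        # Assumption: recent elasticsearch-py; older versions use body=doc.
        client.index(index=index_name, document=doc)


def to_utc(epoch_ms):
    """Convert epoch milliseconds to an aware UTC datetime (integer arithmetic)."""
    seconds, millis = divmod(epoch_ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


# First burst: dates 1000000000011 .. 1000000000020
first = make_burst(1000000000011, 10)
# Second burst, sent about a second later, but with EARLIER dates
second = make_burst(1000000000001, 10)

# Usage (hypothetical client and index name):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch("http://localhost:9200")
#   send_burst(es, "my-index", first)
#   send_burst(es, "my-index", second)
```

Within each burst the dates increase, but every date in `second` is earlier than every date in `first`, which is exactly the ordering I am asking about.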
My problem is that these documents are indexed very strangely. They end up lumped together seemingly at random. Sometimes they have the right date, sometimes they are all indexed in the wrong time bin, and sometimes I don't get any documents in Elasticsearch for time ranges in which I should see documents.
I'm not sure whether the problem is caused by how Elasticsearch indexes documents by date, by some obscure problem in elasticsearch-py, or by some other low-level issue like network congestion. To me it just seems that Elasticsearch is getting confused when indexing all these documents.
Note: if I send one document alone (for example via curl), it is indexed correctly.