Memory leak? - constant memory use increment every day


(David Earl) #1

Twice in the last two months, the elasticsearch process has been killed by the OS oom-killer, so I've been monitoring things a bit more recently.

I'm running a single shard on a Raspberry Pi (so 1GB total, 4 cores, no swap, Raspbian Stretch). Currently ES 6.1.1. I set both the initial and maximum heap to 512m in the JVM options, and it starts up with top showing 75.6% of memory allocated to the ES Java process, which as you'd expect is by far the largest memory user. I know there is stack and code memory consumption on top of the heap, but it seems quite a lot to start with compared with the requested heap. However, the real problem seems to be that the process grows at a pretty constant rate, about 0.6% of memory per day, so it inevitably crashes sooner or later. It's been running for 2 years or so quite happily in this configuration, and the memory problems have only appeared fairly recently, so either it is to do with a recent version, or something I've changed in how I use it is provoking this.

When it crashes, it is usually during a backup. I guess that is either because the backup has to touch every record, or because it is a relatively long process and anything else that starts while it is running just tips it over. So I don't think the backup itself is to blame.

I could, of course, proactively restart ES every week or so, say, to preempt problems. However, it would be good to locate the cause of the problem. It has all the hallmarks of a memory leak.

Perhaps it might be better to set the heap memory lower so it runs out of heap before the OS runs out of memory, though I think the consequences would ultimately be the same. But as it presumably caches lots of stuff in memory, it makes sense to run at near memory capacity for performance reasons; and in any case, if it steadily adds 6MB or so a day, sooner or later it would run out anyway.
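For reference, the figures above can be turned into a rough estimate of how long the node can run before memory is exhausted. This is only a back-of-envelope sketch: it assumes the growth stays linear and that the percentages reported by top are relative to the full 1GB. The function names are made up for illustration.

```python
TOTAL_MB = 1024            # 1GB Raspberry Pi, as described above
START_PCT = 75.6           # share of RAM top shows for the ES process at startup
GROWTH_PCT_PER_DAY = 0.6   # observed growth per day


def growth_mb_per_day(total_mb: float, growth_pct: float) -> float:
    """Convert a daily growth rate in percent of RAM into MB per day."""
    return total_mb * growth_pct / 100.0


def days_until(limit_pct: float, start_pct: float, growth_pct: float) -> float:
    """Days of linear growth before the process reaches limit_pct of RAM."""
    if growth_pct <= 0:
        raise ValueError("growth rate must be positive")
    return max(0.0, (limit_pct - start_pct) / growth_pct)


daily_mb = growth_mb_per_day(TOTAL_MB, GROWTH_PCT_PER_DAY)
days_to_full = days_until(100.0, START_PCT, GROWTH_PCT_PER_DAY)
print(f"{daily_mb:.1f} MB/day, about {days_to_full:.0f} days to reach 100% of RAM")
```

With these numbers the growth works out to about 6 MB per day and roughly 40 days of headroom. In practice the oom-killer would probably step in sooner, since other processes need memory too.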

I'm using it perhaps a bit differently than for example logstash would. Logstash adds new indexes every day, and primarily adds new records but does little subsequent modification, while I'm more a conventional app where the indexes are fixed and the records within them routinely change from time to time. It's also quite a small database, and quite diverse data, so there are between 30 and 40 dissimilar indexes per database, with 5 instances of the app running, so around 200 indexes actively in use, though not terribly heavily.

What can I do to pin this down further? (I have another Pi set up almost identically, which I could experiment with more freely than this production one, if that helps.)
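One possible starting point is to log the resident set size (RSS) of the ES process at regular intervals and fit a line through it, so that the growth rate is measured rather than eyeballed from top. The sketch below assumes a Linux `/proc` filesystem (as on Raspbian); the helper names are hypothetical and the PID has to be supplied by whatever runs it.

```python
import re
from pathlib import Path

# Matches the "VmRSS:     774144 kB" line in /proc/<pid>/status.
_VMRSS = re.compile(r"^VmRSS:\s+(\d+)\s+kB", re.MULTILINE)


def parse_vmrss_kb(status_text: str) -> int:
    """Extract the resident set size in kB from /proc/<pid>/status text."""
    match = _VMRSS.search(status_text)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def read_rss_kb(pid: int) -> int:
    """Read the current RSS of a running process (Linux only)."""
    return parse_vmrss_kb(Path(f"/proc/{pid}/status").read_text())


def slope_kb_per_day(samples):
    """Least-squares slope of (unix_seconds, rss_kb) samples, in kB per day."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    num = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples have the same timestamp")
    return num / den * 86400
```

Calling `read_rss_kb(pid)` from cron every hour or so and appending the result to a file would give enough samples for `slope_kb_per_day`. Comparing that with the heap usage reported by `GET _nodes/stats/jvm` may help narrow things down: if RSS keeps climbing while heap usage stays flat, the growth is probably outside the Java heap.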


(Christian Dahlqvist) #2

That sounds like a lot of indices and shards for such a small heap. Each shard comes with overhead, so having lots of small indices and shards can be very inefficient. For that type of environment I would recommend reducing the shard count to a minimum.
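To check how many shards the node is actually holding, the `_cat/shards` API can be queried with `format=json` and the rows counted. The following is a minimal sketch, assuming ES is reachable without authentication at `http://localhost:9200` (an assumption, adjust as needed); the function names are illustrative only.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_shard_rows(base_url: str = "http://localhost:9200"):
    """Fetch one row per shard copy from the _cat/shards API as parsed JSON."""
    with urlopen(f"{base_url}/_cat/shards?format=json") as resp:
        return json.load(resp)


def summarize_shards(rows):
    """Count indices, total shard copies, and shard copies per state."""
    per_index = Counter(row["index"] for row in rows)
    by_state = Counter(row["state"] for row in rows)
    return {
        "indices": len(per_index),
        "shards": len(rows),
        "by_state": dict(by_state),
    }
```

Running `summarize_shards(fetch_shard_rows())` on the node shows the total at a glance. Replica shards that cannot be assigned on a single node would show up under the `UNASSIGNED` state.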


(David Earl) #3

No, it’s only one shard, as I said.

I was rather forced into using lots of indexes by the removal of types in ES 6: it was not my choice to do it that way. Previously I had one index per database, with types distinguishing the records. I consolidated record types where they had a lot in common, but that still leaves many dissimilar ones.

But the main point is that this seems to be a relatively new problem: it’s been running completely satisfactorily for a long time, including under ES 6 with all these indexes.

And however large a server, if it routinely eats up memory at a steady rate it will fail eventually. If you’re running on a less constrained server, the problem may well just be less apparent if you have lots of overhead, and probably the time between reboots for other reasons is longer than the time to run out of memory.

(And as Logstash allocates an index per day, surely it isn’t very long before it uses far, far more indexes than I am, and at a steadily increasing rate).

And I know we have rather larger machines in general these days, but 1GB is still a rather substantial amount of memory. All of the data in my databases would fit in 1GB several times over.

