We are running Elastic with 4 nodes in production. Recently we added a
new index (~13G including the replica) and, interestingly, a rogue
query has been failing to parse. As a result, after a few hours we are
seeing the nodes run out of memory. The logs are full of parse
exceptions. While we fix the query, I am curious whether the excessive
parse failures and/or the logging could be leading to some sort of
memory leak. When bounced, the nodes start fine and stay fine for a
good number of hours, and then all of a sudden they run out of memory.
I ask because I saw this behavior with Tomcat back in the day, when
excessive logging ended up creating too many references to strings
which were never GC'd, causing it to go OOM.
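
In case it helps narrow this down, here is a minimal sketch of how heap usage per node could be tracked between restarts, to see whether it climbs steadily (leak-like) or jumps suddenly. It assumes the cluster answers the node stats API at `_nodes/stats/jvm` and reports `jvm.mem.heap_used_percent` per node. The base URL, the helper names, and the 85% threshold are hypothetical choices for illustration, not part of our current setup.

```python
import json
import urllib.request


def fetch_jvm_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats for all nodes (base_url is an assumed address)."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/jvm", timeout=10) as resp:
        return json.load(resp)


def heap_usage_by_node(stats):
    """Map node name -> heap_used_percent from a node stats response."""
    return {
        node.get("name", node_id): node["jvm"]["mem"]["heap_used_percent"]
        for node_id, node in stats.get("nodes", {}).items()
    }


def nodes_over_threshold(stats, threshold=85):
    """Return the sorted names of nodes whose heap usage exceeds threshold."""
    usage = heap_usage_by_node(stats)
    return sorted(name for name, pct in usage.items() if pct > threshold)


# Example use against a live cluster (not run here):
#   stats = fetch_jvm_stats()
#   print(heap_usage_by_node(stats))
#   print(nodes_over_threshold(stats))
```

Polling this every minute or so and logging the numbers with a timestamp should show whether the heap grows gradually along with the parse exceptions or spikes right before the OOM.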
Any input is appreciated!