Elasticsearch 1.7.2 sudden crash after 26 hours


(Jaminvp) #1

Greetings,

I noticed my Elasticsearch v1.7.2 crashed after running for over 24 hours. The exact error is:

"A fatal error has been detected by the Java Runtime Environment:

EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000000006afd18a7, pid=23416, tid=13004

JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 1.8.0_60-b27)
Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode windows-amd64 compressed oops)
Problematic frame:
V [jvm.dll+0x2118a7]

Failed to write core dump. Call to MiniDumpWriteDump() failed (Error 0x800705af: The paging file is too small for this operation to complete.)"

I wanted to add the hs_err_pid log file, but it seems only images can be added? I can PM the entire log if someone wants to see it. Adding it as plain text here would clutter this post too much.

I don't know what happened. Could anyone provide some insight? Swapping is disabled, so it shouldn't be the cause of this error.

Kind regards,

JaminVP
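
For readers who want to share only the key facts from an hs_err_pid file instead of the whole log, here is a minimal Python sketch that pulls the error type, addresses, and problematic frame out of the header quoted above. The function name `summarize_hs_err` and the regular expressions are illustrative assumptions based only on the lines shown in this post; other JVM versions may format the header differently.

```python
import re

# Matches the header line as quoted in the post, e.g.
# EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x..., pid=23416, tid=13004
HEADER_RE = re.compile(
    r"(?P<error>[A-Z_]+) \((?P<code>0x[0-9a-fA-F]+)\) at "
    r"pc=(?P<pc>0x[0-9a-fA-F]+), pid=(?P<pid>\d+), "
    r"tid=(?P<tid>0x[0-9a-fA-F]+|\d+)"
)

# Matches the problematic frame, e.g. "V [jvm.dll+0x2118a7]".
# In hs_err files a frame type of V denotes VM code.
FRAME_RE = re.compile(
    r"^#?[ \t]*(?P<kind>[A-Za-z])[ \t]+"
    r"\[(?P<module>[^+\]]+)\+(?P<offset>0x[0-9a-fA-F]+)\]",
    re.MULTILINE,
)


def summarize_hs_err(text):
    """Return a dict with whatever header fields could be found (hypothetical helper)."""
    summary = {}
    header = HEADER_RE.search(text)
    if header:
        summary.update(header.groupdict())
    frame = FRAME_RE.search(text)
    if frame:
        summary.update(frame.groupdict())
    return summary


if __name__ == "__main__":
    sample = (
        "EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x000000006afd18a7, "
        "pid=23416, tid=13004\n"
        "Problematic frame:\n"
        "V [jvm.dll+0x2118a7]\n"
    )
    print(summarize_hs_err(sample))
```

In this case the frame points into jvm.dll itself rather than into a native library or compiled Java code, which is the kind of detail worth including in a bug report.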


(Mark Walkom) #2

Please don't attach heap dumps :smile:

It's hard to say why. Is there more in the ES logs?


(Jaminvp) #3

Nothing in the ES logs; only the hs_err_pid log shows details, but they're not that clear to me. I guess I'll just restart it and see what happens in the next 48 hours. I suspect there was a memory error, seeing as the system has a lot of stuff to process around the time ES crashed.

It's strange though how it took over 26 hours to crash, and even stranger how it crashed on a holiday when the system has less to process.


(Jörg Prante) #4

Your JVM is buggy; you should report this to Oracle.


(Jaminvp) #5

Well, after 4 days of continuous hard labor, Elasticsearch crashed again with the JVM stopping out of nowhere. Elasticsearch wasn't using its full allocated memory; I checked this one hour before it crashed.

Some things I noticed though:

  • Before ES crashed, the java.exe process linked to Logstash 1.5.4 was using a lot of memory, about 1.6 GB. I restarted both Logstash and Elasticsearch and noticed that there seems to be a memory leak in the java.exe linked to Logstash. Its memory use rises steadily in 8-32 KB increments, never ceasing and never lowering. I suspect the Logstash memory leak is causing the ES JVM to crash.

  • After ES crashes, the services tab shows a new service installed: elasticsearch-service-x86. Notice how the service.bat file installs the elasticsearch-service-x64 service. I have no idea why the x86 version gets installed after x64 crashes. It's installed as a disabled service.

Other servers with Logstash 1.5.4 installed do not seem to have this memory leak. Is there some sort of leak caused by running both ES and Logstash on the same machine?
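
The "steadily rises, never lowers" observation can be checked against a series of memory readings rather than by watching Task Manager. Below is a minimal Python sketch of such a check. The function name `steady_growth`, the threshold, and the sample numbers are all assumptions made for illustration; how the readings are collected is left open. A positive result is only a hint, not proof of a leak.

```python
def steady_growth(samples_kb, min_total_kb=1024):
    """Heuristic: True if memory never drops between readings and the
    total growth is at least min_total_kb (hypothetical threshold)."""
    if len(samples_kb) < 2:
        return False
    # Difference between each pair of consecutive readings.
    deltas = [later - earlier for earlier, later in zip(samples_kb, samples_kb[1:])]
    never_drops = all(delta >= 0 for delta in deltas)
    total_growth = samples_kb[-1] - samples_kb[0]
    return never_drops and total_growth >= min_total_kb


if __name__ == "__main__":
    # Made-up readings in KB, not taken from the affected server.
    leaking = [1000, 1016, 1048, 1056, 2100]
    healthy = [1000, 1500, 900, 2100]
    print(steady_growth(leaking))
    print(steady_growth(healthy))
```

Running the same check on one of the servers that does not show the problem would give a baseline to compare against.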


(Mark Walkom) #6

There is a known issue with Ruby on Windows that causes a memory leak; I think this is the issue about it.


(Patrick Kik) #7

We're having about the same problem.

Some of us see the node collapsing moments after startup. There's no load on the node.

Elasticsearch 1.7.5, Java 1.8.0_65 and other versions of Java 1.8.0.


(Patrick Kik) #8

In our case it turns out that the combination of having Windows Updates KB3126593 and KB3126587 installed was causing the problem. We're on Windows 7 Enterprise.
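
To see whether a machine has both of these updates, one option is to save the list of installed hotfixes to text (for example from `wmic qfe get HotFixID` or PowerShell's `Get-HotFix`) and scan it. The Python sketch below does only the scanning part. The function names and the sample listing are assumptions for illustration, and the sample is made up rather than copied from a real system.

```python
import re

# The two updates named in the post above.
SUSPECT_KBS = {"KB3126593", "KB3126587"}


def installed_kbs(listing):
    """Extract every KB identifier found in a hotfix listing (hypothetical helper)."""
    return set(re.findall(r"\bKB\d+\b", listing.upper()))


def has_suspect_combination(listing):
    """True only if both updates from the post appear in the listing."""
    return SUSPECT_KBS <= installed_kbs(listing)


if __name__ == "__main__":
    # Made-up listing for illustration.
    sample_listing = "HotFixID\nKB3126593\nKB3126587\nKB9999999\n"
    print(has_suspect_combination(sample_listing))
```

The check requires both identifiers because the post attributes the crash to the combination, not to either update on its own.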

