Whilst updating records in the index this morning, our Elasticsearch service spat its dummy out and came to an abrupt halt.
In the logs we can see lots of GC activity, and if I'm reading the numbers correctly, the GC pauses seem to be getting progressively longer.
This post is a little long, so I'll put the GC log snippet in a separate post below.
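For anyone wanting to compare numbers, this is roughly how I've been watching heap and GC from the API side. It's just a sketch against the standard _nodes/stats endpoint; the jq filter is illustrative, not required:

# Poll JVM heap usage and old-gen GC totals (endpoint is standard; jq filter is illustrative)
curl -s 'localhost:9200/_nodes/stats/jvm?pretty' \
  | jq '.nodes[] | {heap_used_percent: .jvm.mem.heap_used_percent, old_gc: .jvm.gc.collectors.old}'

The old collector stats include collection_count and collection_time_in_millis, so a collection_time that climbs sharply between polls would back up the "GC is taking longer" reading.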
The failure:
[2018-11-26T07:50:36,814][WARN ][o.e.j.s.ServletHandler ] /availabilityv2%2f460/_bulk_docs
java.io.IOException: java.util.concurrent.TimeoutException: Idle timeout expired: 33128/30000 ms
at org.eclipse.jetty.server.HttpInput$ErrorState.noContent(HttpInput.java:831) ~[jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:189) ~[jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.apache.commons.io.IOUtils.read(IOUtils.java:2454) ~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.IOUtils.readFully(IOUtils.java:2537) ~[commons-io-2.4.jar:2.4]
at com.couchbase.capi.servlet.CAPIServlet.handleBulkDocs(CAPIServlet.java:520) ~[couchbase-capi-server-1.6.2.jar:?]
at com.couchbase.capi.servlet.CAPIServlet.service(CAPIServlet.java:89) ~[couchbase-capi-server-1.6.2.jar:?]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[javax.servlet-api-3.1.0.jar:3.1.0]
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:845) ~[jetty-servlet-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584) [jetty-servlet-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:513) [jetty-security-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [jetty-servlet-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.Server.handle(Server.java:523) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251) [jetty-server-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273) [jetty-io-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95) [jetty-io-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93) [jetty-io-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303) [jetty-util-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148) [jetty-util-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136) [jetty-util-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671) [jetty-util-9.3.13.v20161014.jar:9.3.13.v20161014]
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589) [jetty-util-9.3.13.v20161014.jar:9.3.13.v20161014]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
Caused by: java.util.concurrent.TimeoutException: Idle timeout expired: 33128/30000 ms
at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:166) ~[?:?]
at org.eclipse.jetty.io.IdleTimeout$1.run(IdleTimeout.java:50) ~[?:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_151]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_151]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_151]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_151]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_151]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_151]
... 1 more
Even though there's no reference to memory issues in the logs (edit: yes there is, see below), I've since increased the JVM heap size to 8GB (we temporarily scaled the AWS instances up to 16GB machines).
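For reference, the heap change amounts to this in config/jvm.options (assuming a standard 5.x/6.x install; the path may differ on your setup):

# config/jvm.options -- set min and max heap to the same value to avoid resize pauses
# 8g on a 16GB box keeps the heap at ~50% of RAM, leaving room for the OS page cache
-Xms8g
-Xmx8g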
The index isn't "massive" by ES standards:
pri  rep  docs.count  docs.deleted  store.size  pri.store.size
  1    2     2782021       1933869       2.8gb             1gb
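(Those numbers came from the _cat API, something along these lines; "my-index" is a placeholder for the real index name:)

curl -s 'localhost:9200/_cat/indices/my-index?v&h=pri,rep,docs.count,docs.deleted,store.size,pri.store.size'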
Is there anything further I should be checking to make sure this doesn't fall over again tomorrow when we update the index?
Thanks,
Phil