Does snapshot restore lead to a memory leak?


(José de Zárate) #1

We have a single-machine cluster with about 1k indices. It used to work
flawlessly (under high load, of course).

But since we started using the snapshot-restore feature heavily, it has been
exhausting its memory within 7 days of use. The cluster performs about
700 restores during the day. Are there any memory considerations when
using the restore feature?

--
uh, oh http://www.youtube.com/watch?v=GMD_T7ICL0o.

http://www.defectivebydesign.org/no-drm-in-html5

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKNaH0XTQtcSsXPBAb%2BbOh2Hcg-9QCBRd4hNjxN-N1UFLvENBw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


(Igor Motov) #2

Just to make sure I got it right: you really meant 700 restores (not just
700 snapshots), correct? What type of repository are you using? Could you
add a few more details about your use case?



(José de Zárate) #3

Hey, Igor, thanks for answering, and sorry for the delay! I didn't catch the
update.

Let me explain:

  • We have a one-machine cluster that is only meant for serving
    search requests. The goal is not to index anything on it. It contains
    about 1.7k indices.
  • Every day, those 1.7k indices are reindexed and snapshotted in pairs
    to an S3 repository (producing 850 snapshots).
  • Every day, the "read-only" cluster from the first point restores
    those 850 snapshots from that same S3 repository to "update" its 1.7k
    indices.

It works like a charm. Load has dropped dramatically, and we can set up a
"farm" of temporary machines to do the indexing duties.

But memory consumption never stops growing.

We don't get any "out of memory" error or anything. In fact, there is
nothing in the logs that shows any error, but after a few days to a week,
the host has its memory almost exhausted and elasticsearch is not
responding. The memory consumption is, of course, well beyond the HEAP_SIZE.
We have to restart it and, when we do, we get the following error:

java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.execute(DefaultChannelPipeline.java:636)
    at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
    at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:781)
    at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:725)
    at org.elasticsearch.common.netty.handler.codec.oneone.OneToOneEncoder.doEncode(OneToOneEncoder.java:71)
    at org.elasticsearch.common.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:59)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
    at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:704)
    at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:671)
    at org.elasticsearch.common.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
    at org.elasticsearch.http.netty.NettyHttpChannel.sendResponse(NettyHttpChannel.java:158)
    at org.elasticsearch.rest.action.search.RestSearchAction$1.onResponse(RestSearchAction.java:106)
    at org.elasticsearch.rest.action.search.RestSearchAction$1.onResponse(RestSearchAction.java:98)
    at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.innerFinishHim(TransportSearchQueryAndFetchAction.java:94)
    at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.moveToSecondPhase(TransportSearchQueryAndFetchAction.java:77)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.innerMoveToSecondPhase(TransportSearchTypeAction.java:425)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:243)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onResult(TransportSearchTypeAction.java:219)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onResult(TransportSearchTypeAction.java:216)
    at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:305)
    at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryAndFetchAction.java:71)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
    at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)

It looks like it shuts itself down for some reason...

The host on which elasticsearch runs has nothing else installed
besides a standard Ubuntu distribution. It's completely devoted to
elasticsearch.

Memory consumption grows by 10% in 36h.



(Igor Motov) #4

So, your "search-only" machines are running out of memory, while your
"index-only" machines are doing fine. Did I understand you correctly? Could
you send me the nodes stats (curl "localhost:9200/_nodes/stats?pretty") from
the machine that runs out of memory? Please run stats a few times at 1-hour
intervals; I would like to see how memory consumption increases
over time. Please also run nodes info once (curl "localhost:9200/_nodes")
and post the results here (or send them to me by email). Thanks!
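
The hourly sampling requested above could be scripted roughly as follows. This is only a sketch under assumptions: `summarize` and `poll` are hypothetical helper names, and the fields read from the response (`jvm.mem.heap_used_in_bytes` and `jvm.threads.count` under each entry of `nodes`) are assumed to be present in the nodes stats output of the version in use.

```python
import json
import time
import urllib.request


def summarize(stats):
    """Extract per-node heap usage and JVM thread count from a nodes-stats response."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        jvm = node.get("jvm", {})
        rows.append({
            "node": node.get("name", node_id),
            "heap_used_bytes": jvm.get("mem", {}).get("heap_used_in_bytes"),
            "threads": jvm.get("threads", {}).get("count"),
        })
    return rows


def poll(url="http://localhost:9200/_nodes/stats", samples=4, interval_s=3600):
    """Fetch nodes stats `samples` times, `interval_s` seconds apart, printing a summary."""
    for i in range(samples):
        with urllib.request.urlopen(url) as resp:
            stats = json.load(resp)
        for row in summarize(stats):
            print(time.strftime("%Y-%m-%d %H:%M:%S"), row)
        if i < samples - 1:
            time.sleep(interval_s)
```

Calling `poll()` with the defaults would take four samples one hour apart. Watching the thread count next to the heap figure helps tell a heap problem from growth elsewhere in the process.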



(José de Zárate) #5

Igor,
Yes, that's right. My "index-only" machines are booted just for the
indexing-snapshotting task. Once there are no more tasks
in the queue, those machines are terminated. They only handle a few indices
each time (their only purpose is to "snapshot").

I will do as you say. I guess I'd better wait for the timeframe in
which most of the restores occur, because that's when memory
consumption grows the most, so expect those postings in 5 or 6 hours.



(Jörg Prante) #6

This memory issue report might be related

https://groups.google.com/forum/#!topic/elasticsearch/EH76o1CIeQQ

Jörg



(José de Zárate) #7

Igor,
I'm posting a PDF document with some graphs that I think are quite
enlightening. The "jvm threads" graph is particularly interesting.
The times are UTC-4, and the period in which the JVM graphs grow is when
most of the restores took place.
Igor, I will send you the reports you asked for by email, since they
contain filesystem data. Hope you don't mind.

The graphs contain data from two elasticsearch clusters. ES1 is the one
we've been talking about in this thread. ES4 is a cluster devoted to two
indices, not very big but with high search demand.

Thanks!!!



(Igor Motov) #8

So, you are running out of threads, not memory. Are you re-registering the
repository every time you restore from it? If you are, you might be running
into this issue: https://github.com/elasticsearch/elasticsearch/issues/6181



(José de Zárate) #9

Precisely!!! I re-issue the repository PUT command every time I do a
restore. I know it's not the smartest thing in the world, but I wanted to
make sure the repos would always be available, without worrying about
whether the elasticsearch cluster was newly created or not.

I'll look into that.
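
One way to keep that guarantee without re-registering on every restore might be to check for the repository first and only PUT it when it is missing. The sketch below is an assumption, not something tested against this cluster: `repository_exists` and `ensure_repository` are hypothetical helper names, the repository name and settings in the usage note are placeholders, and it relies on `GET /_snapshot/<name>` answering with HTTP 404 for an unregistered repository.

```python
import json
import urllib.error
import urllib.request


def repository_exists(base_url, name, opener=urllib.request.urlopen):
    """Return True if GET /_snapshot/<name> succeeds, False on HTTP 404."""
    try:
        with opener(f"{base_url}/_snapshot/{name}") as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # any other HTTP error is unexpected, so surface it


def ensure_repository(base_url, name, body, opener=urllib.request.urlopen):
    """Register the repository only when it is missing.

    Returns True if a PUT was sent, False if the repository already existed.
    """
    if repository_exists(base_url, name, opener):
        return False
    req = urllib.request.Request(
        f"{base_url}/_snapshot/{name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with opener(req) as resp:
        resp.read()
    return True
```

Before each restore, something like `ensure_repository("http://localhost:9200", "my_s3_repo", {"type": "s3", "settings": {"bucket": "my-bucket"}})` would then register the repository on a freshly created cluster and do nothing on one that already has it.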


