If I step through my code after each
'client.IndexAsync(postQueue);' call, I see Java's memory growing
and never recovering, while my indexer releases the
postQueue from memory. Java jumps from 200 MB to 1.5 GB, at which point it
starts to spit out Java heap space errors since it can no longer
allocate any more memory. The indexer stays at around 20 MB throughout
the process.
I see that you call IndexAsync; is there a chance that you are simply
creating too many concurrent bulk indexing requests against the server? Since
you don't wait for one bulk indexing request to come back before sending the
next, there might eventually be hundreds of concurrent bulk indexing requests
in flight on the server, eventually causing it to max out on memory. If you
limit it to 10-15 concurrent indexing actions, does it still happen?
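The throttling suggested above can be sketched with a counting semaphore that caps how many bulk requests are in flight at once. The client in this thread is NEST (.NET), so the class and method names below are illustrative, not the real API; this is just the general pattern in Java:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottledIndexer {
    // Cap in-flight bulk requests; 10 is in the 10-15 range suggested above.
    private final Semaphore inFlight = new Semaphore(10);
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    public final AtomicInteger completed = new AtomicInteger();

    // Blocks the caller when 10 requests are already outstanding,
    // instead of letting hundreds pile up on the server.
    public Future<?> indexAsync(Runnable bulkRequest) throws InterruptedException {
        inFlight.acquire();
        return pool.submit(() -> {
            try {
                bulkRequest.run();            // send the bulk request
                completed.incrementAndGet();
            } finally {
                inFlight.release();           // free a slot once the response is back
            }
        });
    }

    public void shutdown() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

With this in place, a producer can keep calling indexAsync in a tight loop: the semaphore, not the producer's speed, bounds the load the server sees.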
On Sat, Nov 27, 2010 at 9:49 PM, Martijn Laarman mpdreamz@gmail.com wrote:
I'm creating a rather naive implementation of the bulk API (naive in
the sense that it's not zero-copy, although I plan to support that later
on).
I'm using the Hacker News database dump to insert data en masse into ES,
see:
That's what I thought at first too, but I am putting a breakpoint afterwards,
and even when I wait for the responses to come back in, I can see
Java's memory growing gradually.
Even if I use Fiddler to fire the requests manually, one by one, I see
Java jump 4 to 10 MB after each request but never release it.
The JVM will continue to use memory up to the maximum you set it to use (max heap size).
Once the JVM has claimed that memory, it will not free it back to the OS, but
internally, memory will be freed once the garbage collector runs.
When you run the requests one by one, do you still get OutOfMemory errors?
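The distinction drawn above, heap the JVM has claimed from the OS versus heap actually holding live objects, can be observed from inside the JVM. A small sketch; exact numbers depend on the collector and flags, and System.gc() is only a hint:

```java
public class HeapDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // totalMemory: heap currently claimed from the OS (what the OS sees grow)
        // freeMemory:  part of that claimed heap not holding live objects
        // maxMemory:   the ceiling (-Xmx) the JVM will grow toward
        long usedBefore = rt.totalMemory() - rt.freeMemory();
        byte[][] chunks = new byte[64][];
        for (int i = 0; i < 64; i++) chunks[i] = new byte[1 << 20]; // allocate ~64 MB
        long usedPeak = rt.totalMemory() - rt.freeMemory();
        chunks = null;   // drop references, like the indexer dropping postQueue
        System.gc();     // after collection the space is reusable internally
        long usedAfter = rt.totalMemory() - rt.freeMemory();
        System.out.printf("used: %d -> %d -> %d MB (claimed from OS: %d MB)%n",
            usedBefore >> 20, usedPeak >> 20, usedAfter >> 20, rt.totalMemory() >> 20);
        // The claimed (total) heap typically stays high even after used drops:
        // that is the "growing but never recovering" number seen from outside.
    }
}
```

So a process monitor showing Java at 1.5 GB does not by itself mean 1.5 GB of live objects; it means the JVM has grown its heap to that size at some point.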
On Sun, Nov 28, 2010 at 3:38 PM, Martijn Laarman mpdreamz@gmail.com wrote:
Ahh, I saw Java eating memory and assumed the problem was there. NEST now has
built-in semaphore support for async connections, which solved the issue.
Setting the maximum number of async connections too high still throws OutOfMemory
exceptions, but it's easy to tweak now.