CouchDB River and memory


(Harry Waye) #1

I'm trying to index a 1.5GB CouchDB database with the CouchDB river but I'm
getting memory issues. elasticsearch.conf is https://gist.github.com/1156885 .

The river and index were created with https://gist.github.com/1156889 . The
node on which the river is running has 1.4GB of memory allocated to the JVM.
I end up running into OOM errors after a couple of minutes:
https://gist.github.com/1156931 . I'm also getting mapper exceptions on
interpreting dates but I assume these are a separate issue. What should I
be expecting the memory usage to look like?

Thanks
Harry


(avasilenko) #2

> I'm also getting mapper exceptions on interpreting dates but I assume
> these are a separate issue.

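What ES version are you using? There was a hotfix in 0.17.6 -
https://github.com/elasticsearch/elasticsearch/issues?labels=v0.17.6&sort=created&direction=desc&state=closed&page=1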

Regards,
Alexandr Vasilenko


(Harry Waye-2) #3

This is in 0.17.6. Is the date issue likely to be related to the OOM
exception? Let me know if there is anything I can do, more information etc.

Regards
Harry


(Shay Banon) #4

How big are the docs you are trying to index? I wonder if it's not indexing
fast enough as it's pulling data from CouchDB (and we probably need a way to
throttle if that happens). How many nodes are you running?
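
To make the throttling idea concrete, here is a minimal sketch of backpressure with a bounded queue. It is an illustration only, not the river's actual code: `run_pipeline`, `index_bulk`, `max_pending` and `bulk_size` are hypothetical names. The reader thread blocks once `max_pending` changes are waiting, so memory stays capped even when indexing is slower than the incoming change feed.

```python
import queue
import threading


def run_pipeline(changes, index_bulk, max_pending=100, bulk_size=10):
    """Feed `changes` to `index_bulk` through a bounded queue.

    Hypothetical sketch: the producer blocks on `put` once `max_pending`
    items are waiting, which throttles reading to the indexing rate.
    Returns the largest queue depth observed by the producer.
    """
    pending = queue.Queue(maxsize=max_pending)
    done = object()  # sentinel marking the end of the change feed
    stats = {"high_water": 0}

    def producer():
        for change in changes:
            pending.put(change)  # blocks while the queue is full
            stats["high_water"] = max(stats["high_water"], pending.qsize())
        pending.put(done)

    reader = threading.Thread(target=producer)
    reader.start()

    # Consumer: drain the queue and index in batches of `bulk_size`.
    batch = []
    while True:
        item = pending.get()
        if item is done:
            break
        batch.append(item)
        if len(batch) >= bulk_size:
            index_bulk(batch)
            batch = []
    if batch:
        index_bulk(batch)  # flush the final partial batch

    reader.join()
    return stats["high_water"]
```

Here `changes` stands in for the documents read from CouchDB and `index_bulk` for whatever sends a batch to Elasticsearch; both are placeholders for this sketch.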


(Shay Banon) #5

Opened this issue:
https://github.com/elasticsearch/elasticsearch/issues/1269. It should provide
much better behavior of the CouchDB river (it's on the 0.17 branch as well).


(Harry Waye) #6

I suspect the index rate is probably the issue. I was originally running 3
EC2 large 32-bit nodes, as per the original post. I've subsequently just
written my own "river" which, I presume due to its poor efficiency, is working
fine. I'll have to get back to you on the document size...

H


(Harry Waye) #7

Ah, thanks for the commit, I'll try this out tomorrow.

H


(Harry Waye) #8

Looks to be a success, cheers!


