CouchDB River and memory

I'm trying to index a 1.5GB CouchDB database with the CouchDB river but I'm
getting memory issues. elasticsearch.conf is https://gist.github.com/1156885 .

The river and index were created with https://gist.github.com/1156889 . The
node on which the river is running has 1.4GB of memory allocated to the JVM.
I end up running into OOM errors after a couple of minutes:
https://gist.github.com/1156931 . I'm also getting mapper exceptions on
interpreting dates but I assume these are a separate issue. What should I
be expecting the memory usage to look like?

Thanks
Harry

I'm also getting mapper exceptions on interpreting dates but I assume
these are a separate issue.

What ES version are you using? There was a hotfix in 0.17.6 -

Regards,
Alexandr Vasilenko

This is in 0.17.6. Is the date issue likely to be related to the OOM
exception? Let me know if there is anything I can do or any more
information I can provide.

Regards
Harry

How big are the docs you are trying to index? I wonder if it's not indexing
fast enough as it's pulling data from CouchDB (and we probably need a way to
throttle if that happens). How many nodes are you running?

Opened this issue: "CouchDB River: Add throttling when indexing does not
keep up with fetching _changes" (Issue #1269, elastic/elasticsearch on
GitHub). It should provide much better behavior for the CouchDB river (it's
on the 0.17 branch as well).
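
For anyone wondering what throttling means here, below is a minimal Python sketch of the general idea, not the river's actual implementation. It assumes the fetcher and the indexer are decoupled by a queue; bounding that queue makes the fetcher block when indexing falls behind, so pending changes cannot pile up in memory. The names `run_throttled`, `changes`, `index_doc` and `max_pending` are hypothetical and exist only for this illustration.

```python
import queue
import threading


def run_throttled(changes, index_doc, max_pending=100):
    """Feed `changes` to `index_doc` through a bounded queue.

    Hypothetical helper for illustration only. The fetching thread blocks
    in put() once `max_pending` items are waiting, so the number of
    buffered changes is capped however fast they arrive.
    """
    pending = queue.Queue(maxsize=max_pending)
    done = object()  # sentinel marking the end of the stream

    def fetch():
        for change in changes:
            pending.put(change)  # blocks while the queue is full
        pending.put(done)

    fetcher = threading.Thread(target=fetch, daemon=True)
    fetcher.start()

    indexed = 0
    while True:
        change = pending.get()
        if change is done:
            break
        index_doc(change)  # the slow side sets the overall pace
        indexed += 1
    fetcher.join()
    return indexed
```

With an unbounded queue the same loop would buffer everything the fetcher can read, which would be consistent with the OOM described above if changes arrive faster than they can be indexed.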

I suspect the index rate is probably the issue. I was originally running 3
EC2 large 32-bit nodes, as per the original post. I've since written my own
"river" which, I presume due to its poor efficiency, is working fine. I'll
have to get back to you on the document size...

H

Ah, thanks for the commit, I'll try this out tomorrow.

H

Looks to be a success, cheers!
