100% CPU with multiple rivers/collections

Hi!

I'm trying to set up Elasticsearch on top of MongoDB using the MongoDB river
plugin. The test setup with one collection and one river went well, but when
I tried to complete the test by plugging in my other db collections (there
are about 15 in total), I ran into extreme CPU load (about 100% all the
time).

Here's my setup: there are about 15 collections, most of them empty (the
largest contains about 50 documents, the others 0 to 10), so there's very
little data. I use a MongoDB instance (configured as a replica set, of
course) and an Elasticsearch server on localhost. I create one river and one
index per collection. Here's what happens:

  1. If I send all 15 requests at once, the Elasticsearch service goes mad.
    CPU usage rises to 100%, everything almost freezes, I can't stop the
    service with the service elasticsearch stop command (I can only kill it),
    I can't run search queries (they time out), and when I kill the service
    and start it again, the same thing happens.
  2. If I create rivers one at a time, all goes well - Elasticsearch's
    allocated memory grows each time, but not by much (at most 8% of my
    8 GB) - except that sometimes, at about 10 rivers, the same thing
    happens: CPU goes high, and I cannot add any more rivers because the
    requests time out. Search queries don't work either.
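For reference, the second scenario (creating rivers one at a time) can be sketched roughly as below. This is only an illustration, not the script actually used: the `http://localhost:9200` address, the pause length, and the helper names are assumptions, while the `<collection>_river` / `<collection>_index` naming follows the log message quoted in this post. The request shape (`PUT /_river/<name>/_meta` with a `mongodb` section and an `index` section) is the one the MongoDB river plugin documents.

```python
import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: default local Elasticsearch HTTP port


def river_definition(db, collection):
    """Build the _meta document for one MongoDB river (one index per collection)."""
    return {
        "type": "mongodb",
        "mongodb": {"db": db, "collection": collection},
        "index": {"name": collection + "_index", "type": collection},
    }


def create_river(db, collection, timeout=30):
    """PUT the river definition to /_river/<name>/_meta and return the HTTP status."""
    url = "%s/_river/%s_river/_meta" % (ES_URL, collection)
    body = json.dumps(river_definition(db, collection)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def create_rivers_sequentially(db, collections, pause_seconds=5.0, create=create_river):
    """Create rivers one by one, pausing between requests (hypothetical pacing)."""
    statuses = {}
    for position, collection in enumerate(collections):
        if position > 0:
            time.sleep(pause_seconds)  # give the previous river time to start
        statuses[collection] = create(db, collection)
    return statuses
```

Calling `create_rivers_sequentially("kiss_db", collection_names)` would issue one request per collection; whether a pause actually avoids the CPU spike is untested here.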

There are a lot of log messages like this:

[2013-07-09 06:45:45,863][INFO ][river.mongodb ] [Judas Traveller] [mongodb][51d290145cfc217a89ae1bfd_river] starting mongodb stream. options: secondaryreadpreference [false], throttlesize [10], gridfs [false], filter [], db [kiss_db], collection [51d290145cfc217a89ae1bfd], script [null], indexing to [51d290145cfc217a89ae1bfd_index]/[51d290145cfc217a89ae1bfd]
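Since there are many of these messages, it may help to count how often each river logs "starting mongodb stream", to see whether some rivers keep restarting. A minimal sketch, assuming each log entry sits on a single line in the log file; the regular expression and function name are made up for this example and are based only on the message format shown above.

```python
import re
from collections import Counter

# Matches e.g. "[river.mongodb ] [Node Name] [mongodb][<river>] starting mongodb stream"
START_RE = re.compile(
    r"\[river\.mongodb\s*\].*?\[mongodb\]\[(?P<river>[^\]]+)\] starting mongodb stream"
)


def count_stream_starts(lines):
    """Return a Counter mapping river name to its number of stream-start messages."""
    counts = Counter()
    for line in lines:
        match = START_RE.search(line)
        if match:
            counts[match.group("river")] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `count_stream_starts` gives one count per river; a river that appears many times may be stuck in a restart loop.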

I tried various settings, including memory limits and heap size, but nothing
seems to work. I just wonder why the performance hit is so large: I only
have about 70 small documents.

Thank you,
Artem

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hey,

Have you tried attaching a profiler to see where the river is stuck? Try
VisualVM (free) or YourKit (commercial). Without more hints, it is hard to
help.
You could also call the hot_threads API of Elasticsearch (on the node where
the river is running); see
http://www.elasticsearch.org/guide/reference/api/admin-cluster-nodes-hot-threads/
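A rough sketch of calling the hot_threads API from Python follows. The `/_nodes/hot_threads` path and the `threads`, `interval` and `type` query parameters are taken from the Elasticsearch documentation; the `localhost:9200` address and the function names are assumptions, and older releases may expose the endpoint under a different path, so check the page linked above for your version.

```python
import urllib.parse
import urllib.request


def hot_threads_url(base_url="http://localhost:9200", threads=3,
                    interval="500ms", kind="cpu"):
    """Build the hot_threads URL; kind is one of 'cpu', 'wait' or 'block'."""
    query = urllib.parse.urlencode(
        {"threads": threads, "interval": interval, "type": kind}
    )
    return "%s/_nodes/hot_threads?%s" % (base_url.rstrip("/"), query)


def fetch_hot_threads(url, timeout=30):
    """Fetch the plain-text hot threads report from the node."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Requires a running node; prints the busiest threads and their stack traces.
    print(fetch_hot_threads(hot_threads_url()))
```

Running it a few times while the CPU is pegged should show whether the busy threads belong to the river.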

--Alex
