I'm trying to set up Elasticsearch on top of MongoDB using the MongoDB river
plugin. The test setup with one collection/one river went well, but when I
tried to complete the test by plugging in my other db collections (about 15
in total) I ran into extremely high CPU load (around 100% all the time).
Here's my setup: there are about 15 collections, most of them empty (the
largest contains about 50 documents, the others from 0 to 10), so there is
very little data. I use a single MongoDB instance (configured as a replica
set, of course) and a locally hosted Elasticsearch server. I create one
river and one index per collection. Here's what happens:
If I send all 15 river-creation requests at once, the Elasticsearch service
goes mad. CPU usage rises to 100%, everything almost freezes, I can't stop
the service via the service elasticsearch stop command (I can only kill
it), search queries time out, and when I kill the service and bring it back
up, the same thing happens again.
If I create the rivers one at a time, things go well - Elasticsearch's
allocated memory grows with each river, but not very high (at most 8% of my
8 GB) - except that sometimes, around river number 10, the same thing
happens: CPU usage spikes and I cannot add any more rivers because of
timeouts. Search queries time out too.
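The memory growth per river can be watched through the nodes stats API (the exact URL path varies slightly between Elasticsearch versions; this form works on the 1.x line):

```shell
# Report JVM heap usage for every node in the cluster,
# so memory growth can be tracked as each river is added.
curl -XGET 'http://localhost:9200/_nodes/stats/jvm?pretty'
```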
I have tried various settings, including memory limits and heap size, but
nothing seems to help. I just wonder where such a performance cliff comes
from - I only have about 70 small documents.
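For reference, each river is created with a _meta document like the one below (db, collection, and index names here are placeholders); this is the document shape used by the elasticsearch-river-mongodb plugin:

```shell
# Register one MongoDB river and its target index for a single collection.
# All names (mydb, mycollection, mycollection_index) are placeholders.
curl -XPUT 'http://localhost:9200/_river/mycollection_river/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "mydb",
    "collection": "mycollection"
  },
  "index": {
    "name": "mycollection_index",
    "type": "mycollection"
  }
}'
```

One such request is issued per collection, which is what triggers the problem when all 15 are sent at once.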
Have you tried attaching a profiler to see where the river is stuck? Try
VisualVM (free) or YourKit (commercial). Without more hints, it is hard to
help...
You could also call the hot_threads API of Elasticsearch (on the node
where the river is running), see
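For example (assuming the default HTTP port on localhost):

```shell
# Dump stack traces of the busiest threads on each node,
# which usually shows what the river code is spinning on.
curl -XGET 'http://localhost:9200/_nodes/hot_threads'
```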