I'm trying to set up Elasticsearch on top of MongoDB using the MongoDB river plugin.
During the test setup with one collection and one river everything went well, but
when I tried to complete the test by plugging in my other db collections (there are
about 15 in total) I ran into a problem of extreme CPU load (around 100% all the
time).
Here's my setup: there are about 15 collections, most of them empty (the largest
contains around 50 documents, the others 0 to 10), so there's very little data. I
use a MongoDB instance (configured as a replica set, of course) and a locally
hosted Elasticsearch server. I create a river and an index for each collection
(a sketch of the call I use is at the end of this post). Here's what happens:
If I send all 15 river-creation requests at once, the Elasticsearch service goes
mad. CPU usage rises to 100%, everything nearly freezes, I can't stop the
service via the service elasticsearch stop command (I can only kill it), I can't
run search queries (they time out), and when I kill the service and bring it up
again, the same thing happens.
If I create the rivers one at a time, it all goes well; Elasticsearch's
allocated memory grows with each river, but not very much (at most 8% of my
8 GB). Except that sometimes, at around the tenth river, the same thing
happens: CPU goes high, and I can't add any more rivers because of timeouts.
No search queries either.
I've tried various settings, including memory limits and heap size, but nothing
seems to work. I just wonder why the load jumps so dramatically when I only have
about 70 small documents.
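
For reference, here's roughly how I create each river and its index (a sketch; the
db, collection and index names are placeholders for my real ones):

    # one MongoDB river plus one target index per collection (names are examples)
    curl -XPUT 'http://localhost:9200/_river/my_collection_river/_meta' -d '{
      "type": "mongodb",
      "mongodb": {
        "db": "my_database",
        "collection": "my_collection"
      },
      "index": {
        "name": "my_collection_index",
        "type": "my_collection"
      }
    }'

I repeat this call, with different names, for each of the 15 collections.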
Have you tried attaching a profiler to see where the river is stuck? Try
visualvm (free) or YourKit (commercial). Without any more hints, it is hard to
help...
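
For example, something along these lines to find the process and get a first
thread dump (a sketch, assuming the JDK tools jps/jstack are on the path; the
output file name is just an example):

    # locate the Elasticsearch JVM and dump its threads to see what is busy
    jps -l | grep -i elasticsearch      # or: pgrep -f elasticsearch
    jstack <pid> > es-threads.txt       # look for RUNNABLE river/indexing threads
    # visualvm can then attach to the same PID for CPU sampling
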
You could also call the hot_threads API of Elasticsearch (on the node where the
river is running); see http://www.elasticsearch.org/guide/reference/api/admin-cluster-nodes-hot-threads/
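
For example (assuming the node listens on the default HTTP port 9200):

    # dump the hottest threads on the node to see what is burning CPU
    curl -XGET 'http://localhost:9200/_nodes/hot_threads'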