Strange system load increase

Hello,

Last year paramedic reported thousands of searches per second (whereas our
regular load is in the hundreds), and this eventually led to excessive
CPU load across the cluster (4 machines). Less than a month later the
same thing happened. We updated to ES 0.90.12 at the time, thinking it
could have something to do with a "forever looping query" bug that had been
fixed.

Since then (around 6 months) everything was fine. Yesterday the same thing
happened again (we're still on the same Elasticsearch version). One
interesting thing we noticed is that the increase in searches over time,
which we thought was due to wider adoption of the cluster in the company,
was actually a product of that weird behavior. We were at 1000 searches per
second. Yesterday it suddenly spiked to 2000 and required a cluster
restart. After the restart it dropped to 600 and has stayed there.

Is there some recommendation for restarting machines in the cluster from
time to time? Has anyone seen anything like this?

[]'s
Rafael

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKCnWVk2Ji0TNEx_Jdgzw1G5cMTiZZuk%2B1LzsSWVDbkzqiwJ1A%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

How are you measuring the searches/s metric? ES doesn't run searches within
itself, they have to be initiated externally somehow.
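
For context on where such a number usually comes from: as far as I know, monitoring dashboards derive searches/s by sampling a cumulative counter from the stats API (for example `search.query_total`) and dividing the difference by the sampling interval. That counter is, to my understanding, incremented per shard, so one client request against an index with several shards can be counted several times. A minimal sketch of the rate calculation, where the function name and the sample values are hypothetical:

```python
from typing import Optional


def searches_per_second(prev_total: int, curr_total: int,
                        interval_s: float) -> Optional[float]:
    """Derive a rate from two samples of a cumulative search counter.

    prev_total and curr_total are assumed to be two readings of a
    monotonically increasing counter taken interval_s seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if curr_total < prev_total:
        # The counter went backwards, which usually means the node was
        # restarted between samples; no meaningful rate for this interval.
        return None
    return (curr_total - prev_total) / interval_s


# Hypothetical readings taken 10 seconds apart.
rate = searches_per_second(1000, 7000, 10.0)
print(rate)
```

If the dashboard works this way, the figure it shows is a count of shard-level query executions, not of client requests, which is worth keeping in mind when comparing it with request logs.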

Also, you should really upgrade :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

At first I was using elasticsearch-paramedic (GitHub:
karmi/elasticsearch-paramedic); more recently I have been using Marvel.
Marvel was reporting 2000 searches/s while the cluster was acting
up. After the restart, it now reports 600 searches/s. Looking at the nginx
logs, I see no change in request rate before or after the restart. Maybe
something other than Elasticsearch is acting up, but I have no clue what
else it could be.
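
For anyone who wants to repeat the nginx comparison, here is a rough sketch of how the request rate can be counted from the access log. It assumes the default "combined" log format, where the timestamp sits in square brackets with one-second resolution, and it treats any line containing `/_search` as a search request. The function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Timestamp as written by the "combined" format, e.g.
# [10/Aug/2014:02:24:01 +0000]
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def search_requests_per_second(lines):
    """Count search requests per one-second timestamp in access log lines."""
    counts = Counter()
    for line in lines:
        if "/_search" not in line:
            continue  # not a search request
        match = TIMESTAMP.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines, not taken from a real server.
sample = [
    '10.0.0.1 - - [10/Aug/2014:02:24:01 +0000] "POST /myindex/_search HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Aug/2014:02:24:01 +0000] "GET /myindex/_search?q=a HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Aug/2014:02:24:02 +0000] "PUT /myindex/doc/1 HTTP/1.1" 201 64',
    '10.0.0.2 - - [10/Aug/2014:02:24:02 +0000] "POST /myindex/_search HTTP/1.1" 200 512',
]
per_second = search_requests_per_second(sample)
for second, count in sorted(per_second.items()):
    print(second, count)
```

Comparing these per-second counts before and after the restart gives the client-side rate to set against what the dashboard reports.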

I do need to upgrade, but the breaking changes are making it hard for me to
keep moving :(
