2AM to 3AM connectivity issues


(Bryan Murphy) #1

We have a simple Elasticsearch cluster (two nodes + Logstash). Every
night at around 2am, if we have 4-5 or more indexes in the database (each
index takes up roughly 1 GB), Elasticsearch stops responding to queries and
our pagers start going off.

An hour or two later everything clears up and the monitoring system is
happy again.

I'd like an uninterrupted night's sleep. Does anybody have an idea what
could be going on during this period? As far as I can tell, the other parts
of this service (Logstash + Redis queues) aren't misbehaving.

It very consistently happens between 2am and 3am, when nobody is lucid or
available.

Thanks,
Bryan

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Mark Walkom) #2

What do the ES logs from that time tell you?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com


(vineeth mohan-2) #3

Quite an interesting problem.

Some years back, I had a similar problem. Every morning at 8 AM, things
didn't work as they usually did. People even thought that the machines were
haunted :stuck_out_tongue:.
What happened was that a script in cron kicked off at that time and did
the backlog indexing for the entire day in one go.
So the ghost was actually a cron job working through its whole backlog at 8 AM.
See if your issue is similar.

Thanks
Vineeth
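
One way to check for the kind of scheduled job described above is to scan a crontab for entries whose hour field covers the problem hour. The sketch below is only an illustration: the function names and the sample crontab are made up, it handles only the common five-field syntax (`*`, numbers, ranges, lists, and `/step`), and it skips `@daily`-style shortcuts and environment lines.

```python
from typing import List


def hour_field_matches(field: str, hour: int) -> bool:
    """Return True if a crontab hour field such as '*', '2', '1-3', '*/2' or '0,2,4' covers `hour`."""
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step_n = int(step) if step else 1
        if rng == "*":
            lo, hi = 0, 23
        elif "-" in rng:
            a, b = rng.split("-", 1)
            lo, hi = int(a), int(b)
        else:
            lo = hi = int(rng)
        if lo <= hour <= hi and (hour - lo) % step_n == 0:
            return True
    return False


def jobs_running_at(crontab_text: str, hour: int) -> List[str]:
    """Return the crontab lines whose hour field includes `hour` (0-23)."""
    hits = []
    for line in crontab_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and @reboot/@daily style shortcuts.
        if not line or line.startswith("#") or line.startswith("@"):
            continue
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # environment assignment or malformed line
        try:
            if hour_field_matches(fields[1], hour):
                hits.append(line)
        except ValueError:
            continue  # hour field uses syntax this sketch does not handle
    return hits


if __name__ == "__main__":
    # Hypothetical crontab contents, for illustration only.
    sample = "0 2 * * * /usr/local/bin/backlog.sh\n30 8 * * 1-5 /bin/report\n"
    for job in jobs_running_at(sample, 2):
        print(job)
```

Feeding it the output of `crontab -l` for each user, plus the files under `/etc/cron.d`, would give a quick list of candidates to rule out.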


(Igor Motov) #4

Bryan, you didn't mention which timezone you are in. Is it 2am CET by any
chance? Are there any messages in the log files around 2am? If you don't
see anything in the log files, try enabling the slow index and slow query
logs, and if it wakes you up again, try running hot_threads
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html)
a few times around 2:30 am and post the results here.
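
Since nobody wants to be awake at 2:30 am to do this by hand, the repeated hot threads capture could be scripted and run from a scheduler. The following is a rough sketch, not a tested tool: it assumes a node reachable on `localhost:9200` and the `_nodes/hot_threads` endpoint described on the linked page, and the function names, sample count, and pause are arbitrary choices.

```python
import time
import urllib.request
from datetime import datetime


def hot_threads_url(host="localhost", port=9200):
    """Build the hot threads URL (host and port are assumptions; adjust for your cluster)."""
    return f"http://{host}:{port}/_nodes/hot_threads"


def _http_get(url):
    """Fetch a URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def capture_hot_threads(samples=3, pause_seconds=60, host="localhost", port=9200, fetch=None):
    """Collect `samples` hot threads dumps, pausing between them.

    Returns a list of (timestamp, dump_text) tuples. `fetch` can be replaced
    with a stub for testing without a live cluster.
    """
    fetch = fetch or _http_get
    url = hot_threads_url(host, port)
    results = []
    for i in range(samples):
        stamp = datetime.now().isoformat(timespec="seconds")
        results.append((stamp, fetch(url)))
        if i < samples - 1:
            time.sleep(pause_seconds)
    return results


if __name__ == "__main__":
    # Print each dump with a timestamp header so it can be pasted into a reply.
    for stamp, dump in capture_hot_threads():
        print(f"===== {stamp} =====")
        print(dump)
```

Redirecting the output to a file from a job scheduled shortly after 2am would leave the dumps ready to post in the morning.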


(Bryan Murphy) #5

Central Time Zone. I'll look into those options. As far as I can tell,
nothing else is going on on that machine during that time, but I'll
definitely take a closer look. It's been a tough one because of the time
it happens (and our desire to get back to sleep).

Thanks!
Bryan

