We just installed an ELK server and configured logstash to match the data that we send to it. Until last month it seemed to be working fine, but since then we have been seeing very strange behavior in Kibana: the events-over-time histogram shows the event rate at its normal level for about half an hour, then it drops to about 20% of the normal rate, keeps dropping slowly for about two hours and then stops completely; after a minute or two it returns to normal for the next half hour or so, and the same behavior repeats. Needless to say, both /var/log/logstash and /var/log/elasticsearch show nothing since the service started, and with tcpdump we can verify that events keep coming in at the same rate the whole time. I attached our logstash configuration, the /var/logstash/logstash.log, the /var/log/elasticsearch/clustername.log and a screenshot of our Kibana with no filter applied, so that you can see the weird behavior.
Is there someone/somewhere that we can turn to for help on the subject?
Absolutely (since in the past I also worked at the helpdesk dept., I certainly understand why it is important to ask those "Are you sure it's plugged in?" questions...). One of the logs is coming from SecurityOnion, which logs (via bro-conn) all the connections, so it must be sending data 24x7x365.
Thanks for the quick reply,
Yuval.
On Tuesday, February 10, 2015, Itamar Syn-Hershko <itamar@code972.com> wrote:
Are you sure your logs are generated linearly without bursts?
The graphic you sent suggests the issue is with logstash, since the @timestamp field is populated by logstash and is the one used to draw the date histogram in Kibana. I would start there, i.e. check whether SecurityOnion buffers its writes, and then check the logstash shipper process stats.
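One way to look at that outside of Kibana is to ask Elasticsearch for the per-minute counts directly. A rough sketch (host, index pattern and interval are placeholders for whatever you actually use):

import json
from urllib.request import Request, urlopen

# Rough sketch: run a per-minute date_histogram on @timestamp and print the
# bucket counts, so the drop can be seen without Kibana in the middle.
ES = "http://localhost:9200"          # placeholder
INDEX = "logstash-*"                  # placeholder

query = {
    "size": 0,
    "aggs": {
        "per_minute": {
            "date_histogram": {"field": "@timestamp", "interval": "minute"}
        }
    },
}
req = Request(ES + "/" + INDEX + "/_search",
              data=json.dumps(query).encode("utf-8"),
              headers={"Content-Type": "application/json"})
resp = json.loads(urlopen(req).read().decode("utf-8"))
for bucket in resp["aggregations"]["per_minute"]["buckets"]:
    print(bucket["key_as_string"], bucket["doc_count"])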
I'd start by running logstash with a tcp input and a file output and see how that file behaves. Same for the file inputs - see how their source files behave. And take it from there.
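For example, something along these lines could follow the file the logstash file output writes and print how many events land in it per minute (a rough sketch; the path is a placeholder):

import time

PATH = "/tmp/logstash-debug.log"      # placeholder for the file output path

with open(PATH) as f:
    f.seek(0, 2)                      # start at the current end of the file
    count, window_start = 0, time.time()
    while True:
        line = f.readline()
        if line:
            count += 1
            continue
        if time.time() - window_start >= 60:
            print(time.strftime("%H:%M"), count, "events/min")
            count, window_start = 0, time.time()
        time.sleep(0.2)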
On Tue, Feb 10, 2015 at 7:47 PM, Yuval Khalifa <iyuvalk@gmail.com> wrote:
Great! How can I check that?
When you say "see how the file behaves" I'm not quite sure what you mean by that... As I mentioned earlier, it's not that events stop appearing altogether; it's the RATE at which they arrive that decreases, so how would I measure the event rate in a file? I thought of another way to test this: I'll write a quick-and-dirty program that sends an event to the ELK via TCP every 12 ms, which should give an event rate of about 5,000 events per minute, and I'll let you know whether that rate keeps dropping or not...
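Something like this is what I have in mind (a rough sketch; the host, port and message format are placeholders for whatever the logstash tcp input expects):

import socket
import time

HOST, PORT = "elk.example.local", 5000   # placeholder for the logstash tcp input

# One long-lived connection, one line every 12 ms (~5,000 events/minute).
sock = socket.create_connection((HOST, PORT))
seq = 0
try:
    while True:
        seq += 1
        msg = "rate-test seq=%d ts=%.3f\n" % (seq, time.time())
        sock.sendall(msg.encode("ascii"))
        time.sleep(0.012)
finally:
    sock.close()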
Thanks,
Yuval.
I wrote that program and ran it, and it did manage to keep a steady rate of about 1,000 events per minute even while Kibana's total events per minute dropped from 60,000 to 6,000. However, when Kibana's total events per minute dropped to zero, my program got a "connection refused" exception. I ran netstat -i and found that every time the Kibana line hit zero, the RX-DRP counter increased. At that point I realized I had forgotten to mention that this server has a 10GbE nic. Is it possible that packets are being dropped because some buffer is filling up? If so, how can I test and verify that this is actually the case? And if it is, how can I solve it?
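In case it matters, this is roughly how I've been watching the drop counter (a sketch; the interface name is a placeholder for our 10GbE nic):

import time

IFACE = "eth0"    # placeholder for the 10GbE interface

def rx_dropped(iface):
    # /proc/net/dev columns after "iface:" are: bytes packets errs drop ...
    with open("/proc/net/dev") as f:
        for line in f:
            if line.strip().startswith(iface + ":"):
                return int(line.split(":", 1)[1].split()[3])
    raise ValueError("interface not found: " + iface)

prev = rx_dropped(IFACE)
while True:
    time.sleep(5)
    cur = rx_dropped(IFACE)
    if cur != prev:
        print(time.strftime("%H:%M:%S"), "RX-DRP +%d" % (cur - prev))
    prev = cur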
Thanks,
Yuval.
Well, SSD would also fix all the pains for my bank too... (-;
Are you sure it's caused by disk latency and not by some mis-tuned TCP driver? I've read some blog posts that recommended increasing some of the buffers in sysctl.conf. Do you think so too?
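For what it's worth, this is roughly how I check what the box is currently set to before touching anything (a sketch; the list of keys is just the ones those blog posts keep mentioning, not a recommendation):

# Print the current kernel buffer settings straight from /proc/sys.
KEYS = [
    "net.core.rmem_max",
    "net.core.rmem_default",
    "net.core.netdev_max_backlog",
    "net.ipv4.tcp_rmem",
]

for key in KEYS:
    path = "/proc/sys/" + key.replace(".", "/")
    try:
        with open(path) as f:
            print(key, "=", f.read().strip())
    except OSError as e:
        print(key, ": could not read:", e)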
On Thursday, February 12, 2015, Itamar Syn-Hershko <itamar@code972.com> wrote:
Yes - make sure the disk is local and not a shared, higher-latency one (e.g. a SAN). Also, SSD will probably fix all your pains.
I just wanted to let you all know that I think I solved it... I found out that one of the programs we built that sends logs to the ELK created a new TCP connection for each event, which exhausted the TCP buffers on the server (just like a DoS attack). When I modified that program to reuse the same connection, things started to return to normal.
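In case it helps anyone else, the change boiled down to something like this (a simplified sketch, not the actual program; host and port are placeholders):

import socket

HOST, PORT = "elk.example.local", 5000   # placeholder for the logstash tcp input

# Before: a new TCP connection for every single event (looks like a DoS
# against the input and eventually exhausts the server's TCP buffers).
def send_event_old(line):
    sock = socket.create_connection((HOST, PORT))
    sock.sendall((line + "\n").encode("ascii"))
    sock.close()

# After: one long-lived connection reused for all events.
class EventSender(object):
    def __init__(self):
        self.sock = socket.create_connection((HOST, PORT))

    def send(self, line):
        self.sock.sendall((line + "\n").encode("ascii"))

    def close(self):
        self.sock.close()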
Thanks for all your help,
Yuval.