I'm wondering if anyone has used Elasticsearch to manage (provide
search for) large log file collections?
I'm looking for solutions to help with a semi-centralized log
management project. The logs would be sent in syslog-style format
from hundreds of servers/routers/firewalls and maintained on a few
dedicated log servers. I looked into Splunk, a popular log management
solution that can scale horizontally (add more servers for more
storage and performance); I assume it uses some sort of NoSQL
technology. Unfortunately, it's too expensive, and after searching for
an open-source equivalent and not finding one, I'm looking into
possibly building a home-grown solution.
I came across the following blog post series on log management with
Hadoop: http://blog.mgm-tp.com/series/scalable-log-data-management-with-hadoop/.
In the comments on the third post, Elasticsearch was mentioned as a
possibility, so I'm wondering whether anyone out there has applied
Elasticsearch to log management?
I think the solution for the use case you're describing is a
combination of Flume and Elasticsearch. Flume provides great
infrastructure for aggregating logs (it has a syslog receiver), and
all the data can be indexed in Elasticsearch for later querying. All
that's needed is an Elasticsearch sink (the component that interfaces
with ES, in Flume terminology), which should be quite straightforward.
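To illustrate, here's a rough sketch (in Python, just for clarity) of the per-event transformation such a sink would do: parse a BSD-syslog-style line into a JSON document ready for indexing. The field names and the regex below are my own assumptions, not anything Flume or ES defines:

```python
import json
import re

# Rough BSD-syslog (RFC 3164) line pattern -- a sketch, not a full parser.
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d+)>"                                # priority, e.g. <34>
    r"(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "  # e.g. Aug 15 09:30:00
    r"(?P<host>\S+) "
    r"(?P<message>.*)$"
)

def syslog_to_doc(line):
    """Turn one syslog line into a dict suitable for indexing in ES."""
    m = SYSLOG_RE.match(line)
    if m is None:
        return {"message": line}          # keep unparseable lines whole
    pri = int(m.group("pri"))
    return {
        "facility": pri // 8,             # RFC 3164: pri = facility*8 + severity
        "severity": pri % 8,
        "timestamp": m.group("timestamp"),
        "host": m.group("host"),
        "message": m.group("message"),
    }

doc = syslog_to_doc("<34>Aug 15 09:30:00 fw01 sshd[123]: Failed password for root")
print(json.dumps(doc))
```

The sink would then just POST each document (or a batch of them) to ES.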
Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype
This post was a few weeks ago, but I've started to put something
together for this on GitHub, if anyone would like to collaborate. A
very basic structure is there now, and it works. I still need to
consider how it will cope with a firehose stream of logs, and whether
batch indexing can improve performance. By the looks of it, though,
it's very simple to do.
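On the batch indexing point, Elasticsearch's _bulk endpoint takes a newline-delimited body of action/source line pairs, so batching mostly means building that payload instead of one request per log line. A rough sketch (index and field names here are just placeholders):

```python
import json

def bulk_payload(index, doc_type, docs):
    """Build the newline-delimited body for Elasticsearch's _bulk API:
    one action line followed by one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"        # _bulk body must end with a newline

body = bulk_payload("logs", "syslog", [
    {"host": "fw01", "message": "link down"},
    {"host": "rtr02", "message": "BGP neighbor up"},
])
# POST this body to http://localhost:9200/_bulk
```

Accumulate events into a buffer and flush every N documents (or every few seconds), and one HTTP request indexes the whole batch.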