Log management with Elasticsearch?


(Mike Pilkington) #1

Hi,

I'm wondering if anyone has used Elasticsearch to manage (provide
search for) large log file collections?

I'm looking for solutions to help with a semi-centralized log
management project. The logs would be sent in syslog-style format
from hundreds of servers/routers/firewalls and maintained on a few
dedicated log servers. I looked into Splunk, a popular log management
solution that can scale horizontally (add more servers for more
storage and performance); I assume it uses some sort of NoSQL
technology. Unfortunately, their solution is too expensive, and after
searching for an open-source equivalent without finding one, I'm
looking into possibly building a home-grown solution.
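To make "syslog-style" concrete, here is a rough sketch of parsing one such line into a JSON document that a search engine could index. The line layout and field names here are illustrative assumptions, not a fixed standard:

```python
import re
import json

# Assumed BSD-syslog-like layout: "MMM DD HH:MM:SS host process[pid]: message"
SYSLOG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]+)\s"
    r"(?P<host>\S+)\s"
    r"(?P<process>[^:\[]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)

def syslog_to_doc(line):
    """Parse one syslog-style line into a dict (one search document)."""
    match = SYSLOG_RE.match(line)
    if match is None:
        # Keep unparseable lines searchable rather than dropping them.
        return {"message": line}
    return {k: v for k, v in match.groupdict().items() if v is not None}

line = "Aug 15 09:30:01 fw01 sshd[2453]: Accepted password for admin"
doc = syslog_to_doc(line)
print(json.dumps(doc))
```

Once each line is a structured document like this, any field (host, process, timestamp) becomes individually searchable.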

I came across the following blog post series on log management with
Hadoop: http://blog.mgm-tp.com/series/scalable-log-data-management-with-hadoop/.
In the comments of the 3rd blog post, Elasticsearch was mentioned as
a possibility and so I'm wondering if anyone out there has applied
Elasticsearch to log management?

Thanks,
Mike


(Berkay Mollamustafaoglu-2) #2

Hi Mike,

I think the solution for the use case you're describing is a combination
of Flume and Elasticsearch. Flume provides great infrastructure for
aggregating logs (it has a syslog receiver), and all the data can be
indexed in Elasticsearch for querying later. All that is needed is an
Elasticsearch sink (in Flume terminology, the component that would
interface with ES), which should be quite straightforward.
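To illustrate what such a sink amounts to: it receives log events and turns each one into an index request against Elasticsearch's REST API. The sketch below captures that shape in Python with a pluggable transport so it runs without a live cluster; the index/type names and the transport signature are assumptions for illustration, and a real Flume sink would be written in Java against Flume's sink interface.

```python
import json

class ElasticsearchSink:
    """Minimal sketch of a Flume-style sink that indexes events into ES."""

    def __init__(self, transport, index="logs", doc_type="syslog"):
        self.transport = transport  # callable(method, path, body)
        self.index = index
        self.doc_type = doc_type

    def append(self, event):
        """Index one log event (a dict) as a document in Elasticsearch."""
        path = "/%s/%s" % (self.index, self.doc_type)
        self.transport("POST", path, json.dumps(event))

# Stand-in transport that just records requests instead of sending them.
sent = []
sink = ElasticsearchSink(lambda method, path, body: sent.append((method, path, body)))
sink.append({"host": "fw01", "message": "link down"})
```

In a real deployment the transport would be an HTTP client posting to an ES node; everything else about the sink stays this simple.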

Flume is an open-source project recently released by the Cloudera folks;
the user guide can be found here:
archive.cloudera.com/cdh/3/flume/UserGuide.html

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype



(Paul Smith) #3

This post was a few weeks ago, but I've started to put something
together for this on GitHub, if anyone would like to collaborate. A
very basic structure is there now and works. I still need to consider
how it will cope with a firehose stream of logs, and whether batch
indexing could improve performance. By the looks of it, though, it is
very simple to do.

http://github.com/tallpsmith/elasticflume
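The batch-indexing idea mentioned above can be sketched as follows: rather than one request per log event, buffer events and flush them in a single call to Elasticsearch's `_bulk` API, whose body is newline-delimited JSON alternating action lines and documents. The index/type names and event fields here are illustrative assumptions:

```python
import json

def make_bulk_body(events, index="logs", doc_type="syslog"):
    """Build the newline-delimited body for one _bulk request."""
    lines = []
    for event in events:
        # Each document is preceded by an action line telling ES where to index it.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"  # _bulk bodies must end with a newline

events = [{"host": "fw01", "message": "link down"},
          {"host": "fw02", "message": "link up"}]
body = make_bulk_body(events)
```

Flushing, say, every few hundred events (or every second) amortizes the per-request overhead, which matters a lot under a firehose of syslog traffic.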

cheers,

Paul



(Shay Banon) #4

Looks great! I will add it to the projects page. I think this:
http://github.com/tallpsmith/elasticflume/blob/master/src/main/java/org/elasticsearch/flume/ElasticSearchSink.java
speaks volumes about the simplicity of both elasticsearch and flume!



(Ted Karmel) #5

+1 for Shay's comment...

Paul, I was considering this very option and just saw your email and
Git repo on the subject.


