I'm working on a free(mium) product that does exactly this (you can
pick the configuration you want) and more, so I'll give you my opinion on
this.
All the solutions mentioned above have pros and cons. You should consider
whether you want a couple of really heavy data processors/collectors, or
lots of lightweight ones (on all the clients). Most of the time this choice
is really simple: if you have hundreds of servers, it might be much
easier to do a little work on lots of machines than to do all that
work in a separate cluster.
The amount of work (e.g. processing, alerting, pattern detection, indexing,
searching) also shifts the point at which you'd consider a dedicated cluster
over the "all-for-one, one-for-all" strategy.
Best regards,
Robin Verlangen
Software engineer
W http://www.robinverlangen.nl
E robin@us2.nl
http://goo.gl/Lt7BC
Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.
On Thu, Nov 29, 2012 at 1:22 PM, Radu Gheorghe
radu.gheorghe@sematext.com wrote:
Hello Simon,
On Thu, Nov 29, 2012 at 12:52 PM, Simon Monecke simonmonecke@gmail.com wrote:
Hi Radu,
thanks for your detailed answer.
Or is there the problem of restarting flume to apply a new config?
Yes, the problem is restarting the Flume agents. During such a restart I
would lose messages (log4j can't connect and discards them).
log4j->flume-Agent-> | elasticsearch-data-node
log4j->flume-Agent-> |->flume-collector->elasticsearch-non-data-node
log4j->flume-Agent-> |->flume-collector->elasticsearch-non-data-node
log4j->flume-Agent-> | elasticsearch-data-node
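For reference, the log4j side of each arrow above can be wired up with Flume's Log4jAppender; a minimal sketch, assuming the local agent exposes an Avro source (hostname and port are placeholders):

```properties
# log4j.properties fragment: send application logs to the local Flume agent.
log4j.rootLogger=INFO, flume
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
# Placeholder values for the agent's Avro source:
log4j.appender.flume.Hostname=localhost
log4j.appender.flume.Port=41414
# Without UnsafeMode, the appender throws when the agent is unreachable;
# either way, events are lost while the agent restarts -- the message-loss
# window described above.
log4j.appender.flume.UnsafeMode=true
```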
Ok, this could work. The only problem is choosing the right number of
non-data nodes so they don't become a bottleneck. But I think trial and
error is the only way to find the best settings for my system.
Right. Trying it out is always the best way to make sure.
But since the non-data nodes would basically only serve as "routers", I
don't think it's likely that they would be the bottleneck, unless those
machines are smaller than the data nodes and you have lots of shards and
lots of data nodes.
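For what it's worth, a non-data "router" node is just a regular node with data storage disabled in elasticsearch.yml; a minimal sketch for the pre-1.x-era configuration style discussed in this thread (the cluster name is a placeholder):

```yaml
# elasticsearch.yml for a "router" (non-data) node: it joins the cluster
# and routes/coordinates requests, but holds no shards itself.
cluster.name: logs-cluster   # placeholder; must match the data nodes
node.data: false
# Optionally also keep it out of master elections:
node.master: false
```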
Above I've suggested two collectors and two non-data nodes for high
availability, so when one of them goes down, there's still a path for logs
to go through. But if you're OK with buffering on the client side when
there's an outage, you can also simplify the design to something
like:
log4j->flume-Agent-> | elasticsearch-data-node
log4j->flume-Agent-> |->elasticsearch-non-data-node
log4j->flume-Agent-> | elasticsearch-data-node
log4j->flume-Agent-> | elasticsearch-data-node
Another possible solution, which also implies restarting the agents, might be
to use multiple destinations, possibly all data nodes, and use a Load
Balancing Sink Processor (see the "Load balancing Sink Processor" section of
the Apache Flume User Guide).
So the design becomes:
log4j->flume-Agent-> |->elasticsearch-data-node
log4j->flume-Agent-> |->elasticsearch-data-node
log4j->flume-Agent-> |->elasticsearch-data-node
log4j->flume-Agent-> |->elasticsearch-data-node
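The load-balancing variant above can be sketched in the agent's configuration; agent, sink, and group names below are placeholders, and the actual sink type depends on which Elasticsearch sink you use:

```properties
# Flume agent fragment: one sink per Elasticsearch data node, grouped
# under a load-balancing sink processor.
agent.sinks = es1 es2
agent.sinkgroups = esgroup
agent.sinkgroups.esgroup.sinks = es1 es2
agent.sinkgroups.esgroup.processor.type = load_balance
agent.sinkgroups.esgroup.processor.selector = round_robin
# Temporarily blacklist failed sinks so events fail over to healthy ones.
agent.sinkgroups.esgroup.processor.backoff = true
```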
Best regards,
Radu
http://sematext.com/ -- Elasticsearch -- Solr -- Lucene