I am currently evaluating the benefits of replacing NXlog with winlogbeat as my primary service for remotely shipping logs from various Windows servers to a Linux logstash instance.
Could someone help me understand why people view winlogbeat and the elastic beats product overall as a superior form of log shipping to something like NXlog?
As a beats developer I'm always excited when people consider using beats. But first of all I think: if it ain't broke, don't fix it.
What exactly do you use nxlog for? What are your requirements? winlogbeat is used to ship Windows event logs. filebeat will ship log files. Both beats are designed to avoid dropping any event. The beats->logstash output requires logstash to acknowledge all events being sent. Events that are not acknowledged will be resent. Plus both beats keep track of published events. If beats are restarted, they will continue where they left off.
nxlog -> logstash normally uses TCP, and you try to push as fast as possible. The thing about plain TCP is that published events are not ACKed by logstash. Without an ACK you cannot really tell how far logstash has gotten in processing your events. On the other hand, nxlog has some event processing features still missing in beats (event processing support has just been added for 5.0).
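To make the ACK and restart behaviour a bit more concrete, here is a minimal Python sketch of the idea. It is not how beats are implemented: `FlakyReceiver`, `ship`, `demo`, and the `registry.json` file are hypothetical names made up for illustration. The only point is that the offset is persisted after an ACK, so an unacknowledged batch gets sent again and a restart resumes where the last ACK left off.

```python
import json
import os
import tempfile


class FlakyReceiver:
    """Hypothetical stand-in for logstash: ACKs a batch unless told to fail."""

    def __init__(self, fail_on_calls):
        self.fail_on_calls = set(fail_on_calls)
        self.calls = 0
        self.received = []

    def send(self, batch):
        self.calls += 1
        if self.calls in self.fail_on_calls:
            return False  # no ACK: the sender has to retry this batch
        self.received.extend(batch)
        return True  # ACK


def ship(events, receiver, registry_path, batch_size=2, max_attempts=5):
    """Ship events in batches; persist the offset only after an ACK."""
    offset = 0
    if os.path.exists(registry_path):
        with open(registry_path) as fh:
            offset = json.load(fh)["offset"]  # resume after a restart
    while offset < len(events):
        batch = events[offset:offset + batch_size]
        for _ in range(max_attempts):
            if receiver.send(batch):
                break
        else:
            return offset  # never ACKed: stop without advancing the offset
        offset += len(batch)
        with open(registry_path, "w") as fh:
            json.dump({"offset": offset}, fh)
    return offset


def demo():
    events = [f"event-{i}" for i in range(5)]
    with tempfile.TemporaryDirectory() as tmp:
        registry = os.path.join(tmp, "registry.json")
        # First run: the second batch is not ACKed and the shipper stops.
        first = FlakyReceiver(fail_on_calls={2})
        stopped_at = ship(events, first, registry, max_attempts=1)
        # "Restart": a fresh run picks up from the persisted offset.
        second = FlakyReceiver(fail_on_calls=set())
        finished_at = ship(events, second, registry)
        return stopped_at, finished_at, first.received + second.received
```

Calling `demo()` runs one interrupted pass and one resumed pass. With plain TCP there is no equivalent of the `return True` above, so the sender has nothing to base the offset on.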
I have used nxlog for shipping IIS logs and event logs. I've replaced the service on one of my test machines with a package of filebeat+topbeat+winlogbeat and it has worked nicely thus far. My main concern at this point is over speed/performance.
Let's say I have 10 different Windows servers shipping logs to a single Linux logstash instance. Which log shipper (nxlog/beats) would you say is a heavier burden on the logstash instance? Would this even be noticeable?
I can't really say anything about differences in performance or resource usage. I'd expect the tcp input in logstash to be somewhat faster, as no additional protocol overhead is involved and there is no latency from waiting for logstash to ACK events (unless OS/network buffers fill up, generating back-pressure on nxlog). In newer beats, one can enable pipelining of requests to overcome some network and encoding latencies. By default beats->logstash uses compression. This requires some more CPU+buffers for encoding/decoding, but reduces network overhead (depending on content maybe by a factor of 6 or 7 if you're lucky). In logstash 2.4 and the upcoming 5.0 release, the beats plugin was rewritten, with the lumberjack protocol being reimplemented in Java based on Netty, whereas the TCP plugin is still Ruby based. Beats->logstash uses JSON and adds quite an amount of meta-data (which can be filtered out in the 5.0 release), adding some additional encoding/decoding overhead. Not sure about nxlog here.
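If you want a rough feeling for what compression buys you on your own data, something like the following Python sketch can help. It assumes zlib deflate as a stand-in for what happens on the wire, and the events from `sample_events` are made up, so treat the result as a ballpark for that synthetic data only; `compression_ratio` and `sample_events` are hypothetical helper names, and the level of 3 is an arbitrary choice for the sketch.

```python
import json
import zlib


def sample_events(n=500):
    """Made-up JSON events with repetitive meta-data, for illustration only."""
    return [
        {
            "@timestamp": f"2016-09-01T12:00:{i % 60:02d}Z",
            "beat": {"hostname": "win-srv-01", "name": "win-srv-01"},
            "type": "wineventlog",
            "event_id": 4624 + (i % 3),
            "message": f"An account was successfully logged on. Session {i}",
        }
        for i in range(n)
    ]


def compression_ratio(events, level=3):
    """Return (raw bytes, compressed bytes, ratio) for a batch of events."""
    raw = "\n".join(json.dumps(e) for e in events).encode("utf-8")
    packed = zlib.compress(raw, level)
    return len(raw), len(packed), len(raw) / len(packed)


if __name__ == "__main__":
    raw, packed, ratio = compression_ratio(sample_events())
    print(f"raw={raw} bytes, compressed={packed} bytes, ratio={ratio:.1f}x")
```

Feeding in a sample of your real IIS or event log lines instead of `sample_events()` gives a more honest number, since the ratio depends heavily on how repetitive the content is.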
There are many parameters/differences, plus recent changes to logstash, so I have a hard time giving any kind of forecast of what you will see. I'd propose setting up some benchmarks and comparing CPU/memory usage + network throughput in number of events, but also the amount of bytes being transmitted. With benchmarks available one can start tuning, e.g. modify parameters in beats outputs or limit resources used by beats using cgroups (or containers, or taskset). Beats also support load-balancing to multiple logstash instances, or can be configured to use one logstash instance picked at random, in order to scale horizontally if required.
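The difference between the two multi-host modes can be sketched in a few lines of Python. This is a simplification under my own assumptions, not how beats schedule batches internally: `round_robin` and `pick_one` are hypothetical helpers, and the host names are placeholders.

```python
import itertools
import random


def round_robin(hosts, batches):
    """Load-balancing mode (simplified): spread batches over all hosts."""
    counts = {h: 0 for h in hosts}
    for host, _ in zip(itertools.cycle(hosts), range(batches)):
        counts[host] += 1
    return counts


def pick_one(hosts, batches, rng):
    """Non-balancing mode (simplified): one randomly chosen host gets it all."""
    chosen = rng.choice(hosts)
    counts = {h: 0 for h in hosts}
    counts[chosen] = batches
    return counts


if __name__ == "__main__":
    # Placeholder host names, not taken from any real setup.
    hosts = ["logstash-a:5044", "logstash-b:5044", "logstash-c:5044"]
    print(round_robin(hosts, 9))
    print(pick_one(hosts, 9, random.Random()))
```

With many shippers each picking one host at random, the load still evens out across logstash instances on average, which is why both modes can be used to scale horizontally.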
Thank you steffens, very helpful to understand the difference in protocols between nxlog and beats as they are also especially important to consider when writing the configuration files!