Right now I'm just inputting custom Apache logs through a listening port.
I've been using netcat to send the logs to the listening port, for example: cat log | nc localhost 4500. Here is the config file for logstash that I'm using.
Also I've written a script to automate the process.
Everything seems to work fine, except for two problems. First, the input into logstash is being duplicated: for some reason I'm getting two entries in logstash for each single entry in the logs. Second, things run for the 1st, 2nd, 3rd, and sometimes even the 4th file, but eventually logstash freezes and locks up. Once it reaches this point the logs stop being ingested, and I have to manually kill the logstash process and start it again before it will begin accepting logs. I'm not entirely sure how to troubleshoot this problem. When it freezes I get this Java exception in the logstash log. The exception is too big to put in this message, so I'll put it in a follow-up.
Is there any way to avoid the freeze and fix the duplicated log entries?
Exception in thread "|worker" java.nio.BufferOverflowException
at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:189)
at org.jruby.util.io.ChannelStream.bufferedWrite(ChannelStream.java:1100)
at org.jruby.util.io.ChannelStream.fwrite(ChannelStream.java:1277)
at org.jruby.RubyIO.fwrite(RubyIO.java:1541)
at org.jruby.RubyIO.write(RubyIO.java:1412)
at org.jruby.RubyIO$INVOKER$i$1$0$write.call(RubyIO$INVOKER$i$1$0$write.gen)
at org.jruby.RubyClass.finvoke(RubyClass.java:742)
at org.jruby.runtime.Helpers.invoke(Helpers.java:503)
at org.jruby.RubyBasicObject.callMethod(RubyBasicObject.java:363)
at org.jruby.RubyIO.write(RubyIO.java:2490)
at org.jruby.RubyIO.putsSingle(RubyIO.java:2478)
at org.jruby.RubyIO.puts1(RubyIO.java:2407)
at org.jruby.RubyIO.puts(RubyIO.java:2380)
at org.jruby.RubyIO$INVOKER$i$puts.call(RubyIO$INVOKER$i$puts.gen)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)
at rubyjit.Cabin::Outputs::IO$$\=\^\^_6269f884ef35189823c6682da4fcb5035fcb6e7233121026.block_0$RUBY$__file__(/opt/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.7.1/lib/cabin/outputs/io.rb:52)
at rubyjit$Cabin::Outputs::IO$$\=\^\^_6269f884ef35189823c6682da4fcb5035fcb6e7233121026$block_0$RUBY$__file__.call(rubyjit$Cabin::Outputs::IO$$\=\^\^_6269f884ef35189823c6682da4fcb5035fcb6e7233121026$block_0$RUBY$__file__)
at org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:135)
at org.jruby.runtime.Block.yield(Block.java:142)
at org.jruby.ext.thread.Mutex.synchronize(Mutex.java:149)
at org.jruby.ext.thread.Mutex$INVOKER$i$0$0$synchronize.call(Mutex$INVOKER$i$0$0$synchronize.gen)
at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)
at org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)
at rubyjit.Cabin::Outputs::IO$$\=\^\^_6269f884ef35189823c6682da4fcb5035fcb6e7233121026.__file__(/opt/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.7.1/lib/cabin/outputs/io.rb:50)
at rubyjit.Cabin::Outputs::IO$$\=\^\^_6269f884ef35189823c6682da4fcb5035fcb6e7233121026.__file__(/opt/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.7.1/lib/cabin/outputs/io.rb)
at org.jruby.internal.runtime.methods.JittedMethod.call(JittedMethod.java:181)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:168)
at org.jruby.runtime.callsite.ShiftLeftCallSite.call(ShiftLeftCallSite.java:24)
at rubyjit.Cabin::Channel$$publish_81fc94d65b7d4fcad95b02a7f5b748a45eb5041d33121026.block_2$RUBY$__file__(/opt/logstash/vendor/bundle/jruby/1.9/gems/cabin-0.7.1/lib/cabin/channel.rb:176)
at rubyjit$Cabin::Channel$$publish_81fc94d65b7d4fcad95b02a7f5b748a45eb5041d33121026$block_2$RUBY$__file__.call(rubyjit$Cabin::Channel$$publish_81fc94d65b7d4fcad95b02a7f5b748a45eb5041d33121026$block_2$RUBY$__file__)
at org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:135)
at org.jruby.runtime.Block.yield(Block.java:142)
at org.jruby.RubyHash$13.visit(RubyHash.java:1354)
at org.jruby.RubyHash.visitLimited(RubyHash.java:648)
at org.jruby.RubyHash.visitAll(RubyHash.java:634)
at org.jruby.RubyHash.iteratorVisitAll(RubyHash.java:1305)
at org.jruby.RubyHash.each_pairCommon(RubyHash.java:1350)
at org.jruby.RubyHash.each19(RubyHash.java:1341)
at org.jruby.RubyHash$INVOKER$i$0$0$each19.call(RubyHash$INVO
I have not. How would that work with 40 or so logs? I'd want to run a script. Also, I forgot to mention in my previous post that, for some reason, I think there are two entries in logstash for the same unique log entry. I'm not sure why this is happening.
If you provide either a list of files or a pattern that matches the file names to be processed as argument to 'cat', you should be able to process the files sequentially through a single Logstash instance using the stdin input plugin.
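As a rough sketch of that suggestion, assuming the config file uses the stdin input: the helper below prints each matching file to stdout in order, so its output can be piped into one Logstash process. The function name emit_logs, the directory /var/log/custom, the pattern access_*.log, and the config name stdin.conf are all placeholders and do not come from this thread.

```shell
#!/bin/sh
# Hypothetical helper: print each given file to stdout, in argument order.
emit_logs() {
    for f in "$@"; do
        # Skip anything that is not a readable regular file
        # (for example, a glob pattern that matched nothing).
        if [ -f "$f" ] && [ -r "$f" ]; then
            cat -- "$f"
        fi
    done
    return 0
}

# Usage (assumed paths and config name, adjust to your setup):
#   emit_logs /var/log/custom/access_*.log | /opt/logstash/bin/logstash -f stdin.conf
```

A plain cat with the same pattern would do the same job; the function only adds the check for missing files.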
The pattern is not an issue; that's actually part of my script. Would I just run logstash -f configfile; logoutputscript.sh? I'm just trying to get the mechanics down. I have no issues with changing the config file for stdin, I know how to do that. I'm just not sure how I would send the log file to stdin while avoiding grok parse errors from extraneous output, like the line I would use to run cat or the line I would use to run my script.
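On the mechanics, a minimal sketch under some assumptions: in a shell, `a; b` runs the two commands one after the other without connecting them, while `a | b` feeds the standard output of `a` into the standard input of `b`. The command line itself is never part of that stream; only what the script writes to stdout is. So a script that writes log lines to stdout and any progress messages to stderr should not introduce extra lines for grok to trip over. The function send_logs below is a hypothetical stand-in for logoutputscript.sh, not the actual script from this thread.

```shell
#!/bin/sh
# Hypothetical stand-in for logoutputscript.sh.
send_logs() {
    for f in "$@"; do
        # Progress message goes to stderr, so it never enters the pipe.
        echo "sending $f" >&2
        # Log lines go to stdout, which is what the pipe carries.
        cat -- "$f"
    done
}

# Usage (assumed invocation; note the pipe instead of ';'):
#   send_logs /path/to/logs/*.log | logstash -f configfile
```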