Hi.
I'm new to this project and I'm running into a big problem since I added all the clients (15 servers) for monitoring.
Since adding these servers, the response time has grown massively and now the stack doesn't respond at all. I'm not sure which settings need to be changed. I've read that stdout with the rubydebug codec can sometimes cause high CPU usage, but disabling it didn't help. I followed this guide for Ubuntu: http://www.itzgeek.com/how-tos/linux/ubuntu-how-tos/setup-elk-stack-ubuntu-16-04.html
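For reference, the output section of my Logstash config is essentially what that guide sets up; simplified, it looks roughly like this (localhost:9200 is the guide's default):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
  # stdout { codec => rubydebug }   # disabled while testing, made no difference
}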
I'm running Elasticsearch/Logstash in a VM on our Hyper-V host to monitor our Windows servers.
VM settings: 2 virtual cores + 6 GB RAM
What puzzles me is that Linux doesn't seem to spread the work across both CPUs: either CPU 1 is constantly at 100% and CPU 2 at 0-2%, or vice versa.
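I wondered whether the number of Logstash pipeline workers plays a role here. If I read the Logstash 2.x docs correctly, the worker count can be set at startup, e.g.:

# started manually for testing; path is the Logstash 2.x package default
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/ -w 2

but I'm not sure whether that is already the default on two cores, or whether the single hot CPU is actually Elasticsearch rather than Logstash.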
I already added more RAM: first 16 GB (100% usage), then 32 GB (also 100% usage). I don't want to go higher, because there have to be other ways to optimize this.
What can I do to make the server handle the incoming traffic? Java is using all of the RAM for its own processes.
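Judging by the OutOfMemoryError entries below, I suspect the Elasticsearch heap is still at its default size even though the VM has more RAM now. If I understand it correctly, on Elasticsearch 2.x (which the stack traces look like) the heap is set via ES_HEAP_SIZE in /etc/default/elasticsearch, with roughly half the VM's RAM as the usual recommendation, e.g. for a 16 GB VM:

# /etc/default/elasticsearch (Debian/Ubuntu package)
ES_HEAP_SIZE=8g

# then restart the service
sudo systemctl restart elasticsearch

Is that the right knob, or is there another setting I'm missing?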
Most suspicious log entries:
[winlogbeat-2017.08.16][[winlogbeat-2017.08.16][2]] CreateFailedEngineException[Create failed for [wineventlog#AV3wgGvwYERlAy38-ALy]]; nested: OutOfMemoryError[Java heap space];
at org.elasticsearch.index.engine.InternalEngine.create(InternalEngine.java:351)
at org.elasticsearch.index.shard.IndexShard.create(IndexShard.java:549)
at org.elasticsearch.index.engine.Engine$Create.execute(Engine.java:810)
at org.elasticsearch.action.index.TransportIndexAction.executeIndexRequestOnPrimary(TransportIndexAction.java:236)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:327)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:120)
at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:68)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.doRun(TransportReplicationAction.java:657)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:287)
at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:279)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:77)
at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:378)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.OutOfMemoryError: Java heap space
ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [wineventlog]) within 30s]
at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:361)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2017-08-17 16:26:27,147][DEBUG][action.admin.indices.mapping.put] [Sagittarius] failed to put mappings on indices [[winlogbeat-2016.11.04]], type [wineventlog]
ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [wineventlog]) within 30s]
at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:361)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{:timestamp=>"2017-08-18T08:42:09.455000+0200", :message=>"Beats input: the pipeline is blocked, temporary refusing new connection.", :reconnect_backoff_sleep=>0.5, :level=>:warn}
{:timestamp=>"2017-08-18T08:42:50.026000+0200", :message=>"CircuitBreaker::rescuing exceptions", :name=>"Beats input", :exception=>LogStash::Inputs::Beats::InsertingToQueueTakeTooLong, :level=>:warn}
{:timestamp=>"2017-08-18T08:42:50.026000+0200", :message=>"Beats input: The circuit breaker has detected a slowdown or stall in the pipeline, the input is closing the current connection and rejecting new connection until the pipeline recover.", :exception=>LogStash::Inputs::BeatsSupport::CircuitBreaker::HalfOpenBreaker, :level=>:warn}
[2017-08-18 08:43:17,216][ERROR][cluster.action.shard ] [Argus] unexpected failure during [shard-started ([winlogbeat-2017.08.16][4], node[fit1KJnrQS6QI4PftbbS7g], [P], v[11], s[INITIALIZING], a[id=GKXVDgJaRdeGAMsZn5w-Ag], unassigned_info[[reason=CLUSTER_RECOVERED], at[2017-08-18T06:02:05.660Z]]), reason [master {Argus}{fit1KJnrQS6QI4PftbbS7g}{127.0.0.1}{127.0.0.1:9300} marked shard as initializing, but shard state is [POST_RECOVERY], mark shard as started]]
java.lang.OutOfMemoryError: Java heap space