I'm running Elasticsearch and Kibana on a clean Linode instance with 8 GB of RAM.
All I've done is import a CSV file through Logstash into an Elasticsearch index. The fields are plain text, plus two dates and a few numeric ones.
The index has 2 shards, 425,948 docs, and a size of 283.7mb.
While learning how to use Kibana, I created two visualizations, and as soon as I try to use them the whole thing becomes very, very slow: Kibana shows timeouts, and Elasticsearch does impressive amounts of GC work until a Java heap out-of-memory error arrives. It eats my heap immediately.
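For context, this is how I've been watching heap and GC pressure while it happens (assuming the default localhost:9200 endpoint; adjust host/port if yours differs):

```shell
# Show per-node heap usage and RAM usage via the cat nodes API.
# heap.percent climbing to ~99% and staying there matches the GC logs below.
curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max,ram.percent'
```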
Here is an extract from my Elasticsearch logs:
[2018-07-05T07:01:16,926][WARN ][o.e.m.j.JvmGcMonitorService] [vzbqXTQ] [gc][52287] overhead, spent [11.6s] collecting in the last [11.6s]
[2018-07-05T07:01:28,246][INFO ][o.e.m.j.JvmGcMonitorService] [vzbqXTQ] [gc][old][52288][63] duration [25.7s], collections [3]/[25.8s], total [25.7s]/[9.1m], memory [3.9gb]->[3.9gb]/[3.9gb], all_pools {[young] [266.2mb]->[266.2mb]/[266.2mb]}{[survivor] [30.4mb]->[29.6mb]/[33.2mb]}{[old] [3.6gb]->[3.6gb]/[3.6gb]}
[2018-07-05T07:01:28,248][WARN ][o.e.m.j.JvmGcMonitorService] [vzbqXTQ] [gc][52288] overhead, spent [25.7s] collecting in the last [25.8s]
[2018-07-05T07:02:20,940][INFO ][o.e.m.j.JvmGcMonitorService] [vzbqXTQ] [gc][old][52289][69] duration [52.6s], collections [6]/[41s], total [52.6s]/[9.9m], memory [3.9gb]->[3.9gb]/[3.9gb], all_pools {[young] [266.2mb]->[266.2mb]/[266.2mb]}{[survivor] [29.6mb]->[31mb]/[33.2mb]}{[old] [3.6gb]->[3.6gb]/[3.6gb]}
[2018-07-05T07:02:20,940][WARN ][o.e.m.j.JvmGcMonitorService] [vzbqXTQ] [gc][52289] overhead, spent [52.6s] collecting in the last [41s]
[2018-07-05T07:02:43,093][INFO ][o.e.m.j.JvmGcMonitorService] [vzbqXTQ] [gc][old][52290][72] duration [22.1s], collections [3]/[33.7s], total [22.1s]/[10.3m], memory [3.9gb]->[3.9gb]/[3.9gb], all_pools {[young] [266.2mb]->[266.2mb]/[266.2mb]}{[survivor] [31mb]->[32.1mb]/[33.2mb]}{[old] [3.6gb]->[3.6gb]/[3.6gb]}
[2018-07-05T07:02:43,093][WARN ][o.e.m.j.JvmGcMonitorService] [vzbqXTQ] [gc][52290] overhead, spent [22.1s] collecting in the last [33.7s]
...
[2018-07-05T07:03:02,210][ERROR][o.e.x.m.c.i.IndexStatsCollector] [vzbqXTQ] collector [index-stats] timed out when collecting data
[2018-07-05T07:03:39,028][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [vzbqXTQ] collector [cluster_stats] timed out when collecting data
[2018-07-05T07:03:52,852][ERROR][o.e.x.m.c.n.NodeStatsCollector] [vzbqXTQ] collector [node_stats] timed out when collecting data
...
[2018-07-05T07:20:51,870][ERROR][o.e.t.n.Netty4Utils ] fatal error on the network layer
at org.elasticsearch.transport.netty4.Netty4Utils.maybeDie(Netty4Utils.java:182)
at
This is my jvm.options:
-Xms4g
-Xmx4g
-XX:-OmitStackTraceInFastThrow
# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Djava.io.tmpdir=${ES_TMPDIR}
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=data
# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log
## JDK 8 GC logging
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m
# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
# due to internationalization enhancements in JDK 9 Elasticsearch need to set the provider to COMPAT otherwise
# time/date parsing will break in an incompatible way for some date patterns and locals
9-:-Djava.locale.providers=COMPAT
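To confirm those heap settings are what the node actually runs with, I checked the node's JVM info (default endpoint assumed):

```shell
# Report the JVM heap limits the node booted with; heap_max_in_bytes
# should correspond to the -Xmx4g above (about 4.2 billion bytes).
curl -s 'localhost:9200/_nodes/jvm?filter_path=nodes.*.jvm.mem'
```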
Here is some real-time monitoring info:
This happens especially when trying to use one visualization, whose settings are these:
And this is its log:
[vzbqXTQ] collector [cluster_stats] timed out when collecting data
[2018-07-05T08:08:42,220][DEBUG][o.e.a.s.TransportSearchAction] [vzbqXTQ] [wh_rates-07.04.2018][1], node[vzbqXTQNToSz2RCb5IZhTw], [P], s[STARTED], a[id=h6cXtncYQbyI6eg4InHyiA]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[wh_rates-*], indicesOptions=IndicesOptions[id=39, ignore_unavailable=true, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='1530777866498', requestCache=null, scroll=null, maxConcurrentShardRequests=5, batchedReduceSize=512, preFilterShardSize=64, allowPartialSearchResults=true, source={"size":0,"query":{"bool":{"must":[{"range":{"Date":{"from":1467705903138,"to":1530777903138,"include_lower":true,"include_upper":true,"format":"epoch_millis","boost":1.0}}}],"filter":[{"match_all":{"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":[],"excludes":[]},"stored_fields":"*","docvalue_fields":["@timestamp","Date","Date From Rates"],"script_fields":{},"aggregations":{"7":{"terms":{"field":"SiteId.keyword","size":50000,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"1":"desc"},{"_key":"asc"}]},"aggregations":{"1":{"sum":{"field":"Installs"}},"5":{"terms":{"field":"SubPublisherIdentifier.keyword","size":50000,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"4":"desc"},{"_key":"asc"}]},"aggregations":{"4":{"sum":{"field":"Cost"}},"6":{"terms":{"field":"Parent Campaign.keyword","size":50000,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"4":"desc"},{"_key":"asc"}]},"aggregations":{"1":{"sum":{"field":"Installs"}},"2":{"sum":{"field":"Registrations"}},"3":{"sum":{"field":"FDs"}},"4":{"sum":{"field":"Cost"}}}}}}}}}}}] lastShard [true]
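For readability, the aggregation part of that request boils down to the following (aggregation names simplified by me; the fields and sizes are exactly as they appear in the log line above, and the sum sub-aggregations are omitted). It is three nested terms aggregations, each requesting up to 50,000 buckets:

```shell
# Reduced form of the search Kibana generates. With "size": 50000 at
# three nesting levels, the number of buckets held in memory can
# multiply enormously, which would explain the heap exhaustion.
curl -s -H 'Content-Type: application/json' 'localhost:9200/wh_rates-*/_search' -d '
{
  "size": 0,
  "aggs": {
    "sites": {
      "terms": { "field": "SiteId.keyword", "size": 50000 },
      "aggs": {
        "subpublishers": {
          "terms": { "field": "SubPublisherIdentifier.keyword", "size": 50000 },
          "aggs": {
            "campaigns": {
              "terms": { "field": "Parent Campaign.keyword", "size": 50000 }
            }
          }
        }
      }
    }
  }
}'
```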
org.elasticsearch.transport.TransportException: transport stopped, action: indices:data/read/search[phase/query]
I'm pretty much lost, and I'm new to ELK. Can anybody help me out? Am I doing something wrong?
It doesn't seem to me that a single index with ~425k documents should be too much for an 8 GB node.
Thanks a lot
Alejandro