How to configure Elasticsearch with a single node?

Hi,

I keep running into a problem where, once I start doing anything substantial in Kibana, Elasticsearch crashes and the cluster goes red. That's my real issue, but more broadly I'm wondering how Elasticsearch should be configured on a single node, as I haven't really found documentation on the subject. I've seen plenty about multiple nodes, data/master roles, many shards, etc., but I'm unclear on what my setup should look like for a single (beefy) node.

I've got a server with 24 cores and 256 GB of RAM, so I really don't think things should be crashing; the system isn't anywhere near those limits. I've configured Elasticsearch with a 30 GB heap and Logstash with a 12 GB heap. I haven't changed the Kibana heap size, but I don't think Kibana is the problem?
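
For reference, the only thing I changed in Elasticsearch's jvm.options was the heap, with lines roughly like this (values are from my setup, min and max pinned to the same size):

# jvm.options (Elasticsearch): heap size, min and max set equal
-Xms30g
-Xmx30g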

I'm on 6.0.1 and have X-Pack installed, as I'm trying to monitor performance and tune appropriately. My Logstash pipeline is configured with 24 workers and a batch size of 250.
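
For what it's worth, those Logstash settings live in logstash.yml, roughly like this (my current values, not a recommendation):

# logstash.yml: pipeline tuning currently in use
pipeline.workers: 24
pipeline.batch.size: 250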

I'm hoping I'm just missing something, and I suspect it might be the number of shards? I'm not sure what the recommended practice is for a single node.

This thread seems to suggest that there should be 1 shard for a single node: How change the number of shards and replicas in Elasticsearch

But this one seems to suggest there should be many shards for a single node: Kibana crashes Elasticsearch

I haven't changed the shard settings at all from defaults.

What direction should I go?

I think my Elasticsearch log is mostly just filled with X-Pack warnings and errors (at least it was before I tried restarting via systemd):

[2018-03-23T10:52:59,627][ERROR][o.e.x.w.e.ExecutionService] [79_mMrx] failed to update watch record [y5CXg5PLRyqRmunOdim9iA_xpack_license_expiration_df577cd9-0e47-488f-aa4f-cdbabdca629b-2018-03-23T16:42:36.426Z]
org.elasticsearch.ElasticsearchTimeoutException: java.util.concurrent.TimeoutException: Timeout waiting for task.
        at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:68) ~[elasticsearch-6.0.1.jar:6.0.1]
        at org.elasticsearch.xpack.watcher.history.HistoryStore.put(HistoryStore.java:100) ~[x-pack-6.0.1.jar:6.0.1]
        at org.elasticsearch.xpack.watcher.execution.ExecutionService.execute(ExecutionService.java:333) ~[x-pack-6.0.1.jar:6.0.1]
        at org.elasticsearch.xpack.watcher.execution.ExecutionService.lambda$executeAsync$7(ExecutionService.java:416) ~[x-pack-6.0.1.jar:6.0.1]
        at org.elasticsearch.xpack.watcher.execution.ExecutionService$WatchExecutionTask.run(ExecutionService.java:568) [x-pack-6.0.1.jar:6.0.1]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-6.0.1.jar:6.0.1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
Caused by: java.util.concurrent.TimeoutException: Timeout waiting for task.
        at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:235) ~[elasticsearch-6.0.1.jar:6.0.1]
        at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:69) ~[elasticsearch-6.0.1.jar:6.0.1]
        at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:66) ~[elasticsearch-6.0.1.jar:6.0.1]
        ... 8 more
[2018-03-23T10:52:59,658][WARN ][o.e.m.j.JvmGcMonitorService] [79_mMrx] [gc][old][230532][47] duration [1.2m], collections [1]/[1.2m], total [1.2m]/[41.7m], memory [29.8gb]->[29.8gb]/[29.8gb], all_pools {[young] [1.1gb]->[1.1gb]/[1.1gb]}{[survivor] [102.1mb]->[108.9mb]/[149.7mb]}{[old] [28.5gb]->[28.5gb]/[28.5gb]}

Thanks!
Dave

Your last log entry is bad news: garbage collection took over a minute (!) and you have ~30 GB of data in your old generation.

How many shards and indices do you have?

For a single-node cluster, there's nothing special to do. Just leave elasticsearch.yml at the defaults, and leave jvm.options as-is with the exception of heap size.


Hmmm, ok, then how am I supposed to get it working?

Looking at the monitoring tab on Kibana I have 254 indices, 2180 shards, and 1090 unassigned shards.
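
For completeness, I believe the same counts can also be pulled straight from the _cat APIs (Dev Tools console syntax; nothing here is specific to my setup):

GET _cat/health?v
GET _cat/indices?v
GET _cat/shards?v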

If these are all date-based indices (e.g., logstash-YYYY-mm-dd) and you are going to store the data on one machine, then I would suggest sticking it all in one index made up of 1 shard with 0 replicas. I think you'll get better behavior that way.
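
Something along these lines would do it (myindex is just a placeholder for whatever you call the consolidated index):

PUT myindex
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}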

Yes, I am using date-based indices (myindex-YYYY-mm-dd), and it is all on one machine. Are you suggesting that I remove the date-based part so that all data lives in one index (myindex)? That would require me to re-index/re-ingest everything, correct?

Yes and yes.

The point here would be to prove to yourself that the over-sharding is causing the problems. If so, you could investigate using a Rollover Index for a long-term solution.
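
Roughly, rollover works off an alias plus conditions; here's a sketch (index name, alias name, and conditions are all just illustrative):

PUT myindex-000001
{
  "aliases": {
    "myindex_write": {}
  }
}

POST myindex_write/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 50000000
  }
}

When a condition is met, the _rollover call creates myindex-000002 and switches the alias to point at it, so writers keep targeting the alias.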

OK, you touched on what I was just thinking about, which is how to age out data if it's not an index per day. I hadn't heard of the rollover index functionality before, and after reading about it I'm not totally sure I see the benefit.

I want to age out data after 6 months, which (if I'm interpreting correctly) means I should probably set my rollover to create a new index every day to keep the retention granularity useful. But then how is that any better than date-based indices? Ultimately I end up with the same number of indices. I could instead define the rollover by document count or size (based on a daily average), which I suppose would balance things better, but does that balancing even matter in a single-node/single-shard situation? Based on the rollover article, I'm not convinced it does.

I can definitely set 1 shard and 0 replicas in my index template, reindex the data, and see if that resolves the problem.
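
Concretely, I'm thinking of something along these lines (the template name is arbitrary, and I'm assuming a wildcard works as the _reindex source to sweep up the existing daily indices):

PUT _template/myindex_single_shard
{
  "index_patterns": ["myindex*"],
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}

POST _reindex
{
  "source": { "index": "myindex-*" },
  "dest":   { "index": "myindex" }
}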

Thanks for all the help! I really appreciate it!

You could do 7 monthly indices, 1 shard each, no replicas (so your 1-node cluster is green), and drop the old index every month. Much more sane than several thousand shards!
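
Dropping the oldest month is then a single delete; the index name below is just an example of whatever monthly naming you settle on:

DELETE myindex-2017.09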

You can also use _shrink to reduce the shard count.
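
The shape of a shrink is roughly this (index names are placeholders; I've set replicas to 0 and blocked writes on the source first, which I believe is all the prep a single-node cluster needs since every shard already lives on the one node):

PUT myindex-2018-03-22/_settings
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.blocks.write": true
  }
}

POST myindex-2018-03-22/_shrink/myindex-2018-03-22-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}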

Good point. I think we'll start by sticking with daily indices at 1 shard each, then move to 1 index per week, and then per month if things still aren't playing nicely.

Thanks for the reference on _shrink; it seems very fast. One question: can I delete the old index after I have shrunk it into the new one? I read something about this operation using hard links, so I'm a little wary.

Thanks!
Dave

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.