Thanks for the response. We're only having performance issues with
replication/recovery. When the cluster is green our system flies (inserting
2-3 TB per 24 hours). But something is either throttled or just flat-out
stuck when re-initializing/assigning replica shards. I'm assuming throttled
because I see RateLimiter$SimpleRateLimiter.pause being called for a
thread showing [recovery_stream]. I'd like to turn this off completely if
possible.
If I turn off replication on the cluster, let all the replicas drop and
the space free up, and then turn it back on (quiet system otherwise)... almost
nothing happens. We start merging at 0.4 MB/sec, CPU is running 99% idle, and
iostat shows almost zero usage.
Looking at the _cluster/state I have
recovery.concurrent_streams: 8
recovery.max_bytes_per_sec: 2147483648
recovery.translog_ops: 500000
recovery.file_chunk_size: 1048576
store.throttle_type: none
Is there something else I need to set?
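For what it's worth, this is roughly how I've been applying those settings
through the cluster settings API. Just a minimal Python sketch; the host/port
and the use of the requests library are my own assumptions, not anything the
docs require:

import json, requests  # requests assumed to be installed

ES = "http://localhost:9200"  # assumed host/port of one of the nodes

# Transient settings apply immediately and are lost on a full cluster restart.
body = {
    "transient": {
        "indices.recovery.max_bytes_per_sec": "2gb",
        "indices.recovery.concurrent_streams": 8,
        "indices.store.throttle.type": "none"
    }
}
print(requests.put(ES + "/_cluster/settings", data=json.dumps(body)).text)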
PS. Yes, we are using the Deadline Scheduler.
On Tuesday, November 5, 2013 2:10:21 PM UTC-5, Zachary Tong wrote:
Make sure your OS is configured to use appropriate scheduling for SSDs.
The Noop or Deadline scheduler will perform much faster than the
default CFQ (completely fair queuing), on the order of 300-500x!
Check out this presentation by Drew Raines for more details:
https://speakerdeck.com/drewr/life-after-ec2
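If it helps, here is a quick way to see which scheduler each disk is actually
using (and, as root, to switch it). A small sketch only; the device name is an
assumption, so check /sys/block on your boxes:

# Show the active I/O scheduler for a block device (the one in brackets),
# e.g. "noop deadline [cfq]".  DEV is an assumed name -- adjust to your disk.
DEV = "sdb"
path = "/sys/block/%s/queue/scheduler" % DEV

with open(path) as f:
    print(f.read().strip())

# Switching requires root; uncomment to apply (takes effect immediately):
# with open(path, "w") as f:
#     f.write("deadline")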
Do you have metrics about the utilization of your disk/network vs the
merge rate?
Semi-related, you may want to set gateway.recover_after_nodes
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-gateway.html#recover-after)
to help speed up full cluster restarts. This will prevent allocation from
happening until n nodes are in the cluster, which can prevent unneeded
allocation thrashing while nodes reboot. Only useful for full cluster
restarts, however.
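For example (the value is just an illustration; with 6 data nodes you might
wait for all of them), in each node's elasticsearch.yml:

gateway.recover_after_nodes: 6

It's a node-level setting, so it goes in the config file rather than through
the cluster settings API.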
One more question. In the event of node failure, and the standby shards
are activated, are those shards then replicated somewhere else, so they have
a standby? If so, is this configurable? With the amount of data each node
is managing (locally), I think we would like to avoid this.
Yep, if a primary shard disappears from the cluster (machine catches on
fire, etc), then one of the replicas will be promoted to primary.
Elasticsearch will then recognize that it is missing one of its replicas
and begin allocating/copying a replica somewhere else.
You can control this with various allocation awareness settings
(http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html),
depending on how you want your cluster to behave when nodes disappear.
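As one hedged example (0.90-era setting name, host/port assumed), you could
temporarily stop Elasticsearch from building new replica copies while a node
is down, then flip it back once the node returns:

import json, requests  # requests assumed to be installed

ES = "http://localhost:9200"  # assumed host/port

# While this is true, missing replicas stay unassigned instead of being
# copied to other nodes.  Set it back to false when the node is healthy again.
body = {"transient": {"cluster.routing.allocation.disable_replica_allocation": True}}
print(requests.put(ES + "/_cluster/settings", data=json.dumps(body)).text)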
-Zach
On Tuesday, November 5, 2013 1:15:12 PM UTC-5, Ryan S wrote:
I still need some additional help here. After changing the
max_bytes_per_sec to 2GB and concurrent_streams to 12, we are still moving
extremely slowly. We are merging at 0.4 MB/sec. Just closing an index and
opening it (application down) will put the replicas into recovery... and it
takes days for them to initialize and get assigned. Any ideas? Thank you.
On Friday, November 1, 2013 1:27:42 PM UTC-4, Ryan S wrote:
Ivan,
Thank you for the comments. I was unaware we needed to do a flush
before shutdown. The defaults do look pretty low, so I will tinker with
those.
One more question. In the event of node failure, and the standby shards
are activated, are those shards then replicated somewhere else, so they have
a standby? If so, is this configurable? With the amount of data each node
is managing (locally), I think we would like to avoid this.
Thanks again.
On Friday, November 1, 2013 2:37:28 AM UTC-4, Ivan Brusic wrote:
A few comments:
- You should always execute a flush before shutting down any nodes. This
action will clear the transaction logs and commit all operations to
segments (there's a quick sketch of this and the next step after the list).
- If you are doing rolling restarts, consider disabling allocation.
- Elasticsearch 0.90+ will throttle shard recovery in order not to
consume IO bandwidth. The defaults are pretty low. More info is in the
Elasticsearch guide's documentation on recovery throttling.
- Elasticsearch will only recover 2 shards at a time by default. If you
have a heavily sharded environment, you might want to increase this value.
The last two changes will heavily affect IO performance. Increase the
values without overwhelming your system. Much of it will depend on your
hardware: SSDs, platters, or virtualized environments with shared storage?
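A minimal sketch of the first two items (flush, then disable allocation for a
rolling restart), using the 0.90-era setting name; host/port and the requests
library are assumptions:

import json, requests  # requests assumed to be installed

ES = "http://localhost:9200"  # assumed host/port

# 1. Flush all indices so the transaction logs are empty before shutting down.
requests.post(ES + "/_flush")

# 2. Keep the cluster from re-shuffling shards while nodes bounce during a
#    rolling restart; set it back to false once every node is up again.
body = {"transient": {"cluster.routing.allocation.disable_allocation": True}}
requests.put(ES + "/_cluster/settings", data=json.dumps(body))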
Cheers,
Ivan
On Thu, Oct 31, 2013 at 11:51 AM, Ryan S ryan.s...@gmail.com wrote:
Sorry, version 90.3
On Thursday, October 31, 2013 2:50:03 PM UTC-4, Ryan S wrote:
We've seen extremely slow startup/initialization/assignment of
replica shards during startup. I can shut down the cluster cleanly (from a
green state), and then start it back up a few minutes later. It might
take 16-24 hours to reach a "green" status with the logs saying replica
recovery is happening. If the cluster was shutdown cleanly and started 10
minutes later, what recovery needs to occur? Second, is there anything we
can tune to speed this up? I have a similar concern on failover; it seems
the shard relocation happens at a snail's pace. Our servers can write
4GB+/sec to the storage, but we are writing data much slower than that.
Each data node is hosting about 8TB of data.
A little background about our cluster:
8 nodes
6 data nodes
1 master
1 query node
All are 16 core boxes, with 96GB Ram, 8TB of FusionIO SSD and
everything is connected via Infiniband.
When this is occurring, our insert rates run at a degraded
performance (at least 10-15%), which is a big deal for us.
Thanks.