We're trying to figure out why we seem to have hit an ingestion limit in our ES cluster.
Use case: Logging
Version: 5.1.1
Master nodes: 3
HTTP nodes: 3
Data nodes: 50
The servers are OpenStack VMs running on bare metal. We don't over-provision the bare-metal hosts, and there is only one ES data node per host.
Server VM Specs
vCPUs: 10
RAM: 64 GB
Disk: 2 TB spinning disk
ES Config
bootstrap.memory_lock: true
cluster.name: elasticsearch_1
discovery.zen.ping_timeout: 60s
discovery.zen.ping.unicast.hosts: 10.173.188.214,10.173.188.215,10.173.188.216
http.cors.allow-origin: /.*/
http.cors.enabled: true
http.port: 9200
indices.fielddata.cache.size: 25%
indices.memory.index_buffer_size: 20%
network.host: _eth0_
node.attr.box_type: hot
node.data: true
node.master: false
node.name: 10.173.131.213-hot
path.conf: /etc/elasticsearch/hot
path.data: /opt/elasticsearch/data
path.logs: /var/log/elasticsearch/10.173.131.213-hot
thread_pool.bulk.queue_size: 2000
thread_pool.search.size: 10
transport.tcp.port: 9300
Note that even though the box_type is "hot", this is not a hot/warm configuration.
jvm.options
-Dfile.encoding=UTF-8
-Dio.netty.noKeySetOptimization=true
-Dio.netty.noUnsafe=true
-Djava.awt.headless=true
-Djna.nosys=true
-Dlog4j2.disable.jmx=true
-Dlog4j.shutdownHookEnabled=false
-Dlog4j.skipJansi=true
-server
-XX:+AlwaysPreTouch
-XX:CMSInitiatingOccupancyFraction=75
-XX:+DisableExplicitGC
-XX:+HeapDumpOnOutOfMemoryError
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseConcMarkSweepGC
ES_JAVA_OPTS="-Xms31g -Xmx31g"
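For what it's worth, we double-check that the 31 GB heap is actually picked up on every node with something like the following, run against one of the HTTP nodes (column names are from the 5.x _cat/nodes docs):

curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.max,heap.percent,ram.percent'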
Output of localhost:9200/
{
  "name": "10.173.188.218-http",
  "cluster_name": "elasticsearch_1",
  "cluster_uuid": "_LQl2KtyQAyxL-D69YnmOw",
  "version": {
    "number": "5.1.1",
    "build_hash": "5395e21",
    "build_date": "2016-12-06T12:36:15.409Z",
    "build_snapshot": false,
    "lucene_version": "6.3.0"
  },
  "tagline": "You Know, for Search"
}
Output of localhost:9200/_cluster/health
{
  "cluster_name": "elasticsearch_1",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 56,
  "number_of_data_nodes": 50,
  "active_primary_shards": 3806,
  "active_shards": 7612,
  "relocating_shards": 10,
  "initializing_shards": 0,
  "unassigned_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 100
}
Output of localhost:9200/_cluster/settings
{
  "persistent": {
    "action": {
      "search": {
        "shard_count": {
          "limit": "5000"
        }
      }
    },
    "cluster": {
      "routing": {
        "rebalance": {
          "enable": "none"
        },
        "allocation": {
          "node_concurrent_recoveries": "2",
          "disk": {
            "threshold_enabled": "true",
            "watermark": {
              "low": "85%",
              "high": "95%"
            }
          },
          "node_initial_primaries_recoveries": "8",
          "enable": "all"
        }
      },
      "info": {
        "update": {
          "interval": "60s"
        }
      }
    },
    "indices": {
      "recovery": {
        "max_bytes_per_sec": "500mb"
      }
    }
  },
  "transient": {
    "cluster": {
      "routing": {
        "rebalance": {
          "enable": "all"
        },
        "allocation": {
          "cluster_concurrent_rebalance": "10",
          "node_concurrent_recoveries": "6"
        }
      }
    }
  }
}
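For completeness, the transient values above were set with a standard cluster settings update, roughly like this (values as shown in the dump):

curl -s -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": "6",
    "cluster.routing.allocation.cluster_concurrent_rebalance": "10"
  }
}'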
We create a new index every two hours (12 indices per day, numbered 01-12 rather than the hourly 00-23) and keep 7 days' worth of indices. Each index has 45 primary shards with 1 replica per primary.
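That layout accounts for essentially all of the shards in the _cluster/health output above:

12 indices/day x 7 days   = 84 indices retained
84 indices x 45 primaries = 3,780 primary shards
3,780 x 2 copies          = 7,560 shards

which lines up with the 3,806 active primaries / 7,612 active shards reported earlier, give or take a few other indices in the cluster.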
Our sustained ingestion rate has been around 20K logs per second. That was fine until this week, when the team that uses this cluster started sending around 40K logs per second, and now the cluster can't keep up with the incoming logs.
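If it helps with the analysis, we can capture bulk thread pool stats from the data nodes while the backlog is building, with something like:

curl -s 'localhost:9200/_cat/thread_pool/bulk?v&h=node_name,name,active,queue,rejected,completed'

and we can also grab _nodes/hot_threads output during the slowdown if that would be useful.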
We're considering scaling out the cluster by adding more data nodes, but we're not sure that would help. We'd like to look at configuration changes before spending more money on hardware.
Any suggestions on what changes we can make to improve performance would be appreciated. If there's additional information that would help analyze the situation, please let me know.
Thanks,
Ray