Identifying hot shards to address uneven load

David_O_Dell · September 26, 2013, 8:34pm

I have read many posts in this group about uneven load and hot shards.
We are experiencing the same symptoms where one data node out of 8 has 100%
CPU usage and the other 7 nodes operate at 40%.

My question is: how do I identify the volume of searches per shard?
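
One way to approach this is to read shard-level index stats and total search.query_total per shard copy and per node. The sketch below is only an illustration. It assumes the stats were fetched with shard-level detail (for example _stats?level=shards, which should be checked against your version) and that each shard copy carries routing.node and search.query_total. The sample data and node ids are hypothetical.

```python
from collections import defaultdict


def search_volume_per_shard(stats):
    """Map (index, shard_id, node_id) -> query_total.

    Assumes `stats` is the parsed JSON of a shard-level stats response,
    shaped as indices -> <index> -> shards -> <shard_id> -> [copies].
    """
    volumes = {}
    for index_name, index_stats in stats.get("indices", {}).items():
        for shard_id, copies in index_stats.get("shards", {}).items():
            for copy in copies:
                node_id = copy.get("routing", {}).get("node")
                query_total = copy.get("search", {}).get("query_total", 0)
                volumes[(index_name, shard_id, node_id)] = query_total
    return volumes


def search_volume_per_node(volumes):
    """Sum the per-shard query counts for each node."""
    totals = defaultdict(int)
    for (_index, _shard, node_id), query_total in volumes.items():
        totals[node_id] += query_total
    return dict(totals)


# Hypothetical sample, not taken from the cluster discussed in this thread.
sample_stats = {
    "indices": {
        "entries_v2": {
            "shards": {
                "0": [
                    {"routing": {"node": "node-a", "primary": True},
                     "search": {"query_total": 120}},
                    {"routing": {"node": "node-b", "primary": False},
                     "search": {"query_total": 80}},
                ],
                "1": [
                    {"routing": {"node": "node-b", "primary": True},
                     "search": {"query_total": 50}},
                ],
            }
        }
    }
}

volumes = search_volume_per_shard(sample_stats)
per_node = search_volume_per_node(volumes)
for key in sorted(volumes):
    print(key, volumes[key])
print(per_node)
```

Because the counters are cumulative since node start, comparing two snapshots taken a few minutes apart is likely to say more about current load than a single reading.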

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Boaz_Leskes · September 27, 2013, 6:53am

Hi Daveo,

Are you using the routing parameter in your searches? If not, it's highly
unlikely that searches are unevenly spread across shards. What is possible
is that one node contains more shards than the others.
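
To check whether one node simply holds more shards than the others, the shard-to-node assignments can be tallied. This is a minimal sketch. The flat list of index/shard/node entries is an assumed input shape (something you would build yourself from the cluster state routing table), and the assignments shown are made up for illustration.

```python
from collections import Counter


def shards_per_node(shard_routing):
    """Count assigned shard copies per node; unassigned copies are skipped."""
    counts = Counter()
    for entry in shard_routing:
        node = entry.get("node")
        if node is not None:
            counts[node] += 1
    return counts


def imbalance(counts):
    """Difference between the busiest and the least loaded node."""
    if not counts:
        return 0
    return max(counts.values()) - min(counts.values())


# Hypothetical assignments; a node of None stands for an unassigned copy.
routing = [
    {"index": "entries_v2", "shard": 0, "node": "search01"},
    {"index": "entries_v2", "shard": 1, "node": "search02"},
    {"index": "entries_v2", "shard": 2, "node": "search02"},
    {"index": "users_v2", "shard": 0, "node": "search02"},
    {"index": "users_v2", "shard": 1, "node": None},
]

counts = shards_per_node(routing)
print(counts.most_common())
print("imbalance:", imbalance(counts))
```

Note that a node holding no shards at all never appears in the counter, so the list of data nodes should be checked separately.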

It feels like the node is doing something else or got stuck on an unlucky
query. To verify, can you please post the result of the hot threads API
on this node? This will help figure out what it's doing. Also, can you post
the memory usage on that node?

Cheers,
Boaz


David_O_Dell · September 27, 2013, 5:00pm

Here are the results of hot_threads.

https://gist.github.com/dodizzle/6731643

::: [search02.weheartit.com][BhLFmPTJRYSGh2LrecepMw][inet[/10.84.30.10:9300]]{max_local_storage_nodes=1}
   
   107.7% (538.3ms out of 500ms) cpu usage by thread 'elasticsearch[search02.weheartit.com][search][T#86]'
     2/10 snapshots sharing following 28 elements
       sun.nio.ch.NativeThread.current(Native Method)
       sun.nio.ch.NativeThreadSet.add(NativeThreadSet.java:46)
       sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:670)
       org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:176)
       org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:272)
       org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:138)

This file has been truncated.
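
When the hot threads output is long, it can help to total the reported CPU percentages by thread pool. The sketch below assumes every thread header line follows the format of the excerpt above, with the pool name as the bracketed segment just before [T#n]. The regular expressions are my own guess at that format, not an official grammar.

```python
import re
from collections import defaultdict

# Matches header lines such as:
#   107.7% (538.3ms out of 500ms) cpu usage by thread 'elasticsearch[...][search][T#86]'
THREAD_LINE = re.compile(
    r"(?P<pct>\d+(?:\.\d+)?)% \(.*?\) cpu usage by thread '(?P<name>[^']+)'"
)
# Pulls the pool name out of a thread name ending in "[pool][T#n]".
POOL_IN_NAME = re.compile(r"\[(?P<pool>[^\[\]]+)\]\[T#\d+\]$")


def cpu_by_pool(hot_threads_text):
    """Sum the reported CPU percentage per thread pool."""
    totals = defaultdict(float)
    for line in hot_threads_text.splitlines():
        match = THREAD_LINE.search(line)
        if not match:
            continue  # stack frames and snapshot summaries are ignored
        pool_match = POOL_IN_NAME.search(match.group("name"))
        pool = pool_match.group("pool") if pool_match else "other"
        totals[pool] += float(match.group("pct"))
    return dict(totals)


# The first thread entry from the excerpt above.
sample = """
   107.7% (538.3ms out of 500ms) cpu usage by thread 'elasticsearch[search02.weheartit.com][search][T#86]'
     2/10 snapshots sharing following 28 elements
       sun.nio.ch.NativeThread.current(Native Method)
"""

totals = cpu_by_pool(sample)
print(totals)
```

If nearly all of the CPU lands in the search pool, the node is busy serving queries rather than merging or indexing.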

As for memory, the system has 64GB of RAM. The JVM settings are:

-Xms30720m -Xmx30720m -Xss256k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+HeapDumpOnOutOfMemoryError

We are running 0.90.5

Thanks for getting back to me.


Boaz_Leskes · September 27, 2013, 6:42pm

Hmm, all the threads are busy searching. Are you running any heavy queries?
Are you sure the node gets the same amount of traffic as the rest?

Besides the above can you also share the output of

curl -s 'localhost:9200/_stats?all'
curl -s 'localhost:9200/_nodes?all'

These calls give cluster statistics plus the cluster topology - perhaps we
can see something there.

Cheers,
Boaz


David_O_Dell · September 27, 2013, 6:50pm

Results of _stats?all:

https://gist.github.com/dodizzle/6733355

{"ok":true,"_shards":{"total":72,"successful":72,"failed":0},"_all":{"primaries":{"docs":{"count":91923482,"deleted":3455871},"store":{"size":"52.1gb","size_in_bytes":56031408449,"throttle_time":"13.7m","throttle_time_in_millis":826392},"indexing":{"index_total":18756808,"index_time":"4.3h","index_time_in_millis":15677810,"index_current":0,"delete_total":318,"delete_time":"66ms","delete_time_in_millis":66,"delete_current":0},"get":{"total":0,"get_time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"open_contexts":2170,"query_total":99496761,"query_time":"22.6d","query_time_in_millis":1954587020,"query_current":183,"fetch_total":87698215,"fetch_time":"6.6h","fetch_time_in_millis":24002054,"fetch_current":2},"merges":{"current":2,"current_docs":4787,"current_size":"11.1mb","current_size_in_bytes":11667197,"total":26652,"total_time":"6.1h","total_time_in_millis":22033477,"total_docs":113860498,"total_size":"73.2gb","total_size_in_bytes":78633705637},"refresh":{"total":235799,"total_time":"3.6h","total_time_in_millis":13025280},"flush":{"total":3707,"total_time":"1.6h","total_time_in_millis":6106876},"warmer":{"current":0,"total":236318,"total_time":"1.7m","total_time_in_millis":103911},"filter_cache":{"memory_size":"12.3gb","memory_size_in_bytes":13301397635,"evictions":1675040},"id_cache":{"memory_size":"0b","memory_size_in_bytes":0},"fielddata":{"memory_size":"908.2mb","memory_size_in_bytes":952377972,"evictions":0},"completion":{"size":"394.9mb","size_in_bytes":414152905}},"total":{"docs":{"count":275770446,"deleted":10724552},"store":{"size":"156.9gb","size_in_bytes":168543805383,"throttle_time":"34m","throttle_time_in_millis":2045083},"indexing":{"index_total":50999185,"index_time":"10.9h","index_time_in_millis":39585041,"index_current":0,"delete_total":916,"delete_time":"220ms","delete_time_in_millis":220,"delete_current":0},"get":{"tota
l":0,"get_time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"open_contexts":6105,"query_total":283042821,"query_time":"66.5d","query_time_in_millis":5746690370,"query_current":484,"fetch_total":249278149,"fetch_time":"18.8h","fetch_time_in_millis":67763007,"fetch_current":2},"merges":{"current":4,"current_docs":82556,"current_size":"128.3mb","current_size_in_bytes":134562768,"total":72095,"total_time":"15.2h","total_time_in_millis":54863390,"total_docs":316015290,"total_size":"182.7gb","total_size_in_bytes":196242966831},"refresh":{"total":637875,"total_time":"9.5h","total_time_in_millis":34313049},"flush":{"total":10094,"total_time":"3.9h","total_time_in_millis":14392583},"warmer":{"current":0,"total":639230,"total_time":"5.1m","total_time_in_millis":307759},"filter_cache":{"memory_size":"41.7gb","memory_size_in_bytes":44776187249,"evictions":4523230},"id_cache":{"memory_size":"0b","memory_size_in_bytes":0},"fielddata":{"memory_size":"2.6gb","memory_size_in_bytes":2858548980,"evictions":0},"completion":{"size":"1.1gb","size_in_bytes":1249036540}}},"indices":{"entries_v2":{"primaries":{"docs":{"count":78155130,"deleted":1528887},"store":{"size":"48.5gb","size_in_bytes":52167552333,"throttle_time":"10.3m","throttle_time_in_millis":623272},"indexing":{"index_total":6630471,"index_time":"3.3h","index_time_in_millis":12148310,"index_current":0,"delete_total":196,"delete_time":"43ms","delete_time_in_millis":43,"delete_current":0},"get":{"total":0,"get_time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"open_contexts":2170,"query_total":99447256,"query_time":"22.6d","query_time_in_millis":1953903726,"query_current":183,"fetch_total":87685360,"fetch_time":"6.6h","fetch_time_in_millis":23986089,"fetch_current":2},"merge
s":{"current":2,"current_docs":4787,"current_size":"11.1mb","current_size_in_bytes":11667197,"total":13742,"total_time":"4.3h","total_time_in_millis":15694174,"total_docs":24183619,"total_size":"53.3gb","total_size_in_bytes":57251046437},"refresh":{"total":123457,"total_time":"2.7h","total_time_in_millis":9955172},"flush":{"total":1320,"total_time":"48.8m","total_time_in_millis":2932188},"warmer":{"current":0,"total":123697,"total_time":"54s","total_time_in_millis":54042},"filter_cache":{"memory_size":"12.3gb","memory_size_in_bytes":13299428491,"evictions":1675040},"id_cache":{"memory_size":"0b","memory_size_in_bytes":0},"fielddata":{"memory_size":"908.2mb","memory_size_in_bytes":952377972,"evictions":0},"completion":{"size":"394.9mb","size_in_bytes":414152905}},"total":{"docs":{"count":234465390,"deleted":4661891},"store":{"size":"146gb","size_in_bytes":156867688908,"throttle_time":"24.2m","throttle_time_in_millis":1455183},"indexing":{"index_total":15681512,"index_time":"8.2h","index_time_in_millis":29584165,"index_current":0,"delete_total":557,"delete_time":"146ms","delete_time_in_millis":146,"delete_current":0},"get":{"total":0,"get_time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"open_contexts":6105,"query_total":282898695,"query_time":"66.4d","query_time_in_millis":5744468852,"query_current":484,"fetch_total":249240785,"fetch_time":"18.8h","fetch_time_in_millis":67714771,"fetch_current":2},"merges":{"current":4,"current_docs":82556,"current_size":"128.3mb","current_size_in_bytes":134562768,"total":34431,"total_time":"10.3h","total_time_in_millis":37171561,"total_docs":56851070,"total_size":"125gb","total_size_in_bytes":134244262079},"refresh":{"total":309679,"total_time":"7h","total_time_in_millis":25304564},"flush":{"total":3139,"total_time":"1.6h","total_time_in_millis":6069756},"warmer":{"current":0,"total":310146,"total_time":
"2.5m","total_time_in_millis":155394},"filter_cache":{"memory_size":"41.6gb","memory_size_in_bytes":44770240209,"evictions":4523230},"id_cache":{"memory_size":"0b","memory_size_in_bytes":0},"fielddata":{"memory_size":"2.6gb","memory_size_in_bytes":2858548980,"evictions":0},"completion":{"size":"1.1gb","size_in_bytes":1249036540}}},"users_v2":{"primaries":{"docs":{"count":13768352,"deleted":1926984},"store":{"size":"3.5gb","size_in_bytes":3863856116,"throttle_time":"3.3m","throttle_time_in_millis":203120},"indexing":{"index_total":12126337,"index_time":"58.8m","index_time_in_millis":3529500,"index_current":0,"delete_total":122,"delete_time":"23ms","delete_time_in_millis":23,"delete_current":0},"get":{"total":0,"get_time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"open_contexts":0,"query_total":49505,"query_time":"11.3m","query_time_in_millis":683294,"query_current":0,"fetch_total":12855,"fetch_time":"15.9s","fetch_time_in_millis":15965,"fetch_current":0},"merges":{"current":0,"current_docs":0,"current_size":"0b","current_size_in_bytes":0,"total":12910,"total_time":"1.7h","total_time_in_millis":6339303,"total_docs":89676879,"total_size":"19.9gb","total_size_in_bytes":21382659200},"refresh":{"total":112342,"total_time":"51.1m","total_time_in_millis":3070108},"flush":{"total":2387,"total_time":"52.9m","total_time_in_millis":3174688},"warmer":{"current":0,"total":112621,"total_time":"49.8s","total_time_in_millis":49869},"filter_cache":{"memory_size":"1.8mb","memory_size_in_bytes":1969144,"evictions":0},"id_cache":{"memory_size":"0b","memory_size_in_bytes":0},"fielddata":{"memory_size":"0b","memory_size_in_bytes":0,"evictions":0},"completion":{"size":"0b","size_in_bytes":0}},"total":{"docs":{"count":41305056,"deleted":6062661},"store":{"size":"10.8gb","size_in_bytes":11676116475,"throttle_time":"9.8m","throttle_time_in_millis":589900},"indexi
ng":{"index_total":35317673,"index_time":"2.7h","index_time_in_millis":10000876,"index_current":0,"delete_total":359,"delete_time":"74ms","delete_time_in_millis":74,"delete_current":0},"get":{"total":0,"get_time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"open_contexts":0,"query_total":144126,"query_time":"37m","query_time_in_millis":2221518,"query_current":0,"fetch_total":37364,"fetch_time":"48.2s","fetch_time_in_millis":48236,"fetch_current":0},"merges":{"current":0,"current_docs":0,"current_size":"0b","current_size_in_bytes":0,"total":37664,"total_time":"4.9h","total_time_in_millis":17691829,"total_docs":259164220,"total_size":"57.7gb","total_size_in_bytes":61998704752},"refresh":{"total":328196,"total_time":"2.5h","total_time_in_millis":9008485},"flush":{"total":6955,"total_time":"2.3h","total_time_in_millis":8322827},"warmer":{"current":0,"total":329084,"total_time":"2.5m","total_time_in_millis":152365},"filter_cache":{"memory_size":"5.6mb","memory_size_in_bytes":5947040,"evictions":0},"id_cache":{"memory_size":"0b","memory_size_in_bytes":0},"fielddata":{"memory_size":"0b","memory_size_in_bytes":0,"evictions":0},"completion":{"size":"0b","size_in_bytes":0}}}}}
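
The stats output above can be reduced to a rough average query time per index by dividing query_time_in_millis by query_total. The sketch below uses only the "total" search counters copied from that output. The function name is my own, and as far as I know query_total counts per-shard query phases rather than client requests, so treat the result as a per-shard figure.

```python
def avg_query_ms(stats):
    """Average query-phase time in milliseconds for each index.

    Reads indices -> <index> -> total -> search from an index-level
    stats response; an index with no queries yields 0.0.
    """
    result = {}
    for name, index_stats in stats.get("indices", {}).items():
        search = index_stats.get("total", {}).get("search", {})
        queries = search.get("query_total", 0)
        millis = search.get("query_time_in_millis", 0)
        result[name] = millis / queries if queries else 0.0
    return result


# Search counters taken from the "total" sections of the output above;
# all other fields are omitted for brevity.
posted_stats = {
    "indices": {
        "entries_v2": {"total": {"search": {
            "query_total": 282898695,
            "query_time_in_millis": 5744468852,
        }}},
        "users_v2": {"total": {"search": {
            "query_total": 144126,
            "query_time_in_millis": 2221518,
        }}},
    }
}

averages = avg_query_ms(posted_stats)
for name in sorted(averages):
    print(name, round(averages[name], 1), "ms per shard-level query")
```

These are cluster-wide averages since the nodes started, so they cannot single out the hot node on their own. The same division applied to per-node or per-shard counters is what would show whether one node is slower per query or simply receiving more of them.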

Results of _nodes?all:

https://gist.github.com/dodizzle/6733369

{"ok":true,"cluster_name":"prod","nodes":{"nXGqcVzQQvKrM7sQ1eoBMg":{"name":"search05.weheartit.com","transport_address":"inet[/10.84.30.30:9300]","hostname":"search05.weheartit.com","version":"0.90.5","http_address":"inet[/10.84.30.30:9200]","attributes":{"max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search05_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.84.30.30","node.data":"true","gateway.expected_nodes":"1","node.name":"search05.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.34,10.84.30.36,10.40.100.4","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search05.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":1066,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"62.9gb","total_in_bytes":67546271744},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":15153,"max_file_descriptors":64000},"jvm":{"pid":15153,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380231088892,"mem":{"heap_init":"30gb","heap_init_in_bytes":32212254720,"heap_max":"29.5gb","heap_max_in_bytes":31749898240,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"29.5gb","direct_max_in_bytes":31749898240}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.84.30.30","name":"bond0","mac_address":"00:25:90:58:8E:1C"}},"transport":{"bound_address":"inet[/10.84.30.30:9300]","publish_address":"inet[/10.84.30.30:9300]"},"http":{"bound_address":"inet[/10.84.30.30:9200]","publish_address":"inet[/10.84.30.30:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"HQ","description":"No description found for HQ.","url":"/_plugin/HQ/","jvm":false,"site":true},{"name":"bigdesk","description":"No description found for 
bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true}]},"CMgLvlagSqWs3MIUI9A08g":{"name":"search03.weheartit.com","transport_address":"inet[/10.84.30.56:9300]","hostname":"search03.weheartit.com","version":"0.90.5","http_address":"inet[/10.84.30.56:9200]","attributes":{"max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search03_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.84.30.56","node.data":"true","gateway.expected_nodes":"1","node.name":"search03.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.34,10.84.30.36,10.40.100.4","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search03.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":1868,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"62.9gb","total_in_bytes":67546271744},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":71441,"max_file_descriptors":64000},"jvm":{"pid":71441,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380228826034,"mem":{"heap_init":"16gb","heap_init_in_bytes":17179869184,"heap_max":"15.5gb","heap_max_in_bytes":16717512704,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"15.5gb","direct_max_in_bytes":16717512704}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.84.30.56","name":"bond0","mac_address":"00:25:90:58:3A:48"}},"transport":{"bound_address":"inet[/10.84.30.56:9300]","publish_address":"inet[/10.84.30.56:9300]"},"http":{"bound_address":"inet[/10.84.30.56:9200]","publish_address":"inet[/10.84.30.56:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"bigdesk","description":"No description found for bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true},{"name":"HQ","description":"No description found for 
HQ.","url":"/_plugin/HQ/","jvm":false,"site":true}]},"DlrsmFBHTeatLxf6OUrF2A":{"name":"search06.weheartit.com","transport_address":"inet[/10.84.30.58:9300]","hostname":"search06.weheartit.com","version":"0.90.5","http_address":"inet[/10.84.30.58:9200]","attributes":{"max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search06_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.84.30.58","node.data":"true","gateway.expected_nodes":"1","node.name":"search06.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.34,10.84.30.36,10.40.100.4","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search06.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":2001,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"62.9gb","total_in_bytes":67546271744},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":13115,"max_file_descriptors":64000},"jvm":{"pid":13115,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380234604143,"mem":{"heap_init":"30gb","heap_init_in_bytes":32212254720,"heap_max":"29.5gb","heap_max_in_bytes":31749898240,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"29.5gb","direct_max_in_bytes":31749898240}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.84.30.58","name":"bond0","mac_address":"00:25:90:58:8E:54"}},"transport":{"bound_address":"inet[/10.84.30.58:9300]","publish_address":"inet[/10.84.30.58:9300]"},"http":{"bound_address":"inet[/10.84.30.58:9200]","publish_address":"inet[/10.84.30.58:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"HQ","description":"No description found for HQ.","url":"/_plugin/HQ/","jvm":false,"site":true},{"name":"bigdesk","description":"No description found for 
bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true}]},"sQ1b9BxJQO2tes7Hd8H6ig":{"name":"search08.weheartit.com","transport_address":"inet[/10.84.30.34:9300]","hostname":"search08.weheartit.com","version":"0.90.5","http_address":"inet[/10.84.30.34:9200]","attributes":{"data":"false","max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search08_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.84.30.34","node.data":"false","gateway.expected_nodes":"1","node.name":"search08.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.34,10.84.30.36","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search08.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":2001,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"125.9gb","total_in_bytes":135191969792},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":50711,"max_file_descriptors":64000},"jvm":{"pid":50711,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380144562400,"mem":{"heap_init":"75.5gb","heap_init_in_bytes":81114693632,"heap_max":"75.1gb","heap_max_in_bytes":80653385728,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"75.1gb","direct_max_in_bytes":80653385728}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.84.30.34","name":"bond0","mac_address":"00:30:48:FF:B2:A4"}},"transport":{"bound_address":"inet[/10.84.30.34:9300]","publish_address":"inet[/10.84.30.34:9300]"},"http":{"bound_address":"inet[/10.84.30.34:9200]","publish_address":"inet[/10.84.30.34:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"HQ","description":"No description found for HQ.","url":"/_plugin/HQ/","jvm":false,"site":true},{"name":"bigdesk","description":"No description found for 
bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true}]},"J6ryKBGhRwW214a387Q4yw":{"name":"search01.weheartit.com","transport_address":"inet[/10.84.30.32:9300]","hostname":"search01.weheartit.com","version":"0.90.5","http_address":"inet[/10.84.30.32:9200]","attributes":{"max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search01_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.84.30.32","node.data":"true","gateway.expected_nodes":"1","node.name":"search01.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.34,10.84.30.36,10.40.100.4","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search01.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":2001,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"62.9gb","total_in_bytes":67546271744},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":50476,"max_file_descriptors":64000},"jvm":{"pid":50476,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380231431795,"mem":{"heap_init":"30gb","heap_init_in_bytes":32212254720,"heap_max":"29.5gb","heap_max_in_bytes":31749898240,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"29.5gb","direct_max_in_bytes":31749898240}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.84.30.32","name":"bond0","mac_address":"00:25:90:58:8E:8E"}},"transport":{"bound_address":"inet[/10.84.30.32:9300]","publish_address":"inet[/10.84.30.32:9300]"},"http":{"bound_address":"inet[/10.84.30.32:9200]","publish_address":"inet[/10.84.30.32:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"bigdesk","description":"No description found for bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true},{"name":"HQ","description":"No description found for 
HQ.","url":"/_plugin/HQ/","jvm":false,"site":true}]},"A7lo1q0bSE-uxkgamRWrRw":{"name":"search09.weheartit.com","transport_address":"inet[search09.whi/10.84.30.36:9300]","hostname":"search09.weheartit.com","version":"0.90.5","http_address":"inet[/10.84.30.36:9200]","attributes":{"data":"false","max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search09_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.84.30.36","node.data":"false","gateway.expected_nodes":"1","node.name":"search09.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.36","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search09.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":1066,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"125.9gb","total_in_bytes":135191969792},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":48365,"max_file_descriptors":64000},"jvm":{"pid":48365,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380142390878,"mem":{"heap_init":"75.5gb","heap_init_in_bytes":81114693632,"heap_max":"75.1gb","heap_max_in_bytes":80653385728,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"75.1gb","direct_max_in_bytes":80653385728}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.84.30.36","name":"bond0","mac_address":"00:30:48:FF:0C:F0"}},"transport":{"bound_address":"inet[/10.84.30.36:9300]","publish_address":"inet[search09.whi/10.84.30.36:9300]"},"http":{"bound_address":"inet[/10.84.30.36:9200]","publish_address":"inet[/10.84.30.36:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"paramedic","description":"No description found for paramedic.","url":"/_plugin/paramedic/","jvm":false,"site":true},{"name":"bigdesk","description":"No description found for bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true},{"name":"HQ","description":"No description found for 
HQ.","url":"/_plugin/HQ/","jvm":false,"site":true}]},"rjuQW1-BRaqpx_Xd0qkRog":{"name":"search10.weheartit.com","transport_address":"inet[/10.40.100.4:9300]","hostname":"search10.weheartit.com","version":"0.90.5","http_address":"inet[/10.40.100.4:9200]","attributes":{"max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search10_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.40.100.4","node.data":"true","gateway.expected_nodes":"1","node.name":"search10.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.34,10.84.30.36,10.40.100.4","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search10.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":2001,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"31.4gb","total_in_bytes":33723400192},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":34008,"max_file_descriptors":64000},"jvm":{"pid":34008,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380236522162,"mem":{"heap_init":"16gb","heap_init_in_bytes":17179869184,"heap_max":"15.5gb","heap_max_in_bytes":16717512704,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"15.5gb","direct_max_in_bytes":16717512704}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.40.100.4","name":"bond0","mac_address":"00:30:48:FF:AA:B8"}},"transport":{"bound_address":"inet[/10.40.100.4:9300]","publish_address":"inet[/10.40.100.4:9300]"},"http":{"bound_address":"inet[/10.40.100.4:9200]","publish_address":"inet[/10.40.100.4:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"HQ","description":"No description found for HQ.","url":"/_plugin/HQ/","jvm":false,"site":true},{"name":"bigdesk","description":"No description found for 
bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true}]},"E5sz8iDNRAiQPykqXcjv-w":{"name":"search07.weheartit.com","transport_address":"inet[/10.84.30.28:9300]","hostname":"search07.weheartit.com","version":"0.90.5","http_address":"inet[/10.84.30.28:9200]","attributes":{"max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search07_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.84.30.28","node.data":"true","gateway.expected_nodes":"1","node.name":"search07.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.34,10.84.30.36,10.40.100.4","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search07.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":2001,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"62.9gb","total_in_bytes":67546271744},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":14665,"max_file_descriptors":64000},"jvm":{"pid":14665,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380235624809,"mem":{"heap_init":"30gb","heap_init_in_bytes":32212254720,"heap_max":"29.5gb","heap_max_in_bytes":31749898240,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"29.5gb","direct_max_in_bytes":31749898240}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.84.30.28","name":"bond0","mac_address":"00:25:90:58:61:D6"}},"transport":{"bound_address":"inet[/10.84.30.28:9300]","publish_address":"inet[/10.84.30.28:9300]"},"http":{"bound_address":"inet[/10.84.30.28:9200]","publish_address":"inet[/10.84.30.28:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"HQ","description":"No description found for HQ.","url":"/_plugin/HQ/","jvm":false,"site":true},{"name":"bigdesk","description":"No description found for 
bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true}]},"BhLFmPTJRYSGh2LrecepMw":{"name":"search02.weheartit.com","transport_address":"inet[/10.84.30.10:9300]","hostname":"search02.weheartit.com","version":"0.90.5","http_address":"inet[/10.84.30.10:9200]","attributes":{"max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search02_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.84.30.10","node.data":"true","gateway.expected_nodes":"1","node.name":"search02.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.34,10.84.30.36,10.40.100.4","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search02.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":2001,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"62.9gb","total_in_bytes":67546271744},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":62858,"max_file_descriptors":64000},"jvm":{"pid":62858,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380232348621,"mem":{"heap_init":"30gb","heap_init_in_bytes":32212254720,"heap_max":"29.5gb","heap_max_in_bytes":31749898240,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"29.5gb","direct_max_in_bytes":31749898240}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.84.30.10","name":"bond0","mac_address":"00:30:48:FE:DD:DC"}},"transport":{"bound_address":"inet[/10.84.30.10:9300]","publish_address":"inet[/10.84.30.10:9300]"},"http":{"bound_address":"inet[/10.84.30.10:9200]","publish_address":"inet[/10.84.30.10:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"HQ","description":"No description found for HQ.","url":"/_plugin/HQ/","jvm":false,"site":true},{"name":"bigdesk","description":"No description found for 
bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true}]},"F1l8LWzIQjSjq4Pcb1X2QQ":{"name":"search04.weheartit.com","transport_address":"inet[/10.84.30.60:9300]","hostname":"search04.weheartit.com","version":"0.90.5","http_address":"inet[/10.84.30.60:9200]","attributes":{"max_local_storage_nodes":"1"},"settings":{"path.home":"//elasticsearch","pidfile":"/var/run/elasticsearch/search04_weheartit_com.pid","config":"/etc/elasticsearch/elasticsearch.yml","action.disable_delete_all_indices":"true","gateway.type":"local","node.max_local_storage_nodes":"1","bootstrap.mlockall":"true","path.data":"/var/data/elasticsearch","cluster.name":"prod","index.mapper.dynamic":"true","path.conf":"/etc/elasticsearch","discovery.zen.minimum_master_nodes":"1","network.host":"10.84.30.60","node.data":"true","gateway.expected_nodes":"1","node.name":"search04.weheartit.com","http.enabled":"true","path.logs":"/var/log/elasticsearch","discovery.zen.ping.unicast.hosts":"10.84.30.32,10.84.30.10,10.84.30.56,10.84.30.60,10.84.30.30,10.84.30.58,10.84.30.28,10.84.30.34,10.84.30.36,10.40.100.4","discovery.zen.ping.multicast.enabled":"false","action.auto_create_index":"true","name":"search04.weheartit.com"},"os":{"refresh_interval":1000,"available_processors":80,"cpu":{"vendor":"Intel","model":"Xeon","mhz":1999,"total_cores":80,"total_sockets":1,"cores_per_socket":64,"cache_size":"24kb","cache_size_in_bytes":24576},"mem":{"total":"62.9gb","total_in_bytes":67546271744},"swap":{"total":"975.9mb","total_in_bytes":1023406080}},"process":{"refresh_interval":1000,"id":16432,"max_file_descriptors":64000},"jvm":{"pid":16432,"version":"1.7.0_25","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"23.25-b01","vm_vendor":"Oracle 
Corporation","start_time":1380233486267,"mem":{"heap_init":"30gb","heap_init_in_bytes":32212254720,"heap_max":"29.5gb","heap_max_in_bytes":31749898240,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"29.5gb","direct_max_in_bytes":31749898240}},"thread_pool":{"generic":{"type":"cached","keep_alive":"30s"},"index":{"type":"fixed","min":32,"max":32},"get":{"type":"fixed","min":32,"max":32},"snapshot":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"merge":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"suggest":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"bulk":{"type":"fixed","min":32,"max":32},"optimize":{"type":"fixed","min":1,"max":1},"warmer":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"flush":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"search":{"type":"fixed","min":96,"max":96,"queue_size":"1k"},"percolate":{"type":"fixed","min":32,"max":32,"queue_size":"1k"},"management":{"type":"scaling","min":1,"max":5,"keep_alive":"5m"},"refresh":{"type":"scaling","min":1,"max":10,"keep_alive":"5m"}},"network":{"refresh_interval":5000,"primary_interface":{"address":"10.84.30.60","name":"bond0","mac_address":"00:25:90:58:BA:4A"}},"transport":{"bound_address":"inet[/10.84.30.60:9300]","publish_address":"inet[/10.84.30.60:9300]"},"http":{"bound_address":"inet[/10.84.30.60:9200]","publish_address":"inet[/10.84.30.60:9200]","max_content_length":"100mb","max_content_length_in_bytes":104857600},"plugins":[{"name":"HQ","description":"No description found for HQ.","url":"/_plugin/HQ/","jvm":false,"site":true},{"name":"bigdesk","description":"No description found for bigdesk.","url":"/_plugin/bigdesk/","jvm":false,"site":true}]}}}
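
The node info pasted above is hard to compare by eye. As a rough sketch (not something posted in the thread), the Python below reads saved `_nodes?all` output and lists RAM, max heap and the `node.data` setting per node. The field names are taken from the JSON above; the function name `summarize_nodes` and the trimmed `sample` are only illustrative.

```python
# Sketch: summarise RAM, heap and data role per node from `_nodes?all` output.
# Field names follow the JSON pasted above; the helper name is hypothetical.

def summarize_nodes(nodes_info):
    """Return one row per node (sorted by name) with RAM and max heap in GiB."""
    rows = []
    for node in nodes_info["nodes"].values():
        ram = node["os"]["mem"]["total_in_bytes"]
        heap = node["jvm"]["mem"]["heap_max_in_bytes"]
        rows.append({
            "name": node["name"],
            # settings values are strings in this output ("true" / "false")
            "data": node["settings"].get("node.data", "true") == "true",
            "ram_gib": round(ram / 2**30, 1),
            "heap_gib": round(heap / 2**30, 1),
        })
    return sorted(rows, key=lambda row: row["name"])


# Two entries trimmed from the dump above, for illustration only.
sample = {"nodes": {
    "A7lo1q0bSE-uxkgamRWrRw": {
        "name": "search09.weheartit.com",
        "settings": {"node.data": "false"},
        "os": {"mem": {"total_in_bytes": 135191969792}},
        "jvm": {"mem": {"heap_max_in_bytes": 80653385728}},
    },
    "rjuQW1-BRaqpx_Xd0qkRog": {
        "name": "search10.weheartit.com",
        "settings": {"node.data": "true"},
        "os": {"mem": {"total_in_bytes": 33723400192}},
        "jvm": {"mem": {"heap_max_in_bytes": 16717512704}},
    },
}}

rows = summarize_nodes(sample)
for row in rows:
    print(row)
```

Run against the full output, a listing like this makes it easier to see which data nodes have less RAM or heap than the others, which is one thing worth ruling out when a single node runs hot.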

I've disabled autobalancing and have manually moved shards to try to
even them out, but there are always one or two nodes whose CPU usage is
much higher than the rest.
BTW we are using routing with 2 servers (search08/09) running HTTP and
taking all the queries from haproxy.

On 9/27/13 11:42 AM, Boaz Leskes wrote:

Hmm, all the threads are busy searching. Are you running any heavy
queries? Are you sure the node gets the same amount of traffic as the
rest?

Besides the above, can you also share the output of

curl -s 'localhost:9200/_stats?all'
curl -s 'localhost:9200/_nodes?all'

These calls give cluster statistics plus the cluster topology. Perhaps
we can see something there.

Cheers,
Boaz

On Fri, Sep 27, 2013 at 7:00 PM, David O'Dell <daveo@weheartit.com> wrote:

Here are the results of the hot_threads.
https://gist.github.com/dodizzle/6731643

For memory use: the system has 64GB RAM, and the JVM settings are:

-Xms30720m -Xmx30720m -Xss256k -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError


We are running 0.90.5

Thanks for getting back to me.


Boaz_Leskes · September 27, 2013, 8:16pm

Hi David,

About routing: I meant the routing option of the index and search APIs, as
explained in the Elasticsearch documentation.

Good to see you can influence the CPU usage by moving shards around. That
means it is related to the content of the shards and not the machine
itself, which should make it easier to trace. Did you notice any correlation
between the presence of a specific shard and high CPU usage?

Can you also get these two (slightly different):

curl -XGET "http://localhost:9200/_nodes/stats?all"

curl -XGET "http://localhost:9200/_cluster/state"
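
To check the shard/CPU correlation asked about above, one possible approach is to count started shards per node and index from saved `_cluster/state` output. This is only a sketch: it assumes the response has a top-level `nodes` map and a `routing_nodes.nodes` map of shard entries with `state` and `index` fields, and the index name `images` and the node ids in the sample are made up.

```python
from collections import Counter


def shards_per_node(cluster_state):
    """Count STARTED shards per (node name, index) pair."""
    # Map node ids to readable names; fall back to the id if a name is missing.
    names = {nid: info["name"] for nid, info in cluster_state["nodes"].items()}
    counts = Counter()
    for nid, shards in cluster_state["routing_nodes"]["nodes"].items():
        for shard in shards:
            if shard["state"] == "STARTED":
                counts[(names.get(nid, nid), shard["index"])] += 1
    return counts


# Hypothetical, heavily trimmed cluster state for illustration.
sample_state = {
    "nodes": {
        "n1": {"name": "search01.weheartit.com"},
        "n2": {"name": "search02.weheartit.com"},
    },
    "routing_nodes": {"nodes": {
        "n1": [
            {"state": "STARTED", "index": "images", "shard": 0},
            {"state": "STARTED", "index": "images", "shard": 1},
        ],
        "n2": [
            {"state": "STARTED", "index": "images", "shard": 2},
            {"state": "RELOCATING", "index": "images", "shard": 3},
        ],
    }},
}

counts = shards_per_node(sample_state)
for (node, index), n in sorted(counts.items()):
    print(node, index, n)
```

Comparing this table before and after a manual shard move shows whether the high CPU follows a particular index or shard from node to node.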

Cheers,
Boaz


David_O_Dell · September 30, 2013, 5:51pm

Boaz, I just checked on routing and we are not using the routing option.

On Fri, Sep 27, 2013 at 8:50 PM, David O'Dell <daveo@weheartit.com> wrote:

results of stats all
https://gist.github.com/dodizzle/6733355

results of nodes all
https://gist.github.com/dodizzle/6733369


Boaz_Leskes · September 30, 2013, 7:38pm

OK. Good to know.

Did you manage to correlate the CPU usage with a specific index? Also, can
you post the output of those two extra APIs? (Please indicate which node had
the CPU load at the time.)

Cheers,
Boaz
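
One way to check whether the busy node really receives more search work than the others is to compare the search counters in the `_nodes/stats` output. The sketch below is only an illustration: it assumes the response has the shape nodes -> node id -> indices -> search with `query_total` and `query_time_in_millis` fields, and the sample data is made up rather than taken from the gists in this thread.

```python
def search_load_by_node(nodes_stats):
    """Return (node_name, query_total, query_time_ms) tuples, busiest first.

    Assumes the response shape nodes -> <node_id> -> indices -> search.
    """
    rows = []
    for node_id, node in nodes_stats.get("nodes", {}).items():
        search = node.get("indices", {}).get("search", {})
        rows.append((
            node.get("name", node_id),
            search.get("query_total", 0),
            search.get("query_time_in_millis", 0),
        ))
    # Sort by total query time: a few heavy queries can matter more
    # than the raw number of queries.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical sample, not real output from the cluster discussed here.
sample = {
    "nodes": {
        "a1": {"name": "search01", "indices": {"search": {
            "query_total": 1000, "query_time_in_millis": 40000}}},
        "b2": {"name": "search02", "indices": {"search": {
            "query_total": 1100, "query_time_in_millis": 250000}}},
    }
}

for name, total, millis in search_load_by_node(sample):
    print(name, total, millis)
```

These counters are cumulative since each node started, so taking two snapshots a few minutes apart and comparing the differences is likely to be more telling than a single reading.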

On Mon, Sep 30, 2013 at 7:51 PM, David O'Dell daveo@weheartit.com wrote:

Boaz, I just checked on routing and we are not using the routing option.

On 9/27/13 1:16 PM, Boaz Leskes wrote:

Hi David,

About routing: I meant the routing option of the index and search APIs,
as explained in the Elasticsearch reference documentation.

Good to see you can influence the CPU usage by moving shards around. That
means it is related to the content of the shards and not the machine
itself, which should make it easier to trace. Did you notice any correlation
between the presence of a specific shard and high CPU usage?

Can you also get these two (slightly different):

curl -XGET "http://localhost:9200/_nodes/stats?all"

curl -XGET "http://localhost:9200/_cluster/state"

Cheers,
Boaz
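
To see whether one node simply holds more shards (or more primaries) than the others, the `_cluster/state` output can be summarised per node. This is a minimal sketch, assuming the state contains a `routing_nodes.nodes` map from node id to a list of shard entries with `state` and `primary` fields, plus a `nodes` map with a `name` per node id. The sample document is invented for illustration.

```python
from collections import defaultdict


def shard_counts_by_node(cluster_state):
    """Count started primary and replica shards per node name."""
    names = {
        node_id: info.get("name", node_id)
        for node_id, info in cluster_state.get("nodes", {}).items()
    }
    counts = defaultdict(lambda: {"primary": 0, "replica": 0})
    routing = cluster_state.get("routing_nodes", {}).get("nodes", {})
    for node_id, shards in routing.items():
        for shard in shards:
            if shard.get("state") != "STARTED":
                continue  # count only fully started copies
            kind = "primary" if shard.get("primary") else "replica"
            counts[names.get(node_id, node_id)][kind] += 1
    return dict(counts)


# Hypothetical sample, not the real cluster state from this thread.
sample = {
    "nodes": {"a1": {"name": "search01"}, "b2": {"name": "search02"}},
    "routing_nodes": {"nodes": {
        "a1": [
            {"state": "STARTED", "primary": True, "index": "idx", "shard": 0},
        ],
        "b2": [
            {"state": "STARTED", "primary": True, "index": "idx", "shard": 1},
            {"state": "STARTED", "primary": False, "index": "idx", "shard": 0},
            {"state": "INITIALIZING", "primary": False, "index": "idx", "shard": 2},
        ],
    }},
}

print(shard_counts_by_node(sample))
```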

On Fri, Sep 27, 2013 at 8:50 PM, David O'Dell daveo@weheartit.com wrote:

results of stats all
stats all · GitHub

results of nodes all
nodes all · GitHub

I've disabled autobalancing and have manually moved shards to try and
even them out, but there are always one or two nodes whose CPU usage is much
higher than the rest.
BTW we are using routing with 2 servers (search08/09) running http and
taking all the queries from haproxy.

On 9/27/13 11:42 AM, Boaz Leskes wrote:

Hmm, all the threads are busy searching. Are you running any heavy
queries? Are you sure the node gets the same amount of traffic as the
rest?

Besides the above, can you also share the output of

curl -s localhost:9200/_stats?all

curl -s localhost:9200/_nodes?all

These calls give cluster statistics plus the cluster topology. Perhaps
we can see something there.

Cheers,
Boaz
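
The observation that "all the threads are busy searching" can also be checked mechanically by adding up the CPU percentages in the hot threads text per thread pool. The sketch below assumes header lines of the form `95.2% (476.1ms out of 500ms) cpu usage by thread 'elasticsearch[node][search][T#7]'`; the exact format may differ between versions, and the sample text is invented.

```python
import re
from collections import Counter

# Header line of one hot thread entry (assumed format).
LINE_RE = re.compile(r"^\s*([\d.]+)% \(.*?\) cpu usage by thread '(.+)'")
# Thread pool name, taken from the bracket just before [T#n].
POOL_RE = re.compile(r"\[([^\[\]]+)\]\[T#\d+\]")


def cpu_by_thread_pool(hot_threads_text):
    """Sum the reported CPU percentages per thread pool name."""
    totals = Counter()
    for line in hot_threads_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # stack frames and other non-header lines
        percent = float(match.group(1))
        pool = POOL_RE.search(match.group(2))
        totals[pool.group(1) if pool else "other"] += percent
    return dict(totals)


# Hypothetical sample, not the gist posted in this thread.
sample = """
   99.1% (495.5ms out of 500ms) cpu usage by thread 'elasticsearch[search02][search][T#3]'
     10/10 snapshots sharing following 2 elements
   97.4% (487.0ms out of 500ms) cpu usage by thread 'elasticsearch[search02][search][T#8]'
    1.5% (7.5ms out of 500ms) cpu usage by thread 'elasticsearch[search02][refresh][T#1]'
"""

print(cpu_by_thread_pool(sample))
```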

On Fri, Sep 27, 2013 at 7:00 PM, David O'Dell daveo@weheartit.com wrote:

Here are the results of hot_threads.
hot threads search02 · GitHub

As for memory, the system has 64GB RAM and the JVM settings are:

-Xms30720m -Xmx30720m -Xss256k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-XX:+HeapDumpOnOutOfMemoryError

We are running 0.90.5

Thanks for getting back to me.



otisg · September 30, 2013, 9:53pm

Hi David,

SPM for Elasticsearch (Elasticsearch Monitoring) should let
you see all this stuff visually, as time series, so you can see changes
over time.

Otis


David_O_Dell · October 3, 2013, 9:36pm

Result:
I manually moved shards around, attempting to get an even mix of primary
and secondary shards per box.
Now I have a semi-even load on my data nodes.
The unfortunate part was that it took manual intervention, and I was
moving shards by guesswork because I had no information about the number of
queries per shard.
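
For anyone hitting the same problem: a possible way to get query counts per shard is the indices stats API at shard level (`/_stats?level=shards`). The sketch below assumes the response shape indices -> index name -> shards -> shard id -> list of shard copies, each with a `routing` block and a `search.query_total` counter. I have not confirmed that 0.90.5 returns exactly this shape, and the sample data is made up.

```python
def search_counts_per_shard(indices_stats):
    """List every shard copy with its query_total, busiest first."""
    rows = []
    for index_name, index in indices_stats.get("indices", {}).items():
        for shard_id, copies in index.get("shards", {}).items():
            for copy in copies:
                routing = copy.get("routing", {})
                rows.append({
                    "index": index_name,
                    "shard": int(shard_id),
                    "node": routing.get("node"),
                    "primary": bool(routing.get("primary")),
                    "query_total": copy.get("search", {}).get("query_total", 0),
                })
    return sorted(rows, key=lambda row: row["query_total"], reverse=True)


# Hypothetical sample, not real output from the cluster discussed here.
sample = {"indices": {"idx": {"shards": {
    "0": [
        {"routing": {"node": "a1", "primary": True},
         "search": {"query_total": 120}},
        {"routing": {"node": "b2", "primary": False},
         "search": {"query_total": 900}},
    ],
    "1": [
        {"routing": {"node": "b2", "primary": True},
         "search": {"query_total": 450}},
    ],
}}}}

for row in search_counts_per_shard(sample):
    print(row)
```

The counters are cumulative, so the difference between two snapshots shows where queries are landing right now, which would give a better basis for moving shards than guessing.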


otisg · October 15, 2013, 5:11am

David,

Check the attachment. That's how you can see which shard is on which host
and how big it is.

Otis


otisg · October 15, 2013, 5:14am

Hi,

Ah, I see you were asking about the number of queries on each shard, not
the number of docs. I won't mark up a new screenshot, but if you look at the
one I sent, you'll see a tab labeled "Search" where you can get this info.

Otis

