What's using memory in ElasticSearch? (Details to follow...)


(Adam Georgiou) #1

I'm curious as to what is actually using memory in my cluster.

Over the course of a day or two we slowly progress to about 75% heap
usage, at which point we get stuck in a garbage collection loop (i.e.,
apparently there's nothing to free). However, the slow creep has me baffled.
It seems to me that the nature of our cluster should make it such that we
either have enough memory to handle our load or we don't.

I've been looking into the issue using the _nodes/stats endpoint and
can't exactly get the numbers to align in order to make an educated guess
as to where my problem might lie.

For example, at the moment, on one of three nodes, we happen to be using:

  • ~50% of our heap space, which comes out to 15.9703 gb of used heap.
  • 'segments' are at 0.685781 gb
  • 'fielddata' is at 0.0926561 gb
  • 'filter_cache' is at 0.176308 gb.
  • Total data is at around 12.6 terabytes.

The other two nodes are very similar.

These are the data items I have in my head as relevant, but obviously they
don't add up to even close to the heap size. Does anyone see what I'm
missing? Does anyone have a guess as to why memory would slowly creep up?
Seems to me our memory usage should be relative to the size of our data and
the frequency of our queries. However, while we are indexing in real time,
our memory growth doesn't seem linearly proportional to the total amount of
data (i.e., if 12 TB were too much for our hardware, I'd expect to see the
cluster die much closer to startup, not 1-3 days after startup); and while
we are also querying in real time, I don't see how this could affect memory
so slowly, as the rate of our queries is pretty much constant across the
course of a day.
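
To put numbers on that gap, the figures for the first node in the dump below (newsdataa29) can be summed in a few lines. This is only a rough sketch in Python: the variable names are made up for illustration, and it assumes that segments, fielddata and filter_cache are the only heap consumers that _nodes/stats reports with a size for this node.

```python
# Values copied from the _nodes/stats output for node "newsdataa29".
GIB = 1024 ** 3

heap_used = 17147996128  # jvm.mem.heap_used_in_bytes

# The heap consumers that the stats output reports with a size.
reported = {
    "segments": 736351751,      # indices.segments.memory_in_bytes
    "fielddata": 99488720,      # indices.fielddata.memory_size_in_bytes
    "filter_cache": 189309489,  # indices.filter_cache.memory_size_in_bytes
}

accounted = sum(reported.values())
unaccounted = heap_used - accounted
accounted_fraction = accounted / heap_used

print("heap used:   %.2f GiB" % (heap_used / GIB))
print("accounted:   %.2f GiB (%.1f%% of used heap)"
      % (accounted / GIB, 100 * accounted_fraction))
print("unaccounted: %.2f GiB" % (unaccounted / GIB))
```

On these figures the three items explain only about 6% of the used heap, which leaves roughly 15 GiB that the stats output does not break down.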

Any ideas?

Running on Java version 1.7.0_55 (OpenJDK)

Full dump of the _nodes/stats output:

{
"cluster_name" : "some_cluster",
"nodes" : {
"eVinxIXhRmG8ZaI1pjJRRA" : {
"timestamp" : 1401816162553,
"name" : "newsdataa29",
"transport_address" : "inet[/164.55.96.188:9300]",
"host" : "newsdataa29",
"ip" : [ "inet[/164.55.96.188:9300]", "NONE" ],
"attributes" : {
"max_local_storage_nodes" : "1",
"master" : "true"
},
"indices" : {
"docs" : {
"count" : 14896392,
"deleted" : 1957354
},
"store" : {
"size_in_bytes" : 593188367560,
"throttle_time_in_millis" : 424927
},
"indexing" : {
"index_total" : 41194,
"index_time_in_millis" : 743544,
"index_current" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0
},
"get" : {
"total" : 193630,
"time_in_millis" : 18639,
"exists_total" : 192744,
"exists_time_in_millis" : 18567,
"missing_total" : 886,
"missing_time_in_millis" : 72,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 9666,
"query_time_in_millis" : 2491171,
"query_current" : 0,
"fetch_total" : 1149,
"fetch_time_in_millis" : 63100,
"fetch_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 1588,
"total_time_in_millis" : 626274,
"total_docs" : 463596,
"total_size_in_bytes" : 6263961105
},
"refresh" : {
"total" : 14459,
"total_time_in_millis" : 186342
},
"flush" : {
"total" : 183,
"total_time_in_millis" : 3338
},
"warmer" : {
"current" : 0,
"total" : 16490,
"total_time_in_millis" : 7715
},
"filter_cache" : {
"memory_size_in_bytes" : 189309489,
"evictions" : 0
},
"id_cache" : {
"memory_size_in_bytes" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 99488720,
"evictions" : 0
},
"percolate" : {
"total" : 5113,
"time_in_millis" : 735585,
"current" : 0,
"memory_size_in_bytes" : -1,
"memory_size" : "-1b",
"queries" : 518706
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 2462,
"memory_in_bytes" : 736351751
},
"translog" : {
"operations" : 974,
"size_in_bytes" : 0
}
},
"os" : {
"timestamp" : 1401816162607,
"uptime_in_millis" : 4494603,
"load_average" : [ 0.12, 0.19, 0.21 ],
"cpu" : {
"sys" : 0,
"user" : 0,
"idle" : 99,
"usage" : 0,
"stolen" : 0
},
"mem" : {
"free_in_bytes" : 1461686272,
"used_in_bytes" : 66043281408,
"free_percent" : 47,
"used_percent" : 52,
"actual_free_in_bytes" : 32198017024,
"actual_used_in_bytes" : 35306950656
},
"swap" : {
"used_in_bytes" : 288641024,
"free_in_bytes" : 1808502784
}
},
"process" : {
"timestamp" : 1401816162608,
"open_file_descriptors" : 687,
"cpu" : {
"percent" : 0,
"sys_in_millis" : 148090,
"user_in_millis" : 3837300,
"total_in_millis" : 3985390
},
"mem" : {
"resident_in_bytes" : 36110393344,
"share_in_bytes" : 2666815488,
"total_virtual_in_bytes" : 643081678848
}
},
"jvm" : {
"timestamp" : 1401816162608,
"uptime_in_millis" : 15236927,
"mem" : {
"heap_used_in_bytes" : 17147996128,
"heap_used_percent" : 53,
"heap_committed_in_bytes" : 32011649024,
"heap_max_in_bytes" : 32011649024,
"non_heap_used_in_bytes" : 62292832,
"non_heap_committed_in_bytes" : 92192768,
"pools" : {
"young" : {
"used_in_bytes" : 933064416,
"max_in_bytes" : 1605304320,
"peak_used_in_bytes" : 1605304320,
"peak_max_in_bytes" : 1605304320
},
"survivor" : {
"used_in_bytes" : 110211016,
"max_in_bytes" : 200605696,
"peak_used_in_bytes" : 200605696,
"peak_max_in_bytes" : 200605696
},
"old" : {
"used_in_bytes" : 16104720696,
"max_in_bytes" : 30205739008,
"peak_used_in_bytes" : 22759852448,
"peak_max_in_bytes" : 30205739008
}
}
},
"threads" : {
"count" : 414,
"peak_count" : 418
},
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 984,
"collection_time_in_millis" : 52609
},
"old" : {
"collection_count" : 2,
"collection_time_in_millis" : 31494
}
}
},
"buffer_pools" : {
"direct" : {
"count" : 461,
"used_in_bytes" : 94253910,
"total_capacity_in_bytes" : 94253910
},
"mapped" : {
"count" : 10411,
"used_in_bytes" : 592575795266,
"total_capacity_in_bytes" : 592575795266
}
}
},
"thread_pool" : {
"generic" : {
"threads" : 2,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 14,
"completed" : 20474
},
"index" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 46688
},
"get" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 148
},
"snapshot" : {
"threads" : 5,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 5,
"completed" : 14728
},
"merge" : {
"threads" : 5,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 5,
"completed" : 39031
},
"suggest" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"bulk" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 243
},
"optimize" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"warmer" : {
"threads" : 3,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 3,
"completed" : 16490
},
"flush" : {
"threads" : 1,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 2,
"completed" : 183
},
"search" : {
"threads" : 96,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 96,
"completed" : 10816
},
"percolate" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 5113
},
"management" : {
"threads" : 5,
"queue" : 0,
"active" : 1,
"rejected" : 0,
"largest" : 5,
"completed" : 32577
},
"refresh" : {
"threads" : 8,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 8,
"completed" : 14420
}
},
"network" : {
"tcp" : {
"active_opens" : 184270,
"passive_opens" : 7400580,
"curr_estab" : 115,
"in_segs" : 462492148,
"out_segs" : 341861048,
"retrans_segs" : 16346,
"estab_resets" : 1341,
"attempt_fails" : 72,
"in_errs" : 21,
"out_rsts" : 57676
}
},
"fs" : {
"timestamp" : 1401816162608,
"total" : {
"total_in_bytes" : 1731616178176,
"free_in_bytes" : 1136018505728,
"available_in_bytes" : 1048082395136
},
"data" : [ {
"path" : "somedirectory",
"mount" : "somedirectory",
"dev" : "somedirectory",
"total_in_bytes" : 1731616178176,
"free_in_bytes" : 1136018505728,
"available_in_bytes" : 1048082395136
} ]
},
"transport" : {
"server_open" : 39,
"rx_count" : 178775,
"rx_size_in_bytes" : 699521746,
"tx_count" : 175506,
"tx_size_in_bytes" : 589101063
},
"http" : {
"current_open" : 0,
"total_opened" : 0
},
"fielddata_breaker" : {
"maximum_size_in_bytes" : 25609319219,
"maximum_size" : "23.8gb",
"estimated_size_in_bytes" : 99488720,
"estimated_size" : "94.8mb",
"overhead" : 1.03
}
},
"eL4pkScqTfa_7OUE6MWszA" : {
"timestamp" : 1401816162553,
"name" : "newsdataa30",
"transport_address" : "inet[/164.55.96.189:9300]",
"host" : "newsdataa30",
"ip" : [ "inet[/164.55.96.189:9300]", "NONE" ],
"attributes" : {
"max_local_storage_nodes" : "1",
"master" : "true"
},
"indices" : {
"docs" : {
"count" : 15355700,
"deleted" : 2194810
},
"store" : {
"size_in_bytes" : 630611669017,
"throttle_time_in_millis" : 243569
},
"indexing" : {
"index_total" : 42160,
"index_time_in_millis" : 640546,
"index_current" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0
},
"get" : {
"total" : 425528,
"time_in_millis" : 31415,
"exists_total" : 424809,
"exists_time_in_millis" : 31388,
"missing_total" : 719,
"missing_time_in_millis" : 27,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 9147,
"query_time_in_millis" : 2560550,
"query_current" : 0,
"fetch_total" : 1143,
"fetch_time_in_millis" : 58306,
"fetch_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 1392,
"total_time_in_millis" : 373704,
"total_docs" : 325316,
"total_size_in_bytes" : 3561830482
},
"refresh" : {
"total" : 12959,
"total_time_in_millis" : 139769
},
"flush" : {
"total" : 179,
"total_time_in_millis" : 2505
},
"warmer" : {
"current" : 0,
"total" : 14888,
"total_time_in_millis" : 7423
},
"filter_cache" : {
"memory_size_in_bytes" : 182403055,
"evictions" : 0
},
"id_cache" : {
"memory_size_in_bytes" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 111598640,
"evictions" : 0
},
"percolate" : {
"total" : 6817,
"time_in_millis" : 1126769,
"current" : 0,
"memory_size_in_bytes" : -1,
"memory_size" : "-1b",
"queries" : 691375
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 2680,
"memory_in_bytes" : 761091350
},
"translog" : {
"operations" : 922,
"size_in_bytes" : 0
}
},
"os" : {
"timestamp" : 1401816162638,
"uptime_in_millis" : 4494521,
"load_average" : [ 0.52, 0.31, 0.23 ],
"cpu" : {
"sys" : 0,
"user" : 0,
"idle" : 98,
"usage" : 0,
"stolen" : 0
},
"mem" : {
"free_in_bytes" : 1023049728,
"used_in_bytes" : 66481917952,
"free_percent" : 47,
"used_percent" : 52,
"actual_free_in_bytes" : 32155111424,
"actual_used_in_bytes" : 35349856256
},
"swap" : {
"used_in_bytes" : 280563712,
"free_in_bytes" : 1816580096
}
},
"process" : {
"timestamp" : 1401816162638,
"open_file_descriptors" : 697,
"cpu" : {
"percent" : 0,
"sys_in_millis" : 138590,
"user_in_millis" : 4471520,
"total_in_millis" : 4610110
},
"mem" : {
"resident_in_bytes" : 36342210560,
"share_in_bytes" : 2912124928,
"total_virtual_in_bytes" : 680464547840
}
},
"jvm" : {
"timestamp" : 1401816162639,
"uptime_in_millis" : 15236713,
"mem" : {
"heap_used_in_bytes" : 21984266408,
"heap_used_percent" : 68,
"heap_committed_in_bytes" : 32011649024,
"heap_max_in_bytes" : 32011649024,
"non_heap_used_in_bytes" : 62779736,
"non_heap_committed_in_bytes" : 92708864,
"pools" : {
"young" : {
"used_in_bytes" : 1164777768,
"max_in_bytes" : 1605304320,
"peak_used_in_bytes" : 1605304320,
"peak_max_in_bytes" : 1605304320
},
"survivor" : {
"used_in_bytes" : 115874928,
"max_in_bytes" : 200605696,
"peak_used_in_bytes" : 200605696,
"peak_max_in_bytes" : 200605696
},
"old" : {
"used_in_bytes" : 20703613712,
"max_in_bytes" : 30205739008,
"peak_used_in_bytes" : 25235641144,
"peak_max_in_bytes" : 30205739008
}
}
},
"threads" : {
"count" : 417,
"peak_count" : 419
},
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 1220,
"collection_time_in_millis" : 60254
},
"old" : {
"collection_count" : 4,
"collection_time_in_millis" : 2923
}
}
},
"buffer_pools" : {
"direct" : {
"count" : 501,
"used_in_bytes" : 96409873,
"total_capacity_in_bytes" : 96409873
},
"mapped" : {
"count" : 10879,
"used_in_bytes" : 629950077811,
"total_capacity_in_bytes" : 629950077811
}
}
},
"thread_pool" : {
"generic" : {
"threads" : 1,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 15,
"completed" : 20553
},
"index" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 48962
},
"get" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 221
},
"snapshot" : {
"threads" : 5,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 5,
"completed" : 13376
},
"merge" : {
"threads" : 5,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 5,
"completed" : 39881
},
"suggest" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"bulk" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 320
},
"optimize" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"warmer" : {
"threads" : 4,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 4,
"completed" : 14888
},
"flush" : {
"threads" : 1,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 1,
"completed" : 179
},
"search" : {
"threads" : 96,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 96,
"completed" : 10291
},
"percolate" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 6817
},
"management" : {
"threads" : 5,
"queue" : 0,
"active" : 1,
"rejected" : 0,
"largest" : 5,
"completed" : 32526
},
"refresh" : {
"threads" : 9,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 9,
"completed" : 12923
}
},
"network" : {
"tcp" : {
"active_opens" : 187257,
"passive_opens" : 7405296,
"curr_estab" : 125,
"in_segs" : 444968606,
"out_segs" : 336592894,
"retrans_segs" : 31766,
"estab_resets" : 1347,
"attempt_fails" : 94,
"in_errs" : 26,
"out_rsts" : 56295
}
},
"fs" : {
"timestamp" : 1401816162639,
"total" : {
"total_in_bytes" : 1731616178176,
"free_in_bytes" : 1098595561472,
"available_in_bytes" : 1010659450880
},
"data" : [ {
"path" : "somedirectory",
"mount" : "somedirectory",
"dev" : "somedirectory",
"total_in_bytes" : 1731616178176,
"free_in_bytes" : 1098595561472,
"available_in_bytes" : 1010659450880
} ]
},
"transport" : {
"server_open" : 39,
"rx_count" : 187263,
"rx_size_in_bytes" : 725020029,
"tx_count" : 184359,
"tx_size_in_bytes" : 476329370
},
"http" : {
"current_open" : 0,
"total_opened" : 0
},
"fielddata_breaker" : {
"maximum_size_in_bytes" : 25609319219,
"maximum_size" : "23.8gb",
"estimated_size_in_bytes" : 111598640,
"estimated_size" : "106.4mb",
"overhead" : 1.03
}
},
"V8RVx-3PQE2i6x4Jz-KyjQ" : {
"timestamp" : 1401816162553,
"name" : "newsdataa28",
"transport_address" : "inet[newsdataa28/164.55.96.187:9300]",
"host" : "newsdataa28",
"ip" : [ "inet[newsdataa28/164.55.96.187:9300]", "NONE" ],
"attributes" : {
"max_local_storage_nodes" : "1",
"master" : "true"
},
"indices" : {
"docs" : {
"count" : 15620937,
"deleted" : 2080647
},
"store" : {
"size_in_bytes" : 661959225415,
"throttle_time_in_millis" : 376994
},
"indexing" : {
"index_total" : 36962,
"index_time_in_millis" : 712752,
"index_current" : 0,
"delete_total" : 0,
"delete_time_in_millis" : 0,
"delete_current" : 0
},
"get" : {
"total" : 211415,
"time_in_millis" : 16067,
"exists_total" : 210827,
"exists_time_in_millis" : 16035,
"missing_total" : 588,
"missing_time_in_millis" : 32,
"current" : 0
},
"search" : {
"open_contexts" : 0,
"query_total" : 9211,
"query_time_in_millis" : 2462277,
"query_current" : 0,
"fetch_total" : 1125,
"fetch_time_in_millis" : 72893,
"fetch_current" : 0
},
"merges" : {
"current" : 0,
"current_docs" : 0,
"current_size_in_bytes" : 0,
"total" : 1367,
"total_time_in_millis" : 550372,
"total_docs" : 378105,
"total_size_in_bytes" : 5630682978
},
"refresh" : {
"total" : 12317,
"total_time_in_millis" : 188853
},
"flush" : {
"total" : 170,
"total_time_in_millis" : 2947
},
"warmer" : {
"current" : 0,
"total" : 14126,
"total_time_in_millis" : 6972
},
"filter_cache" : {
"memory_size_in_bytes" : 183982668,
"evictions" : 0
},
"id_cache" : {
"memory_size_in_bytes" : 0
},
"fielddata" : {
"memory_size_in_bytes" : 107389383,
"evictions" : 0
},
"percolate" : {
"total" : 5120,
"time_in_millis" : 1225884,
"current" : 0,
"memory_size_in_bytes" : -1,
"memory_size" : "-1b",
"queries" : 518255
},
"completion" : {
"size_in_bytes" : 0
},
"segments" : {
"count" : 1865,
"memory_in_bytes" : 788006138
},
"translog" : {
"operations" : 823,
"size_in_bytes" : 0
}
},
"os" : {
"timestamp" : 1401816162605,
"uptime_in_millis" : 4496476,
"load_average" : [ 0.09, 0.13, 0.17 ],
"cpu" : {
"sys" : 0,
"user" : 0,
"idle" : 98,
"usage" : 0,
"stolen" : 0
},
"mem" : {
"free_in_bytes" : 2885521408,
"used_in_bytes" : 64619446272,
"free_percent" : 47,
"used_percent" : 52,
"actual_free_in_bytes" : 32229896192,
"actual_used_in_bytes" : 35275071488
},
"swap" : {
"used_in_bytes" : 311009280,
"free_in_bytes" : 1786134528
}
},
"process" : {
"timestamp" : 1401816162605,
"open_file_descriptors" : 696,
"cpu" : {
"percent" : 0,
"sys_in_millis" : 168280,
"user_in_millis" : 4273640,
"total_in_millis" : 4441920
},
"mem" : {
"resident_in_bytes" : 36114485248,
"share_in_bytes" : 2677698560,
"total_virtual_in_bytes" : 711765032960
}
},
"jvm" : {
"timestamp" : 1401816162606,
"uptime_in_millis" : 15237177,
"mem" : {
"heap_used_in_bytes" : 17395618416,
"heap_used_percent" : 54,
"heap_committed_in_bytes" : 32011649024,
"heap_max_in_bytes" : 32011649024,
"non_heap_used_in_bytes" : 62692400,
"non_heap_committed_in_bytes" : 92835840,
"pools" : {
"young" : {
"used_in_bytes" : 829437384,
"max_in_bytes" : 1605304320,
"peak_used_in_bytes" : 1605304320,
"peak_max_in_bytes" : 1605304320
},
"survivor" : {
"used_in_bytes" : 86595544,
"max_in_bytes" : 200605696,
"peak_used_in_bytes" : 200605696,
"peak_max_in_bytes" : 200605696
},
"old" : {
"used_in_bytes" : 16479585488,
"max_in_bytes" : 30205739008,
"peak_used_in_bytes" : 23011882528,
"peak_max_in_bytes" : 30205739008
}
}
},
"threads" : {
"count" : 415,
"peak_count" : 418
},
"gc" : {
"collectors" : {
"young" : {
"collection_count" : 1269,
"collection_time_in_millis" : 56592
},
"old" : {
"collection_count" : 1,
"collection_time_in_millis" : 370
}
}
},
"buffer_pools" : {
"direct" : {
"count" : 519,
"used_in_bytes" : 101124103,
"total_capacity_in_bytes" : 101124103
},
"mapped" : {
"count" : 9738,
"used_in_bytes" : 661268222992,
"total_capacity_in_bytes" : 661268222992
}
}
},
"thread_pool" : {
"generic" : {
"threads" : 2,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 13,
"completed" : 23343
},
"index" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 42212
},
"get" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 151
},
"snapshot" : {
"threads" : 5,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 5,
"completed" : 12569
},
"merge" : {
"threads" : 5,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 5,
"completed" : 42057
},
"suggest" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"bulk" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 247
},
"optimize" : {
"threads" : 0,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 0,
"completed" : 0
},
"warmer" : {
"threads" : 3,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 3,
"completed" : 14126
},
"flush" : {
"threads" : 1,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 2,
"completed" : 170
},
"search" : {
"threads" : 96,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 96,
"completed" : 10336
},
"percolate" : {
"threads" : 32,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 32,
"completed" : 5120
},
"management" : {
"threads" : 5,
"queue" : 0,
"active" : 1,
"rejected" : 0,
"largest" : 5,
"completed" : 33230
},
"refresh" : {
"threads" : 9,
"queue" : 0,
"active" : 0,
"rejected" : 0,
"largest" : 9,
"completed" : 12289
}
},
"network" : {
"tcp" : {
"active_opens" : 200151,
"passive_opens" : 7522949,
"curr_estab" : 125,
"in_segs" : 508598169,
"out_segs" : 397943897,
"retrans_segs" : 75727,
"estab_resets" : 3559,
"attempt_fails" : 135,
"in_errs" : 24,
"out_rsts" : 60541
}
},
"fs" : {
"timestamp" : 1401816162606,
"total" : {
"total_in_bytes" : 1731616178176,
"free_in_bytes" : 1067249401856,
"available_in_bytes" : 979313291264
},
"data" : [ {
"path" : "somedirectory",
"mount" : "somedirectory",
"dev" : "somedirectory",
"total_in_bytes" : 1731616178176,
"free_in_bytes" : 1067249401856,
"available_in_bytes" : 979313291264
} ]
},
"transport" : {
"server_open" : 39,
"rx_count" : 222658,
"rx_size_in_bytes" : 511559857,
"tx_count" : 221087,
"tx_size_in_bytes" : 870671199
},
"http" : {
"current_open" : 0,
"total_opened" : 0
},
"fielddata_breaker" : {
"maximum_size_in_bytes" : 25609319219,
"maximum_size" : "23.8gb",
"estimated_size_in_bytes" : 107389383,
"estimated_size" : "102.4mb",
"overhead" : 1.03
}
}
}
}
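
A dump this long is hard to compare node by node, so here is a hedged sketch of how the heap-related fields could be pulled out of a parsed _nodes/stats response. The function name and the choice of fields are illustrative assumptions, not part of any Elasticsearch client; the key paths are the ones visible in the output above.

```python
def summarize_nodes(stats):
    """Return one summary dict per node from a parsed _nodes/stats response."""
    rows = []
    for node in stats["nodes"].values():
        mem = node["jvm"]["mem"]
        old_pool = mem["pools"]["old"]
        old_gc = node["jvm"]["gc"]["collectors"]["old"]
        rows.append({
            "name": node["name"],
            "heap_used_percent": mem["heap_used_percent"],
            # Share of the old generation that is currently occupied.
            "old_gen_percent": round(
                100.0 * old_pool["used_in_bytes"] / old_pool["max_in_bytes"], 1),
            "old_gc_count": old_gc["collection_count"],
            "old_gc_millis": old_gc["collection_time_in_millis"],
            "segment_count": node["indices"]["segments"]["count"],
        })
    return sorted(rows, key=lambda row: row["name"])


# Trimmed-down sample with values copied from node "newsdataa29" above.
sample = {"nodes": {"eVinxIXhRmG8ZaI1pjJRRA": {
    "name": "newsdataa29",
    "indices": {"segments": {"count": 2462, "memory_in_bytes": 736351751}},
    "jvm": {
        "mem": {
            "heap_used_percent": 53,
            "pools": {"old": {"used_in_bytes": 16104720696,
                              "max_in_bytes": 30205739008}},
        },
        "gc": {"collectors": {"old": {"collection_count": 2,
                                      "collection_time_in_millis": 31494}}},
    },
}}}

for row in summarize_nodes(sample):
    print(row)
```

With the full response saved to a file, the same function could be fed the result of json.load; sampling it periodically would show whether the old generation keeps growing between collections.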

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a524088e-b78f-45cd-b2e7-ed98ac682bf3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Jörg Prante) #2

What ES version is this?

Your segment count is very high (>1000), which is not efficient.

Maybe index.codec.bloom.load: false can help reduce heap memory usage.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-codec.html

Jörg
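
For reference, a minimal sketch of how that setting might be applied to an existing index through the update settings API. It assumes a node reachable at localhost:9200 and an index called old_index (both placeholders), and that index.codec.bloom.load can be changed on a live index in this version, which the codec page linked above should confirm. The helper name is hypothetical, and the snippet only builds the request; nothing is sent unless urlopen is called.

```python
import json
import urllib.request


def bloom_load_request(index, load, host="http://localhost:9200"):
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index.codec.bloom.load": load}).encode("utf-8")
    return urllib.request.Request(
        url="%s/%s/_settings" % (host, index),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = bloom_load_request("old_index", False)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually apply it: urllib.request.urlopen(req)
```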



(Adam Georgiou) #3

Thanks for the response, Jörg.

The version is 1.1.0, and I'll take a look at that bloom filter setting.

-Adam

On Tuesday, June 3, 2014 3:48:37 PM UTC-4, Jörg Prante wrote:

What ES version is this?

Your segment count is very high (>1000) which is not efficient.

Maybe index.codec.bloom.load: false can help reducing heap mem usage.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-codec.html

Jörg



(jegansp) #4

Is this related to the following issue? https://github.com/elasticsearch/elasticsearch/issues/5779

We are also facing this issue. Our version is also 1.1.0. After running fine for a few days, the cluster slowly starts having memory issues. At some point it spends almost 90% of its time in garbage collection, with full GCs each taking more than 15 seconds. Tuning JVM parameters didn't help us either.

We have a two-node cluster with 120+ indexes. In our case the segment count is too high (one node has 16000+ and the other has 25000+).

The optimize API call doesn't return either, as indicated here: https://groups.google.com/forum/#!topic/elasticsearch/kqTRRADQBwc

I am going to try these settings to see if the situation improves:

https://gist.github.com/jprante/10666960

But I'm not sure whether this will merge existing segments.


(Mark Walkom) #5

Heap use is very dependent on your setup.

How big are your indexes?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 5 June 2014 15:30, jegansp jegansp@gmail.com wrote:

Is this related to
https://github.com/elasticsearch/elasticsearch/issues/5779

We are also facing this issue. Our version is also 1.1.0. After running
fine for few days, the cluster slowly starts facing memory issues. At some
point of time it spends almost 90% of the time in garbage collection with
full GCs with each on taking more than 15 secs. Tuning jvm parameters didn't
help us either.

We have a two node cluster with 120+ indexes. In our case segments count is
too high (one node has 16000+ and other has 25000+).

Optimizing indexes API doesn't return either as indicated here
https://groups.google.com/forum/#!topic/elasticsearch/kqTRRADQBwc

I am going to try with these settings to see if the situation improves.

https://gist.github.com/jprante/10666960

But, not sure if this will merge existing segments.



(Jörg Prante) #6

No, the settings will not merge existing segments unless you call the
_optimize action via the API.

And have some patience. Thousands of segments take time; also, they need
quite a few memory resources to merge...

I suggest backing up your data first, to stay safe if the merging fails /
aborts...

Jörg



(jegansp) #7

Thanks for your replies, Mark and Jörg.

My index size is around 500GB.

After applying the settings (provided in my last post), the segment count initially came down to around 200 (without any optimize calls), but after some time it started increasing again and now stands at 2000+ on each node. Is this expected?

I will certainly try the optimize API on each index. Are there any other settings I need to be aware of?

Thanks,

Jegan


(Jörg Prante) #8

Maybe the segment count is just counting new segments as they are
created... can you look into the data folders to examine if the segment
file count is still high?

And can you verify if the settings are really active... not sure what's
going on without seeing details.

The _optimize call takes a parameter max_num_segments, you should maybe
start with 50 or so if you have 1000s of segments. When _optimize runs, you
can check the number of active segment merge threads in the nodes info for
monitoring progress (or use a monitoring tool)

Jörg
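
As a rough illustration of the two steps above, the sketch below builds the URL for an _optimize call with max_num_segments and reads the active merge thread count out of a parsed stats response. The host, the index name my_index and both function names are placeholders. It also assumes that the counter meant above is the thread_pool.merge.active field as it appears in the _nodes/stats dump in the first post.

```python
from urllib.parse import quote, urlencode


def optimize_url(index, max_num_segments, host="http://localhost:9200"):
    """URL for POST /<index>/_optimize with a max_num_segments target."""
    query = urlencode({"max_num_segments": max_num_segments})
    return "%s/%s/_optimize?%s" % (host, quote(index, safe=""), query)


def active_merge_threads(stats):
    """Map node name -> active merge threads from a parsed stats response."""
    return {
        node["name"]: node["thread_pool"]["merge"]["active"]
        for node in stats["nodes"].values()
    }


print(optimize_url("my_index", 50))

# Shape taken from the _nodes/stats dump earlier in the thread.
sample = {"nodes": {"eVinxIXhRmG8ZaI1pjJRRA": {
    "name": "newsdataa29",
    "thread_pool": {"merge": {"threads": 5, "active": 0}},
}}}
print(active_merge_threads(sample))
```

Starting with a generous target such as 50 and lowering it in later calls keeps each merge pass smaller, which fits the advice above about merges needing memory of their own.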

On Fri, Jun 6, 2014 at 4:08 PM, jegansp jegansp@gmail.com wrote:

Thanks for your replies Mark and Jorg.

My index size is around 500GB.

After using the settings (provided in my last post) initially the segments
count came down to just around 200 (without doing optimize calls), but
after some time it started increasing and now it stands around 2000+ in
each node. Is this expected?

I would surely try with the optimise APIs for each index. Is there any
other settings I need to be aware of?

Thanks,

Jegan


