Out of memory on cluster with almost no data

Wojciech_Durczynski · September 26, 2011, 1:40pm

Hello Shay.

I have an out-of-memory problem in ES 0.17.5.
My cluster contains two data nodes and a lot of short-lived "test nodes"
that don't store any data and only connect to the cluster to run some
operations on it.
Every "test node" creates its own index with some types and a complex mapping,
and cleans this index before shutdown.
After a while my cluster breaks: one of the data nodes throws an OOM. At that
point it contains ~100 empty indices.
I analyzed a heap dump of the broken node, and almost all of the memory is used
by the thread "elasticsearch[Manikin]clusterService#updateTask-pool-11-thread-1",
in the variable "workQueue", which is of type LinkedBlockingQueue and contains
1798 items of type org.elasticsearch.cluster.service.InternalClusterService$2,
~640kB each,
where updateTask.newState.metaData.indices has size ~491kB and
updateTask.newState.routingTable.indicesRouting has size ~102kB.
If each update task contains information about all indices and their mappings,
then its size is OK, but why are there so many update tasks in this queue?
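
For scale, the heap-dump figures above can be multiplied out. The sketch below is only that arithmetic; the class and method names are hypothetical, and it assumes every queued task retains its own ~640kB copy of the state with nothing shared between tasks, which a heap dump's retained sizes do not guarantee.

```java
// Hypothetical helper, not part of Elasticsearch: multiplies out the heap-dump figures.
final class QueueHeapEstimate {

    /** Total kB retained if every queued task holds its own copy of the state. */
    static long totalKilobytes(long queuedTasks, long kilobytesPerTask) {
        return queuedTasks * kilobytesPerTask;
    }

    public static void main(String[] args) {
        // 1798 queued items of ~640kB each, as reported in the heap dump.
        long totalKb = totalKilobytes(1798, 640);
        System.out.println(totalKb + " kB, about " + (totalKb / 1024) + " MB");
    }
}
```

Under that assumption the queue alone would account for roughly 1.1 GB, which is consistent with the queue dominating the heap of the broken node.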

kimchy · September 26, 2011, 5:30pm

What exactly do you do on each "test node" that connects? Also, is that a
client node that connects to the cluster?

The thread mentioned is the one responsible for applying cluster-wide
changes on the master node. For example, when an index is created, it
updates the data structures associated with it, and the shard placements
(not actually moving / placing shards, just changing the in-memory data
structures representing them).
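
As a rough illustration of why a single update thread can accumulate memory, here is a minimal sketch of one worker thread fed by an unbounded LinkedBlockingQueue. It is not the actual InternalClusterService code: the class name, method name, and the byte array standing in for a cluster state are all assumptions. It only shows that tasks submitted while the worker is busy stay in the queue, and that each one keeps its captured state reachable until it runs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration; not Elasticsearch code.
final class UpdateTaskBacklogSketch {

    /** Submits `tasks` updates while the single worker is stalled; returns the backlog size. */
    static int backlogWhileStalled(int tasks, int stateBytes) {
        LinkedBlockingQueue<Runnable> workQueue = new LinkedBlockingQueue<>(); // unbounded
        ThreadPoolExecutor executor =
                new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, workQueue);
        CountDownLatch release = new CountDownLatch(1);
        AtomicLong appliedBytes = new AtomicLong();
        try {
            // The first task is handed straight to the worker thread and stalls it.
            executor.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            for (int i = 0; i < tasks; i++) {
                byte[] newState = new byte[stateBytes]; // stand-in for a full cluster state
                // The queued task keeps newState reachable until it has run.
                executor.execute(() -> appliedBytes.addAndGet(newState.length));
            }
            return workQueue.size();
        } finally {
            // Unblock the worker, let it drain the queue, then stop the executor.
            release.countDown();
            executor.shutdown();
            try {
                executor.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("queued tasks: " + backlogWhileStalled(100, 64 * 1024));
    }
}
```

Because the queue has no capacity limit, nothing pushes back on producers: if updates arrive faster than the one thread applies them, the backlog and the states it references grow until the heap runs out.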

Wojciech_Durczynski · September 27, 2011, 7:31am

"Test nodes" are of course client nodes.
They connect to the cluster and create one or two indices with about 10 types.
Then they index documents there (5000 at most), execute some queries, delete
the data in the created indices, and disconnect.
After ~100 such runs the cluster dies with an OOM.

kimchy · September 27, 2011, 8:49am

How do they delete the data? Do they delete the content, or the indices?

Wojciech_Durczynski · September 27, 2011, 9:52am

They delete content only.

kimchy · September 27, 2011, 8:59pm

Can you post the code you use, or the step-by-step actions you perform?
