I am trying to understand the Flush API and some JVM-related issues I am seeing.
The Flush API documentation in the guide says:
The flush API allows to flush one or more indices through an API. The flush
process of an index basically frees memory from the index by flushing data
to the index storage and clearing the internal transaction log. By default,
ElasticSearch uses memory heuristics in order to automatically trigger
flush operations as required in order to clear memory.
Does this mean that if I make a Flush request manually, everything related to
the transaction log becomes eligible for garbage collection?
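For concreteness, the kind of manual flush request I mean looks something like
this (myindex is just a placeholder for the real index name):

    # flush a single index
    curl -XPOST 'http://localhost:9200/myindex/_flush'

    # or flush every index on the cluster
    curl -XPOST 'http://localhost:9200/_flush'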
So here is what I am noticing, and it is giving me a lot of trouble. I have two
nodes with 32 GB RAM, out of which half is allocated to ES. I am just building
something out, so I wanted to see how fast I can write to ES; at the moment I am
only indexing into ES and not querying at all. After a couple of hours, I saw
heap usage at about 97%, and GC was taking a really long time to run (on the
order of many seconds) and was running frequently without freeing much heap for
reuse. Then I stopped indexing and did nothing else. I manually flushed the
index and waited for GC to free up some memory. Sadly, that is not what I
observed.
Since I am new to ES, and after having read whatever I could so far, I am not
able to understand what else ES might be using the heap for, especially when I
am not indexing anything, not using the nodes for querying either, and have
manually flushed the index using the Flush API. What else might be causing such
high heap usage?
This concerns me because the write speed drops drastically in such situations.
The smallest part of a shard is a segment, and Lucene caches data at that
level, which is likely what you are seeing residing in your heap. ES does
aggressively cache data so that queries are as fast as possible. (This is
obviously dependent on your data set size.)
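If you want to see how much heap the segments themselves are holding, you can
ask the cluster directly. A minimal sketch, assuming a node reachable on
localhost:9200 and a version that has the cat APIs (1.0+); myindex is a
placeholder:

    # per-segment listing, including the memory each segment occupies on the heap
    curl -XGET 'http://localhost:9200/_cat/segments?v'

    # detailed segment information for one index
    curl -XGET 'http://localhost:9200/myindex/_segments?pretty'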
You do not mention the ES version, the heap size you use, or the data volume
you handle when indexing, so it is not easy to help.
Anyway, from the situation you describe, what you observe does not have much to
do with flush or the translog. Most probably it is segment merging. After a few
hours of constant indexing your segments grow larger and larger, and reloading
segments for merging allocates heap.
Note that the default maximum merged segment size is 5 GB. This means segments
may grow up to that size and be loaded into the heap for merging. In bad cases
a merge can take a long time, long enough for a node to drop out of the cluster
because it can no longer respond to the heartbeat signal from the other nodes.
You should try streamlining your indexing by choosing a smaller maximum segment
size. Example:
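A minimal sketch, assuming an index named myindex and a version where
index.merge.policy.max_merged_segment can be updated on a live index (otherwise
set it when the index is created):

    # lower the maximum merged segment size from the default 5gb to 2gb
    curl -XPUT 'http://localhost:9200/myindex/_settings' -d '
    {
      "index.merge.policy.max_merged_segment": "2gb"
    }'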
You can also try experimenting with the number of shards per node. The more
shards, the longer it takes before segments get big, but more shards also mean
more resource consumption per node.
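The number of primary shards can only be chosen when the index is created, so
this is an up-front decision. A sketch, again with placeholder names and values:

    # create the index with more primary shards so each shard (and its segments) stays smaller
    curl -XPUT 'http://localhost:9200/myindex' -d '
    {
      "settings": {
        "number_of_shards": 10,
        "number_of_replicas": 1
      }
    }'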
Before tuning, it would be handy to know what is in the heap and how large it
is. You can use the monitoring APIs to gather more information while the heap
fills up during indexing. There might also be log entries about slow garbage
collections.
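The node stats API is a reasonable starting point for that. A sketch; the
metric-path filtering assumes a 1.0+ node:

    # JVM heap usage, GC counts and timings, plus segment memory, per node
    curl -XGET 'http://localhost:9200/_nodes/stats/jvm,indices?pretty'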
=> You can also try experimenting with the number of shards per node. The more
shards, the longer it takes before segments get big, but more shards also mean
more resource consumption per node.
What is the resource consumption per node? Is there a good indicator (e.g. an
API or monitoring tool)?