Dec 6th, 2019 [EN] Understanding the `jvm.options` file in Elasticsearch

When configuring Elasticsearch, you will come across a documentation page about setting JVM options. In this post we will look at how to configure these options and how the mechanism is implemented internally.

Important: Apart from the heap size, you should not change any parameter in this file unless you really know what you are doing, meaning you understand how the JVM works internally and how a change of JVM options will affect Elasticsearch.

Configuring the JVM

Certain JVM settings cannot be changed at runtime, such as the maximum amount of memory that a Java process may use. Such settings need a dedicated configuration that is evaluated before elasticsearch.yml is even read.

This is where the jvm.options file comes in. It is a little more advanced than a plain list of flags, though. Each line that is not empty and does not start with a # is considered a JVM option. In addition, you can configure different options for different major versions of Java by prefixing a line with a single version (8:), a closed range (8-13:) or an open-ended range (14-:).

You may ask yourself why this is so important, but keep in mind that the Java ecosystem keeps moving as well. A very recent example is the removal of the CMS garbage collector from the upcoming Java 14 release (this happened in this commit). As CMS used to be the default Elasticsearch garbage collector, a different one is needed from Java 14 onwards. This was added in a separate PR, so Elasticsearch now uses CMS up to Java 13 and G1GC from then onwards.

## GC configuration
8-13:-XX:+UseConcMarkSweepGC
8-13:-XX:CMSInitiatingOccupancyFraction=75
8-13:-XX:+UseCMSInitiatingOccupancyOnly

14-:-XX:+UseG1GC
14-:-XX:G1ReservePercent=25
14-:-XX:InitiatingHeapOccupancyPercent=30

In addition, JVM options themselves change over time, which has to be taken into account as well. Compare the GC logging settings for Java 8 with those for all later versions:

## JDK 8 GC logging
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:${loggc}
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=${loggc}:utctime,pid,tags:filecount=32,filesize=64m

Implementation details

Let's dive a little into the implementation details to understand how this works. The parsing of this file is custom to Elasticsearch: it has to extract the arguments that apply to the JVM that is about to run, and then start that JVM with those arguments.

The main class to look at is the JvmOptionsParser class. It is responsible for selecting only the lines that match the Java version in use, along with the options that have no Java version set.
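To make the version matching more concrete, here is a minimal sketch of how such a filter could look. It is not the actual JvmOptionsParser code: the class name JvmOptionsLineFilter, the method select() and the regular expression are assumptions made for illustration, based only on the line formats shown in the excerpts above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Elasticsearch.
public class JvmOptionsLineFilter {

    // Optional version prefix ("8:", "14-:", "8-13:") followed by an option starting with '-'.
    private static final Pattern LINE =
        Pattern.compile("((?<start>\\d+)(?<range>-)?(?<end>\\d+)?:)?(?<option>-.*)");

    /** Returns the options from the given lines that apply to the given Java major version. */
    public static List<String> select(int javaMajorVersion, List<String> lines) {
        List<String> selected = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            // Skip blank lines and comments.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("Cannot parse line: " + raw);
            }
            String start = m.group("start");
            if (start == null) {
                // No version prefix: the option applies to every Java version.
                selected.add(m.group("option"));
                continue;
            }
            int lower = Integer.parseInt(start);
            int upper;
            if (m.group("end") != null) {
                upper = Integer.parseInt(m.group("end")); // closed range, e.g. "8-13:"
            } else if (m.group("range") != null) {
                upper = Integer.MAX_VALUE;                // open-ended range, e.g. "14-:"
            } else {
                upper = lower;                            // single version, e.g. "8:"
            }
            if (javaMajorVersion >= lower && javaMajorVersion <= upper) {
                selected.add(m.group("option"));
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "## GC configuration",
            "8-13:-XX:+UseConcMarkSweepGC",
            "14-:-XX:+UseG1GC",
            "",
            "-Xss1m");
        System.out.println(select(11, lines));
        System.out.println(select(14, lines));
    }
}
```

With the sample lines in main(), Java 11 gets the CMS flag and Java 14 gets the G1GC flag, while -Xss1m is selected in both cases because it carries no version prefix.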

When is this called, you might ask? Whenever you start Elasticsearch, a dedicated Java process is started first, which parses the options file and returns the parsed options as a single-line string. See the Elasticsearch startup script for Linux and macOS.

This also means that the JvmOptionsParser process does not run with the amount of memory you have configured for the Elasticsearch process, but with much less.

There is a good reason why you should not hardcode additional options in the jvm.options file, even if the value you set is the same as the JVM default: Elasticsearch derives some options on its own.

Inside JvmOptionsParser, there is a small list of so-called ergonomic JVM options, which is populated like this:

final List<String> ergonomicJvmOptions = JvmErgonomics.choose(substitutedJvmOptions);

This code currently configures the maximum direct memory size based on the configured heap size. For more information, see JvmErgonomics.
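The log output further down shows -Xmx1g next to -XX:MaxDirectMemorySize=536870912, which is half of the heap. The following sketch reproduces that relationship. It is an illustration only: the class DirectMemoryErgonomics and its methods are hypothetical, and it assumes that the heap size is given as an explicit -Xmx option, which may not be how JvmErgonomics determines it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Elasticsearch.
public class DirectMemoryErgonomics {

    /** Parses a JVM size value such as "1g", "512m" or "1048576" into bytes. */
    public static long parseSize(String value) {
        char unit = Character.toLowerCase(value.charAt(value.length() - 1));
        if (Character.isDigit(unit)) {
            return Long.parseLong(value); // plain number of bytes
        }
        long number = Long.parseLong(value.substring(0, value.length() - 1));
        switch (unit) {
            case 'k': return number << 10;
            case 'm': return number << 20;
            case 'g': return number << 30;
            default: throw new IllegalArgumentException("Unknown size unit in: " + value);
        }
    }

    /** Derives additional options from the given ones; returns an empty list if nothing applies. */
    public static List<String> deriveOptions(List<String> jvmOptions) {
        Long maxHeap = null;
        boolean directMemorySet = false;
        for (String option : jvmOptions) {
            if (option.startsWith("-Xmx")) {
                maxHeap = parseSize(option.substring("-Xmx".length()));
            } else if (option.startsWith("-XX:MaxDirectMemorySize=")) {
                directMemorySet = true;
            }
        }
        List<String> derived = new ArrayList<>();
        // Assumption: only derive a value when the user has not set one explicitly.
        if (maxHeap != null && !directMemorySet) {
            derived.add("-XX:MaxDirectMemorySize=" + (maxHeap / 2));
        }
        return derived;
    }

    public static void main(String[] args) {
        // prints [-XX:MaxDirectMemorySize=536870912]
        System.out.println(deriveOptions(List.of("-Xms1g", "-Xmx1g")));
    }
}
```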

Lastly, there is another line of code looking like this

final List<String> systemJvmOptions = SystemJvmOptions.systemJvmOptions();

The SystemJvmOptions class contains a list of default command line options for which sane default values have been picked. Common examples are setting the default file encoding to UTF-8 (-Dfile.encoding=UTF-8) or always pre-touching the whole heap at startup (-XX:+AlwaysPreTouch). You can override those as well, but it is highly unlikely that you will ever need to.

One last feature of the options parser is the ability to replace placeholders with real values. There is a substitutePlaceholders() method, which replaces the placeholder in a line like

-Djava.io.tmpdir=${ES_TMPDIR}

with the current value of the corresponding environment variable.
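A substitution like this can be sketched in a few lines. The class PlaceholderSubstitution and its substitute() method below are hypothetical and only illustrate the idea. The actual substitutePlaceholders() method may differ, for example in how it treats unknown placeholders.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Elasticsearch.
public class PlaceholderSubstitution {

    // Matches placeholders of the form ${NAME}.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Replaces every ${NAME} in the option with the value stored under NAME. */
    public static String substitute(String option, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(option);
        StringBuffer result = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                // Assumption: an unknown placeholder is treated as an error.
                throw new IllegalArgumentException("No value for placeholder " + m.group());
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(result, Matcher.quoteReplacement(value));
        }
        m.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("ES_TMPDIR", "/tmp/elasticsearch-123"); // made-up example value
        // prints -Djava.io.tmpdir=/tmp/elasticsearch-123
        System.out.println(substitute("-Djava.io.tmpdir=${ES_TMPDIR}", values));
    }
}
```

In a real launcher the map would be filled from the environment, for instance via System.getenv().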

Checking for applied JVM options in a running instance

First and foremost, the options are written to the log file on every startup, like this:

[2019-11-28T12:18:14,447][INFO ][o.e.n.Node               ] [rhincodon] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dio.netty.allocator.numDirectArenas=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/var/folders/xx/111r6kc974z_tmqc70rymxkm0000gn/T/elasticsearch-456895883997808587, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -Dio.netty.allocator.type=unpooled, -XX:MaxDirectMemorySize=536870912, -Des.path.home=/Users/alr/Downloads/stacks/7.4.1/elasticsearch-7.4.1, -Des.path.conf=/Users/alr/Downloads/stacks/7.4.1/elasticsearch-7.4.1/config, -Des.distribution.flavor=default, -Des.distribution.type=tar, -Des.bundled_jdk=true]

Second, you can use jps to find the process ID and then ps to check the arguments on the command line:

$ jps
50138 Jps
49676 Elasticsearch

$ ps wwwp 49676
  PID   TT  STAT      TIME COMMAND
49676 s000  S+     0:58.62 /Users/alr/.sdkman/candidates/java/current/bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/var/folders/xx/111r6kc974z_tmqc70rymxkm0000gn/T/elasticsearch-456895883997808587 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Djava.locale.providers=COMPAT -Dio.netty.allocator.type=unpooled -XX:MaxDirectMemorySize=536870912 -Des.path.home=/Users/alr/Downloads/stacks/7.4.2/elasticsearch-7.4.2 -Des.path.conf=/Users/alr/Downloads/stacks/7.4.2/elasticsearch-7.4.2/config -Des.distribution.flavor=default -Des.distribution.type=tar -Des.bundled_jdk=true -cp /Users/alr/Downloads/stacks/7.4.2/elasticsearch-7.4.2/lib/* org.elasticsearch.bootstrap.Elasticsearch

Third, you can also use the nodes info API:

GET _nodes/jvm?human

This returns the static JVM configuration, which also contains an input_arguments field that looks like this:

"input_arguments": [
  "-Xms1g",
  "-Xmx1g",
  "-XX:+UseConcMarkSweepGC",
  "-XX:CMSInitiatingOccupancyFraction=75",
  "-XX:+UseCMSInitiatingOccupancyOnly",
  "-Des.networkaddress.cache.ttl=60",
  "-Des.networkaddress.cache.negative.ttl=10",
  "-XX:+AlwaysPreTouch",
  "-Xss1m",
  "-Djava.awt.headless=true",
  "-Dfile.encoding=UTF-8",
  "-Djna.nosys=true",
  "-XX:-OmitStackTraceInFastThrow",
  "-Dio.netty.noUnsafe=true",
  "-Dio.netty.noKeySetOptimization=true",
  "-Dio.netty.recycler.maxCapacityPerThread=0",
  "-Dio.netty.allocator.numDirectArenas=0",
  "-Dlog4j.shutdownHookEnabled=false",
  "-Dlog4j2.disable.jmx=true",
  "-Djava.io.tmpdir=/var/folders/xx/111r6kc974z_tmqc70rymxkm0000gn/T/elasticsearch-456895883997808587",
  "-XX:+HeapDumpOnOutOfMemoryError",
  "-XX:HeapDumpPath=data",
  "-XX:ErrorFile=logs/hs_err_pid%p.log",
  "-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m",
  "-Djava.locale.providers=COMPAT",
  "-Dio.netty.allocator.type=unpooled",
  "-XX:MaxDirectMemorySize=536870912",
  "-Des.path.home=/Users/alr/Downloads/stacks/7.4.2/elasticsearch-7.4.2",
  "-Des.path.conf=/Users/alr/Downloads/stacks/7.4.2/elasticsearch-7.4.2/config",
  "-Des.distribution.flavor=default",
  "-Des.distribution.type=tar",
  "-Des.bundled_jdk=true"
]

Note that this startup did not set any custom options in the jvm.options file; these are indeed all the options set by Elasticsearch when starting up.
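If you want to see where such a list can come from, the JVM exposes its own startup arguments through the standard java.lang.management API. The small program below prints them for whatever JVM it runs in. The class name PrintInputArguments is made up for this example, and it is an assumption that the input_arguments field is populated from this API.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical example class, not part of Elasticsearch.
public class PrintInputArguments {

    /** Returns the arguments passed to this JVM, excluding the arguments to main(). */
    public static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        for (String argument : inputArguments()) {
            System.out.println(argument);
        }
    }
}
```

Starting it with java -Xmx64m -Dfile.encoding=UTF-8 PrintInputArguments should list those two flags, one per line.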

It's a wrap!

There is not much more to learn about parsing JVM options: a small dedicated process parses the options that apply to the JDK version in use, and its output is used to start the Elasticsearch process. You also learned how to check whether your custom JVM options are applied properly.

See you next time!
