Is it required to set a Metaspace limit for Elasticsearch?

Hi,
My configuration:

 ES: 6.3.2
 Java : 1.8.0_171
 CPU : 8
 RAM : 16 GB

Java Settings:

-Xms12g 
-Xmx12g 
-XX:MaxMetaspaceSize=1g 
-XX:+UseConcMarkSweepGC 
-XX:CMSInitiatingOccupancyFraction=75 
-XX:+UseCMSInitiatingOccupancyOnly 
-XX:+AlwaysPreTouch 
-Xss1m 
-Djava.awt.headless=true 
-XX:NumberOfGCLogFiles=32 
-XX:GCLogFileSize=64 

My question is: do I need to set MaxMetaspaceSize, or should I leave it unlimited?
Also, what is the optimal Xmx setting for a 16 GB machine?

I also have two data directories of 1 TB each attached to every machine. What is the recommended number of data directories, and the maximum size of each, for my machine size?

Will updating Java to 9 or 10 help?

I'm pretty new to ES. Please advise.

Other than heap sizing, we do not recommend or support changing any other JVM settings.

As for heap sizing, we recommend allocating no more than 50% of system memory to the heap, as long as that stays below 32 GB. See https://www.elastic.co/guide/en/elasticsearch/reference/6.4/heap-size.html
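
For your 16 GB machine that works out to an 8 GB heap. A minimal sketch of the relevant jvm.options lines, assuming the stock 6.x defaults for everything else:

# jvm.options (sketch): only the heap settings are changed
-Xms8g
-Xmx8g
# no -XX:MaxMetaspaceSize line: Metaspace is left at the JVM default (unbounded)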

@warkolm appreciate the help.

Will updating Java to 9 or 10 help with performance?

And can I increase the size or count of the data directories?

What problems are you trying to solve?

The Java version typically has limited (if any) impact on performance, as far as I know.

You can specify multiple data paths in the elasticsearch.yml file.
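
For example, in elasticsearch.yml (a sketch; the mount points below are hypothetical):

path.data: /mnt/data0,/mnt/data1

or, equivalently, as a YAML list:

path.data:
  - /mnt/data0
  - /mnt/data1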

@warkolm @Christian_Dahlqvist

Issue 1:
I have already configured two data paths, which are EBS (AWS) volumes of 1 TB each. My data is filling up, so should I add one more volume to the data paths, increase the volume sizes, or add one more node? I know all of these options are possible, but I need to know what is best for my instance size (8 CPU, 16 GB RAM).

Issue 2:
I am having issues where one of my nodes suddenly hits high CPU/memory utilisation, so I am looking into tuning my ES cluster. We use ES to store logs from our apps, which are queried through Kibana.

Can you show us the full output of the cluster stats API? This provides a good overview of the state of your cluster.
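
If you have not run it before, it is a single GET against any node (assuming the default localhost:9200 here):

curl -s 'http://localhost:9200/_cluster/stats?human&pretty'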

@Christian_Dahlqvist I stumbled upon a couple of blog posts like this. That's why I asked about Java 9.

Cluster Stats

{
  "_nodes": {
    "total": 15,
    "successful": 15,
    "failed": 0
  },
  "cluster_name": "prod_elasticsearch",
  "timestamp": 1539930277930,
  "status": "green",
  "indices": {
    "count": 1501,
    "shards": {
      "total": 11622,
      "primaries": 3930,
      "replication": 1.9572519083969466,
      "index": {
        "shards": {
          "min": 2,
          "max": 15,
          "avg": 7.742838107928048
        },
        "primaries": {
          "min": 1,
          "max": 5,
          "avg": 2.6182544970019985
        },
        "replication": {
          "min": 1,
          "max": 2,
          "avg": 1.9733510992671552
        }
      }
    },
    "docs": {
      "count": 4769623441,
      "deleted": 1883381
    },
    "store": {
      "size": "17.9tb",
      "size_in_bytes": 19715747774942
    },
    "fielddata": {
      "memory_size": "1gb",
      "memory_size_in_bytes": 1121331232,
      "evictions": 0
    },
    "query_cache": {
      "memory_size": "200.9mb",
      "memory_size_in_bytes": 210762480,
      "total_count": 17495237,
      "hit_count": 2351503,
      "miss_count": 15143734,
      "cache_size": 23100,
      "cache_count": 48882,
      "evictions": 25782
    },
    "completion": {
      "size": "0b",
      "size_in_bytes": 0
    },
    "segments": {
      "count": 146549,
      "memory": "37.7gb",
      "memory_in_bytes": 40503579735,
      "terms_memory": "28gb",
      "terms_memory_in_bytes": 30122436929,
      "stored_fields_memory": "6.4gb",
      "stored_fields_memory_in_bytes": 6872874936,
      "term_vectors_memory": "0b",
      "term_vectors_memory_in_bytes": 0,
      "norms_memory": "1.2gb",
      "norms_memory_in_bytes": 1342949632,
      "points_memory": "464.7mb",
      "points_memory_in_bytes": 487376498,
      "doc_values_memory": "1.5gb",
      "doc_values_memory_in_bytes": 1677941740,
      "index_writer_memory": "1gb",
      "index_writer_memory_in_bytes": 1108038237,
      "version_map_memory": "2.2mb",
      "version_map_memory_in_bytes": 2390271,
      "fixed_bit_set": "0b",
      "fixed_bit_set_memory_in_bytes": 0,
      "max_unsafe_auto_id_timestamp": 1539927835338,
      "file_sizes": {}
    }
  },
  "nodes": {
    "count": {
      "total": 15,
      "data": 12,
      "coordinating_only": 0,
      "master": 3,
      "ingest": 15
    },
    "versions": [
      "6.3.2"
    ],
    "os": {
      "available_processors": 102,
      "allocated_processors": 102,
      "names": [
        {
          "name": "Linux",
          "count": 15
        }
      ],
      "mem": {
        "total": "204.6gb",
        "total_in_bytes": 219781988352,
        "free": "3gb",
        "free_in_bytes": 3268747264,
        "used": "201.6gb",
        "used_in_bytes": 216513241088,
        "free_percent": 1,
        "used_percent": 99
      }
    },
    "process": {
      "cpu": {
        "percent": 290
      },
      "open_file_descriptors": {
        "min": 551,
        "max": 4340,
        "avg": 3156
      }
    },
    "jvm": {
      "max_uptime": "42.2d",
      "max_uptime_in_millis": 3651267696,
      "versions": [
        {
          "version": "1.8.0_171",
          "vm_name": "Java HotSpot(TM) 64-Bit Server VM",
          "vm_version": "25.171-b11",
          "vm_vendor": "Oracle Corporation",
          "count": 15
        }
      ],
      "mem": {
        "heap_used": "119.2gb",
        "heap_used_in_bytes": 127993490440,
        "heap_max": "156.3gb",
        "heap_max_in_bytes": 167885537280
      },
      "threads": 2056
    },
    "fs": {
      "total": "23.3tb",
      "total_in_bytes": 25678902558720,
      "free": "5.3tb",
      "free_in_bytes": 5855567454208,
      "available": "4.1tb",
      "available_in_bytes": 4550518153216
    },
    "plugins": [
      {
        "name": "repository-s3",
        "version": "6.3.2",
        "elasticsearch_version": "6.3.2",
        "java_version": "1.8",
        "description": "The S3 repository plugin adds S3 repositories",
        "classname": "org.elasticsearch.repositories.s3.S3RepositoryPlugin",
        "extended_plugins": [],
        "has_native_controller": false
      }
    ],
    "network_types": {
      "transport_types": {
        "netty4": 15
      },
      "http_types": {
        "netty4": 15
      }
    }
  }
}

You have too many shards, for starters.

That's nearly 1,000 shards per data node, which is likely causing resourcing issues given the node sizes.

You will need to add more nodes; adding more data paths won't reduce the amount of data on the existing paths.
You could, however, add new nodes with three data paths and then do a phased replacement of the existing two-path nodes.
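
To see how shards and disk are currently spread across your data nodes, the cat allocation API gives a quick per-node view (assuming localhost:9200 here):

curl -s 'http://localhost:9200/_cat/allocation?v'

The shards column is the per-node shard count, and disk.avail is the space left across that node's data paths.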

How many shards per node would be recommended for my instance size?

If I add a new volume to the existing instance and restart the Elasticsearch service, will it rebalance data onto the new volume?

Otherwise, is there a document on how to do a phased replacement?

Have a look at this blog post for guidance on shard sizes and sharding practices.

One final clarification: if I change my instance type to 16 CPUs and 30 GB of RAM, Xmx would be 15 GB.

Can I set the data path to three volumes of 2 TB each (6 TB total), with 600 shards on each machine?

Technically yes; I am on AWS, so I have an extra two EBS volumes of 1 TB each attached to my instance.

That still sounds like a lot of shards for that heap size and data volume. Did you look at the blog post I linked to?

Yes, I did.

A node with a 30GB heap should therefore have a maximum of 600-750 shards, but the further below this limit you can keep it the better. This will generally help the cluster stay in good health.

A good rule-of-thumb is to ensure you keep the number of shards per node below 20 to 25 per GB heap it has configured

So I assume that with a 15 GB Xmx, the maximum I can have is 15 × 25 = 375 shards per node.

Aim to keep the average shard size between a few GB and a few tens of GB. For use-cases with time-based data, it is common to see shards between 20GB and 40GB in size

If a shard is around 35 GB in size, the total combined size of all shards on a single node can go up to 375 × 35 = 13,125 GB ≈ 13 TB.

Am I missing something?

It is a guideline for maximum shard count, not a recommendation. If you have larger shards you will typically have fewer shards than the recommended max value.

@Christian_Dahlqvist I agree. But even if I use a shard size of 25-30 GB, don't I need at least 10 TB of disk to accommodate 300 shards per node?

To be more clear:

path.data: /data/elasticsearch, /data1/elasticsearch

The above are two external mounts, each a 1 TB disk.
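
In case it is useful, the per-path usage each node sees can be pulled from the nodes stats API (a sketch, assuming the default localhost:9200):

curl -s 'http://localhost:9200/_nodes/stats/fs?human&pretty'

The fs.data array in the response lists every configured data path with its total, free, and available space.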