Hello,
There are 6 servers, each with 64GB of RAM and 12 spinning disks (2TB per disk, so 24TB in total).
When I give 32GB of RAM to ES, it seems the maximum storable index size per node is around 4TB.
So if I set up RAID 0, 20TB of disk will go unused.
And I cannot increase replicas due to the ES heap limit.
How about this:
Make 6 RAID1 arrays? I mean (RAID1 = 2 disks) * 6 ==> 6 paths, 12TB of usable disk space.
Then one RAID1 array goes to the OS, /data0.
The other five RAID1 arrays will be configured as path.data=/data1,/data2,/data3,/data4,/data5.
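A minimal sketch of that layout on Linux, assuming software RAID via mdadm and device names /dev/sda../dev/sdl (both assumptions, adjust for your hardware):

```shell
# Assumption: 12 disks /dev/sd[a-l]; each adjacent pair becomes one RAID1 array.
# md0 (sda+sdb) holds the OS; md1..md5 become the Elasticsearch data paths.
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sde /dev/sdf
# ...repeat for md3, md4, md5...
mkfs.xfs /dev/md1
mount /dev/md1 /data1          # likewise /data2 .. /data5

# elasticsearch.yml -- ES will spread shards across the listed paths:
# path.data: /data1,/data2,/data3,/data4,/data5
```

Note that with multiple path.data entries, Elasticsearch places each whole shard on a single path; it does not stripe one shard across paths, so losing one array only loses the shards stored on it.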
OK,
I store user click-stream logs in ES, one index per day, with 5 shards and no replicas.
In a day, 2 billion logs are indexed, and the index size is about 750GB.
I think these click-stream logs are not mission critical, so replicas are not that important.
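For scale, that works out to very large shards (simple arithmetic, not a figure from the thread):

```shell
# 750 GB/day spread over 5 primary shards, no replicas:
echo "$((750 / 5)) GB per shard"   # 150 GB per shard
```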
Actually there are 6 hot nodes, 2 warm nodes, and 3 dedicated master nodes.
Mappings look fine and properly optimised, so no problem there. You do however have quite large shards and the terms heap usage is high. I would recommend trying to reduce the average shard size to closer to 50GB to see if this makes a difference, e.g. by increasing the number of primary shards to between 15 and 18.
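Since these are daily indices, the new shard count can be applied to all future indices with a template rather than per index; the index pattern `clickstream-*` here is an assumption, not a name from the thread:

```shell
# Apply 15 primary shards and no replicas to every new daily index
# whose name matches the pattern.
curl -X PUT "localhost:9200/_template/clickstream" \
  -H 'Content-Type: application/json' \
  -d '{
        "index_patterns": ["clickstream-*"],
        "settings": {
          "number_of_shards": 15,
          "number_of_replicas": 0
        }
      }'
```

At 750GB/day, 15 primaries lands right at the ~50GB/shard target suggested above.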
Reindexing is now ongoing with 15 shards.
It might take 6 hours...
It seems that even though 15 shards reduce heap usage by 20-30% compared to 5 shards, I still cannot have a replica. If it were reduced by 50%, I could have 1 replica.
Anyway, I will check out the 15-shard index's stats.
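For reference, a reindex into a 15-shard copy can be sketched like this; the index names are assumptions:

```shell
# Create the 15-shard target first, then copy documents into it.
curl -X PUT "localhost:9200/clickstream-new" \
  -H 'Content-Type: application/json' \
  -d '{"settings": {"number_of_shards": 15, "number_of_replicas": 0}}'

# Run the copy as a background task (it returns a task ID you can
# poll via the Tasks API instead of holding the connection open).
curl -X POST "localhost:9200/_reindex?wait_for_completion=false" \
  -H 'Content-Type: application/json' \
  -d '{"source": {"index": "clickstream-old"}, "dest": {"index": "clickstream-new"}}'
```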
The RAID configuration I asked about first seems somewhat independent of this 15-shard test, because the disk space is quite large either way. What do you think about setting up RAID regardless of the heap optimization, if multiple RAID1 arrays make sense?
If a single daily index is nearly 750GB, I would consider writing hourly indices rather than daily. That will reduce them to a far more manageable ~30GB/index.
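The ~30GB figure is just the daily volume divided by 24 (the integer arithmetic below rounds down):

```shell
# 750 GB/day written as hourly indices instead of one daily index:
echo "$((750 / 24)) GB per index"   # roughly 31 GB each
```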
It is a shame it did not help, but I have to admit it was a long shot. As the mappings look good I am not sure I have any other suggestions apart from tweaking the circuit breaker thresholds a bit, but this is unlikely to give any massive improvement and could cause instability if pushed too far.
There is one thing I forgot to ask earlier: Do you allow Elasticsearch to automatically assign document IDs or do you set them yourself? If you set them yourself, what do the IDs look like and how are they generated?
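For context on why this matters: with auto-generated IDs, Elasticsearch can append documents without first checking whether the ID already exists, which is cheaper at high ingest rates. The index name `clickstream` is an assumption:

```shell
# Auto-generated ID: POST with no ID, ES skips the existence/version lookup.
curl -X POST "localhost:9200/clickstream/_doc" \
  -H 'Content-Type: application/json' \
  -d '{"user": "u1", "ts": "2019-06-01T00:00:00Z"}'

# Explicit ID: PUT with your own ID forces ES to check whether that
# document already exists before writing, which costs more per document.
curl -X PUT "localhost:9200/clickstream/_doc/my-custom-id-001" \
  -H 'Content-Type: application/json' \
  -d '{"user": "u1", "ts": "2019-06-01T00:00:00Z"}'
```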
@Christian_Dahlqvist
Could you tell me the technical background of 'more shards to reduce terms heap'? For next time, when I meet a similar situation and need heap optimization. : )
Do you allow Elasticsearch to automatically assign document IDs or do you set them yourself?