Tips for setting up new ES indices and a cluster

I'm setting up a new ES 6.5.4 cluster containing 6 data nodes. Each node has 16 CPUs and an 8 GB heap.
Here's the cluster:

[root@data01~]# curl -XGET 'https://localhost:9201/_cat/nodes?v'
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.100.115.12 41 83 0 0.13 0.13 0.14 mdi - tt2_6_5_4-data02
10.100.115.16 22 99 1 0.06 0.44 1.25 mdi - tt2_6_5_4-master02
10.100.115.14 43 99 2 2.12 1.63 2.06 mdi - tt2_6_5_4-data04
10.100.115.13 32 97 1 0.20 0.20 0.37 mdi * tt2_6_5_4-data03
10.100.115.15 26 99 2 1.03 1.04 1.87 mdi - tt2_6_5_4-master01
10.100.115.11 36 99 2 4.53 4.24 3.81 mdi - tt2_6_5_4-data01
10.100.115.17 43 99 1 0.35 0.70 1.11 mi - tt2_6_5_4-master03

I need to import transactions from an Oracle database. My main goal is fast query results: I just want to fetch the last 10, 20, 100, or 1000 transactions for a given account_id and currency_id, sorted by date_currency and transaction_id. I have the same setup on a 2.3 cluster with 3 big indices (transactions_2017, transactions_2018, transactions_2019). Each index has roughly 150M documents and is about 150 GB in size. Query times are very bad, around 5-8 seconds (I tried bool, filter, force_merge, etc., nothing helped, so I'm moving to new infrastructure and a newer version of ES).

Questions:

  • How should I organize my indices? I was thinking of yearly indices again (transactions_year) with maybe 6 shards, or should I try something different this time? Maybe indices split by the last digit of account_id, so a query hits just one index instead of 3; that should be faster, or not?

  • What should I do about my mapping? I guess account_id and currency_id should be of type keyword, but what about index: false and doc_values: false on the other fields?

  • Should I use index sorting while inserting data with Logstash? Will it perform better, since my query sorts by date_currency and transaction_id? (See the sketch after this list.)

  • Should I try 1 or more replicas for better search performance?
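
Regarding the index-sorting question above: my understanding, which I still need to verify on this data set, is that in ES 6.x a query whose sort matches the index sort can terminate early on each shard when it does not need an exact hit count. A rough sketch, assuming the sort.field / sort.order settings from the template below:

    GET transactions_2019/_search?routing=1234567890
    {
      "track_total_hits": false,
      "size": 100,
      "query": {
        "bool": {
          "filter": [
            { "term": { "account_id": "1234567890" } },
            { "term": { "currency_id": "011" } }
          ]
        }
      },
      "sort": [
        { "date_currency": { "order": "desc" } },
        { "transaction_id": { "order": "desc" } }
      ]
    }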

Here's my template (for now). One open question: for date_currency, can't I use keyword instead of date?

{
  "transactions": {
    "order" : 0,
    "index_patterns": [
      "transactions*"
    ],
  "settings": {
    "index": {
      "sort.field" : ["date_currency", "transaction_id"],
      "sort.order" : ["desc", "desc"],
      "number_of_shards": "6",
      "number_of_replicas": "1"
    }
  },
  "mappings": {
    "transaction_item": {
      "_routing": {
        "required": true
      },
      "properties": {
        "@timestamp": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        },
        "transaction_id": {
          "type": "keyword"
        },
        "date_booking": {
          "type": "date",
          "format": "date",
          "index": false,
          "doc_values": false
        },
        "date_currency": {
          "format": "date",
          "type": "date",  //can't use keyword here?
        },
        "message": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        },
        "path": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        },
        "account_id": {
          "type": "keyword"
        },
        "description": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        },
        "v_pnb": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        },
        "currency_id": {
          "type": "keyword"
        },
        "vd_amount": {
          "type": "double",
          "index": false,
          "doc_values": false
        },
        "vd_name": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        },
        "vd_vbdi": {
          "type": "keyword",
          "index": false,
          "doc_values": false
        }
      }
    }
  }
}
}

and this is how I plan to run queries:

  GET transactions_2017,transactions_2018,transactions_2019/_search?routing=1234567890
    {
      "query": {
        "bool": {
          "filter": [
            {
              "match": {
                "account_id": "1234567890"
              }
            },
            {
              "match": {
                "currency_id": "011"
              }
            }
          ]
        }
      },
      "size": 10000,
      "from": 0,
      "sort": [
        {
          "date_currency": {
            "order": "desc"
          }
        },
        {
          "transaction_id": {
            "order": "desc"
          }
        }
      ]
    }
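
A side note to myself: since from + size paging with size 10000 can get expensive, I may also test a search_after variant that passes the sort values of the last hit of the previous page instead of from (the two values below are just placeholders for a date_currency in epoch millis and a transaction_id):

    GET transactions_2017,transactions_2018,transactions_2019/_search?routing=1234567890
    {
      "size": 100,
      "query": {
        "bool": {
          "filter": [
            { "match": { "account_id": "1234567890" } },
            { "match": { "currency_id": "011" } }
          ]
        }
      },
      "sort": [
        { "date_currency": { "order": "desc" } },
        { "transaction_id": { "order": "desc" } }
      ],
      "search_after": [1546300800000, "TX-0009999"]
    }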

Any suggestion is welcome and I'm willing to experiment on my own. I know there is no silver-bullet solution, but I want to avoid redoing the template/shard setup later, because it takes 3 days to import the data. The main goal is to show the last XX transactions very fast, and nothing else (no full-text searches, calculations, etc.). If needed, I can post more info about the hardware. Thanks.

How many shards does each of your yearly indices have? How many replicas do you have configured? How much RAM do the hosts have? What type of storage are you using? How many concurrent queries are you expecting to need to support?

16 shards per index, with 1 replica, on 2 data nodes (16 GB RAM, 8 GB JVM heap each) + 2 client nodes (6 GB RAM) + 1 monitoring node. Now I have 32 GB RAM. I'm using HDDs (not SSDs; I don't know exactly what kind of HDD, I can find out from a dev-ops colleague, but production will have SSDs, this is a simulation environment). Around 50-100 concurrent queries at most, I guess. We have 700k users, and 23 concurrent sessions was last year's peak.

I am no expert on this, but I can tell you what happens in my environment.
I have 3 servers (master+data) with 48 GB RAM each and a few indices.
The largest index is 56 GB, 5 shards, 1 replica, with 88 million documents in it.

When I run a query like yours, with two filter/match clauses,
it returns 1832 records in less than a millisecond,
but then my storage disks are fast 15k drives.

GET xyz_indice/_search?pretty
    {
      "query": {
        "bool": {
          "filter": [
            {
              "match": {
                "queue.keyword": "1234"
              }
            },
            {
              "match": {
                "host.keyword": "xya"
              }
            }
          ]
        }
      }
    }

Can you set a sort on some 2 random fields, and size to 1000? What version of ES are you using?

OK, did that: sorted by two fields with size 10,000, and it took 11+ seconds.

Something to do with the sort?

I am using 6.6.1

I would recommend looking at disk I/O, e.g. using iostat if you are on Linux, while you are querying and experiencing bad performance. As your data set is larger than what can be cached, slow disk is quite likely to be a limiting factor. I was going to suggest routing, which you seem to already be using.

yep, sorting is killing it :slight_smile:

I have these stats in Grafana. What about the number of shards, or splitting the data into lots of smaller indices? Does that make sense?

I do not know what Grafana provides. I usually just run iostat -x on the nodes in question. If disk I/O is the problem, I am not sure changing sharding will help much.

iostat -x -m 5 5
You want to look at the %util, w_await and r_await times. -m = results in megabytes.

Here's a demo of how it looks. It's nice because you can see a history of the stats instead of catching live stats with iostat -x. What should I look at? I don't know anything about those numbers yet.

https://pmmdemo.percona.com/graph/d/oxkrQMNmz/disk-performance?refresh=1m&orgId=1

I am interested in seeing how much iowait you have, which does not seem to be captured by Grafana. Please run iostat -x and post the results here.

Unfortunately, I can only SSH to the client node on my 2.3 cluster. I can on the 6.5 cluster, but it has no data. I will check tomorrow and post it.

The query took 5800 ms. This is from data node 1, iostat -x 1:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.25    0.00    0.50   20.68    0.00   78.57

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     1.00    0.00    2.00     0.00    24.00    12.00     0.00    0.50   0.50   0.10
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00  320.00    4.00 29136.00    32.00    90.02     1.98    6.12   3.06  99.10
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00    0.00    1.00     0.00     8.00     8.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00  320.00    0.00 29440.00     0.00    92.00     1.97    6.19   3.10  99.10

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.50    0.00    0.50   24.31    0.00   74.69

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00  343.00    0.00  6272.00     0.00    18.29     1.98    5.76   2.92 100.00
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00  344.00    0.00  6360.00     0.00    18.49     1.98    5.75   2.90  99.90

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.50    0.00    0.13   22.81    0.00   76.57

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     2.00    0.00    4.00     0.00    48.00    12.00     0.00    0.50   0.50   0.20
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00  297.00    0.00  5528.00     0.00    18.61     1.87    6.33   3.37 100.00
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    3.00     0.00    24.00     8.00     0.00    0.33   0.33   0.10
dm-7              0.00     0.00    0.00    3.00     0.00    24.00     8.00     0.00    0.33   0.33   0.10
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00  298.00    0.00  5536.00     0.00    18.58     1.87    6.31   3.36 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.13    0.00    0.13   12.03    0.00   87.72

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-3              0.00     0.00  181.00    0.00  3432.00     0.00    18.96     1.00    5.55   5.49  99.40
dm-4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-6              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdc               0.00     0.00  181.00    0.00  3432.00     0.00    18.96     1.00    5.55   5.49  99.40

As you can see, the disk utilisation is at 100%, so it seems like disk performance is the bottleneck. I would recommend switching to SSDs if possible, or considering scaling out the cluster by adding nodes.

Could you please explain what is happening here? At first glance, I would say that ES is hitting a shard that sits on the dm-3 part of the disk, and the await is too long?! Let's say I add 6 more data nodes; how many shards should I put in each index (now I have 16 shards per index)? Thanks.

Does adding more shards solve the problem?
Find out why dm-3 is at 100% utilisation. Does dm-3 hold anything else?
What process is using all this disk I/O?
Is this RAID-5?
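
If it helps, something roughly like this (run as root; the device names and availability of the tools will differ per system) should show what dm-3 maps to and which process is doing the reads:

    lsblk -o NAME,SIZE,TYPE,MOUNTPOINT   # see which LV / mount point dm-3 belongs to
    dmsetup ls                           # map dm-N numbers to LVM volume names
    iotop -oP                            # list processes currently doing disk I/O
    cat /proc/mdstat                     # software RAID status, if any is configured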

I don't know. I guess nothing else is going on with dm-3, because the same performance also shows up on another cluster with the same setup. This is the ES 2.3 cluster with 16 shards per index on 2 data nodes; I was asking about setting up the new ES 6.5 on another, bigger and better cluster (7 nodes).

You seem to have very heavy usage of just one drive (sdc / dm-3). Is this the drive that holds your full data directory?
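
One way to check, assuming a typical package install with path.data under /var/lib/elasticsearch (adjust the path, host and port to your setup), would be to compare the mount point of the data path with what the nodes stats API reports:

    df -h /var/lib/elasticsearch
    curl -s 'http://localhost:9200/_nodes/stats/fs?pretty'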