What do you think is more critical: allocating more resources or changing the architecture?

I have multiple issues with Elasticsearch right now.

  • I'm running out of disk
  • I'm running out of memory
  • The architecture may need to change, since I have more than 600 shards per node. In any case, I will add a new node to the cluster

I'm not sure what the priority should be. I also suspect that some of these three issues may have caused the others.

GET /

{
  "name" : "a",
  "cluster_name" : "cluster_name",
  "cluster_uuid" : "uuid",
  "version" : {
    "number" : "7.1.0",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "606a173",
    "build_date" : "2019-05-16T00:43:15.323135Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

GET /_cat/allocation?v

shards disk.indices disk.used disk.avail disk.total disk.percent host         ip           node
   605       16.4gb   257.9gb     37.1gb    295.1gb           87 ip-address-b ip-address-b name-node-b
   642       24.8gb   242.8gb     52.3gb    295.1gb           82 ip-address-a ip-address-a name-node-a
    39                                                                                     UNASSIGNED

GET /_cat/nodes?v

ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
ip-address-a           63          94   9    0.62    0.90     0.95 mdi       *      name-node-a
ip-address-b           48          98   2    0.63    0.35     0.29 mdi       -      name-node-b

GET /_cat/health

epoch      timestamp cluster   status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1607443543 16:05:43  name    yellow          2         2   1247 643    0    0       39             0                  -                 97.0%

GET /_cat/indices?v

Please don't post images of text as they are hard to read, may not display correctly for everyone, and are not searchable.

Instead, paste the text and format it with the </> icon or pairs of triple backticks (```), and check the preview window to make sure it's properly formatted before posting. This makes it more likely that your question will receive a useful answer.

It would be great if you could update your post to solve this.

What is the output of:

GET /
GET /_cat/nodes?v
GET /_cat/health?v
GET /_cat/indices?v

If some outputs are too big, please share them on gist.github.com and link them here.

You might want to add nodes and scale horizontally; honestly, you can always reuse different hardware and add more nodes. Elasticsearch is meant to be scalable.

Maybe provide some hardware info?

How many nodes?

Could you share the missing:

GET /
GET /_cat/health?v
GET /_cat/indices?v

Please don't forget the ?v


You seem oversharded; it looks like most of your indices are ~3GB. Look at using _shrink to reduce the shard count.
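For example, something along these lines, with a hypothetical index name my-index (the real index names aren't shown here). The index first has to be made read-only with a copy of every shard on a single node, and the target shard count must be a factor of the original:

PUT /my-index/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "name-node-a",
    "index.blocks.write": true
  }
}

POST /my-index/_shrink/my-index-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1,
    "index.routing.allocation.require._name": null,
    "index.blocks.write": null
  }
}

Once the shrunk index is green and verified, the original can be deleted.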

I've updated the post with all indices.

Can you explain your indexing strategy? You have tonnes of tiny indices.

I'm afraid I can't answer your question about the indexing strategy, as I've inherited this project. Indices are created dynamically by the source code.

Ok. Well it's pretty wasteful, so it's definitely something you should dig into and try to optimise.

OK, but given the situation, do you think increasing the disk and/or memory would resolve these issues?
I'm also planning to add a third node to the cluster.

It would, yes. It's ultimately a short-term fix, though: if your indexing volume keeps growing, you will still reach the same point in the future.

What about the long term?

I'm planning to add a third node to the cluster. I believe this would reduce the number of shards per index, which I think is critical.
But will this be enough? Do I need to review the whole indexing strategy? I would be happy to avoid that, because the code is a mess.

PS:
By the way, what do you mean by indexing strategy? Can you point me to a link or resource where I can study it?

Per node, yes. Not per index.

It will be enough, but for how long I can't say.

I mean: what's in these indices? Why are you creating so many empty indices, or indices with few or no documents? What sort of data is it?
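If it turns out that many of these small indices share the same mappings, one option is to consolidate them into a single index with the Reindex API and delete the originals afterwards. A rough sketch, with made-up index names:

PUT /consolidated-index
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}

POST /_reindex
{
  "source": { "index": "tiny-index-*" },
  "dest": { "index": "consolidated-index" }
}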
