Help guide a noob with optimizing Elasticsearch in my current cluster

Hi. I have been reading the docs and learning about Elasticsearch. I would like definitive help to quickly assess my cluster and get "actual" guidance for optimization. I am really hoping no one says "it depends", as I have read doc after doc without getting anywhere.

I inherited a 4-node cluster which has recently started to crash about once a week, from what I suspect was very high heap usage. I read that once heap usage gets into the 80s I can expect trouble. Three of the nodes would go into the high 80s, one in particular, and I would need to restart that node; memory would go back down to the high 70s and then continue to climb. I checked my logs for errors and found none that I could see.

I know our load has been increasing this year, and last year our cluster was fine, so logically I assumed we need to scale out. Not being able to ascertain exactly how much I needed to scale by, I arbitrarily picked two nodes to add to our cluster, and it does seem healthier. But I really want numbers and facts, not wishful thinking that the issue will go away. 1) I don't know by how much the data load has increased, and I have no idea how to determine that.

I have read different tidbits from the official docs and the forums and am no closer to knowing how to assess the current setup and plan performance tweaking... I see many others have that same issue.

Previous setup:

4x 8 CPU, 32GB RAM VMs (storage can grow dynamically on an NFS mount)
1x Master/Data/Client node
3x Data nodes
Replicas: 1
Shards: 5
discovery.zen.minimum_master_nodes: 1
Java heap min/max (-Xms/-Xmx): 16GB

Here's some of the past output (apologies, I did not capture all the data well beforehand):

shards disk.indices disk.used node
   ???        ???tb     5.6tb hostl0104
   ???        ???tb     5.6tb hostl0025
   ???        ???tb     5.6tb hostl0103
   ???        ???tb     5.6tb hostl0026
   
   
heap.percent ram.percent cpu  node.role master name
          85          68  34  d         -      hostl0104
          86          99  36  d         -      hostl0026
          88          91  32  md        *      hostl0025
          82          99  34  d         -      hostl0103


cluster       status node.total node.data shards  pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
es_myapp_test green           4         4   3202 1601    0    0        0             0                  -                100.0%
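(For reference, I believe these tables came from the _cat APIs, roughly the commands below; just a sketch of what I ran, assuming the default 9200 port.)

curl -s 'localhost:9200/_cat/allocation?v'
curl -s 'localhost:9200/_cat/nodes?v&h=heap.percent,ram.percent,cpu,node.role,master,name'
curl -s 'localhost:9200/_cat/health?v'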

As mentioned, I took a guess based on what I had read so far and added 2 nodes.

New setup:

6x 8 CPU, 32GB RAM VMs (storage can grow dynamically on an NFS mount)
1x Master/Data/Client node
2x Master/Data
3x Data nodes
Replicas: 1
Shards: 5
discovery.zen.minimum_master_nodes: 2
Java heap min/max (-Xms/-Xmx): 16GB
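For clarity, here is roughly how I understand these roles map to elasticsearch.yml on the zen-discovery (pre-7.x) versions; treat it as a sketch rather than my exact files:

# on the three master-eligible data nodes (hostl0025, hostl0026, hostl0103, per the node.role column below)
node.master: true
node.data: true

# on the three data-only nodes
node.master: false
node.data: true

# on every node: quorum of master-eligible nodes = (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2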

Syncing data is still in progress. I have observed that 1) heap usage went down to safer levels and 2) disk space usage across all the previous hosts is decreasing as data gets distributed across all six nodes. Hardware-wise I increased capacity by 50%; what the actual gains are, I have no idea. I am a bit disappointed that heap usage is currently hovering in the mid-70s on the old nodes. I have also observed that heap usage is roughly proportional to the data volume on disk: when the sync first started and the data was insignificant, heap usage was just ~5%, and as the volume grows so does the heap.

shards disk.indices disk.used  node
   616        4.2tb     3.4tb  hostl0104
   616        4.2tb     3.5tb  hostl0025
   368        2.5tb     2.1tb  hostl0110
   617        4.2tb     3.4tb  hostl0103
   617        4.2tb     3.5tb  hostl0026
   368        2.2tb     1.8tb  hostl0111
   
   
heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
          72          68   3    2.10    1.92     1.93 d         -      hostl0104
          70          99   3    1.98    1.80     1.74 md        *      hostl0026
          61          75   3    1.00    1.00     0.98 d         -      hostl0110
          74          91   2    1.82    1.83     1.78 md        -      hostl0025
          71          99   4    1.50    1.59     1.71 md        -      hostl0103
          56          99   3    1.49    1.30     1.21 d         -      hostl0111


cluster       status node.total node.data shards  pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
es_myapp_test green           6         6   3202 1601    2    0        0             0                  -                100.0%

Once fully synced, I plan on moving the client role either onto its own dedicated node or onto one of the data nodes, off of the current master/data node.
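If I go the dedicated client route, my understanding from the docs is that a coordinating-only node is just a node with all the other roles switched off, something like this in elasticsearch.yml (a sketch; node.ingest only applies on 5.x and later):

node.master: false
node.data: false
node.ingest: false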

I really would like guidance on how to proceed next with analyzing my cluster. I can get a ton of data from the ES APIs, but those numbers are meaningless to me; I have no idea what good looks like, what to look for, or what a healthy benchmark should be. I read somewhere that my shard count should match the number of nodes I have.
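For example, I can pull per-node heap breakdowns like the ones below, I just don't know what thresholds count as healthy (a sketch; exact column names may vary by version):

# what is actually occupying the heap on each node
curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.percent,segments.memory,fielddata.memory_size,query_cache.memory_size,request_cache.memory_size'

# full JVM and index-level stats per node
curl -s 'localhost:9200/_nodes/stats/jvm,indices?pretty'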

Also, not sure if this helps, but when accessing the client node via Kibana it takes 2.7 minutes to load the first screen.

Aside from the master/data roles and the zen discovery settings, my installs are generic.

/etc/security/limits.conf
user soft memlock unlimited
user hard memlock unlimited
user - nofile 65536

/etc/sysctl.conf
vm.max_map_count=262144
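In case it is useful, these are the checks I found in the docs to confirm the limits are actually picked up by the running nodes (a sketch, assuming I am reading the docs right):

# reports "mlockall": true per node if memory locking is in effect
curl -s 'localhost:9200/_nodes?filter_path=**.mlockall&pretty'

# reports the max file descriptors each node actually got (should be 65536 or more)
curl -s 'localhost:9200/_nodes/stats/process?filter_path=**.max_file_descriptors&pretty'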

Below I am pasting my per-index output. From what I can see I have quite a few indexes that are HUGE, the largest being 1.5 TB. I have no idea where to go from here; any solid guidance would be greatly appreciated.

If it helps any, the data below covers approximately the last 2 years. The forum's length limit didn't allow me to paste it all; we're now at roughly 3.6 TB x 6 nodes.

health status index                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   my_index_name_goes_here   5   1    1682791            0      2.9gb          1.4gb
green  open   my_index_name_goes_here   5   1    9374637            0      8.8gb          4.4gb
green  open   my_index_name_goes_here   5   1    8675140            0       14gb            7gb
green  open   my_index_name_goes_here   5   1   10521171            0     17.1gb          8.5gb
green  open   my_index_name_goes_here   5   1  893644249            0      1.4tb        748.2gb
green  open   my_index_name_goes_here   5   1   11890171            0     19.8gb          9.9gb
green  open   my_index_name_goes_here   5   1   13164680            0     20.6gb         10.3gb
green  open   my_index_name_goes_here   5   1  129384775            0    207.1gb        103.5gb
green  open   my_index_name_goes_here   5   1   13530013            0     21.9gb         10.9gb
green  open   my_index_name_goes_here   5   1 1317183222            0      1.5tb          779gb
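(For reference, that table came from roughly the call below; the s= sort parameter to put the biggest indexes first is something I saw in the docs and may not exist on my version, so treat it as a sketch.)

curl -s 'localhost:9200/_cat/indices?v&s=store.size:desc'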

It would be easier to provide guidance if you told us a bit about your use case. Is it index and/or query heavy? What kind of data do you have in the cluster? Are you using nested documents and/or parent-child relationships?

You also seem to have a reasonably small number of indices that vary greatly in size. What is the strategy behind this?

Hi Christian, thanks for the reply. The data being ingested is all application logs. Our agents parse these logs and tag each entry with a date, a severity type (ERR, WARN, INFO), and the associated content. End users are mostly interested in searching for errors. The purpose is to monitor our load test environment, which, when ramped up, simulates ~150k user connections; this typically runs 3-4 days straight per week. I updated our Java min/max settings in my post above, I forgot about that yesterday.

I am guessing our queries are heavy, as engineers need to sift through the logs using various query strings. We also have very verbose logging, so I'm guessing our indexing is heavy too? No idea about nested docs or parent/child relationships. Strategy? The ask is to retain data, but most lookups are short term, as in the past few weeks, which equals the past few load tests to compare and analyze.

Apologies, I don't know how to answer this more technically.

If you have log data that you typically do not update, it is very common to use time-based indices. This limits the size of individual indices and can be managed either by having each index correspond to a fixed time period or by using the rollover index API. This is also described in this blog post. It can improve query performance as well, since querying very large shards can be slow.
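As a rough illustration of the rollover approach (the index and alias names below are placeholders, and the exact conditions would depend on your version and retention requirements):

# 1. create the first index in the series with a write alias
curl -s -X PUT 'localhost:9200/app-logs-000001' -H 'Content-Type: application/json' -d '
{
  "aliases": { "app-logs-write": {} }
}'

# 2. call rollover periodically (e.g. from cron); once any condition is met a new
#    index (app-logs-000002, ...) is created and the alias moves to it
curl -s -X POST 'localhost:9200/app-logs-write/_rollover' -H 'Content-Type: application/json' -d '
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 100000000
  }
}'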

Thanks Christian, really good read. I guess I was looking for something more like this (for example only):

  • if your data is over a terabyte, the recommended shard count is 10
  • this API will give you query speeds; 80ms+ indicates lag, and the recommended action is to double the replication
  • metrics A, B, C indicate too many indexes, too many shards, too much lag, etc.
  • a large shard is anything over ~30GB
  • these are the recommended benchmarks for X sizes, etc.

...that sort of thing. I realize there are so many factors to consider that an exact answer is impossible.

Even after reading the links you provided I still don't know what my next course of action is for my cluster. I guess I need to dissect each index, revisit each point, and play around in a test environment or something.
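For example, I am assuming the per-index dissection starts with checking whether individual shards are oversized, with something like this (a sketch; the index name is a placeholder):

curl -s 'localhost:9200/_cat/shards/my_index_name_goes_here?v&h=index,shard,prirep,docs,store,node'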

As Elasticsearch can support a wide variety of very different use cases, it is hard to give simple guidelines like that. You have two very large indices, so I would probably tackle those first. It may make sense to reindex them and simply split them into a larger number of shards. If they contain some kind of time-based data, however, you may want to switch to time-based indices and e.g. have one index per month. If you are not updating documents this can work really well, as older indices that are no longer written to can be optimised. It also helps manage shard size.
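For illustration only (the index names, the timestamp field and the date range are all placeholders), splitting one of the large indices into monthly indices with the reindex API could look something like this, run once per month of data. You would create each destination index (or an index template) first so you control its shard count and mappings:

curl -s -X POST 'localhost:9200/_reindex' -H 'Content-Type: application/json' -d '
{
  "source": {
    "index": "my_index_name_goes_here",
    "query": {
      "range": {
        "@timestamp": { "gte": "2018-01-01", "lt": "2018-02-01" }
      }
    }
  },
  "dest": {
    "index": "my_index_name_goes_here-2018.01"
  }
}'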
