ELK performance

Hello,

I've been using ELK for a while now (8 months), and everything is starting to get slow. For example, it takes 3 or 4 minutes to display the dashboard for the current day.

I think it's because no tuning has been done on the install (it was set up by a third party using Bitnami).

Here are the stats of my Elasticsearch cluster:

{
  "_nodes": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "cluster_name": "elasticsearch",
  "timestamp": 1520937046119,
  "status": "yellow",
  "indices": {
    "count": 386,
    "shards": {
      "total": 1470,
      "primaries": 1470,
      "replication": 0,
      "index": {
        "shards": {
          "min": 1,
          "max": 5,
          "avg": 3.8082901554404147
        },
        "primaries": {
          "min": 1,
          "max": 5,
          "avg": 3.8082901554404147
        },
        "replication": {
          "min": 0,
          "max": 0,
          "avg": 0
        }
      }
    },
    "docs": {
      "count": 56143114,
      "deleted": 861594
    },
    "store": {
      "size": "18.1gb",
      "size_in_bytes": 19435480122,
      "throttle_time": "0s",
      "throttle_time_in_millis": 0
    },
    "fielddata": {
      "memory_size": "1.2mb",
      "memory_size_in_bytes": 1286376,
      "evictions": 0
    },
    "query_cache": {
      "memory_size": "0b",
      "memory_size_in_bytes": 0,
      "total_count": 196,
      "hit_count": 0,
      "miss_count": 196,
      "cache_size": 0,
      "cache_count": 0,
      "evictions": 0
    },
    "completion": {
      "size": "0b",
      "size_in_bytes": 0
    },
    "segments": {
      "count": 7485,
      "memory": "94.5mb",
      "memory_in_bytes": 99113518,
      "terms_memory": "75.8mb",
      "terms_memory_in_bytes": 79568407,
      "stored_fields_memory": "7.1mb",
      "stored_fields_memory_in_bytes": 7546504,
      "term_vectors_memory": "0b",
      "term_vectors_memory_in_bytes": 0,
      "norms_memory": "199.1kb",
      "norms_memory_in_bytes": 203968,
      "points_memory": "2.8mb",
      "points_memory_in_bytes": 3016067,
      "doc_values_memory": "8.3mb",
      "doc_values_memory_in_bytes": 8778572,
      "index_writer_memory": "0b",
      "index_writer_memory_in_bytes": 0,
      "version_map_memory": "0b",
      "version_map_memory_in_bytes": 0,
      "fixed_bit_set": "39.2kb",
      "fixed_bit_set_memory_in_bytes": 40168,
      "max_unsafe_auto_id_timestamp": -1,
      "file_sizes": {}
    }
  },
  "nodes": {
    "count": {
      "total": 1,
      "data": 1,
      "coordinating_only": 0,
      "master": 1,
      "ingest": 1
    },
    "versions": [
      "5.4.1"
    ],
    "os": {
      "available_processors": 8,
      "allocated_processors": 8,
      "names": [
        {
          "name": "Windows Server 2012 R2",
          "count": 1
        }
      ],
      "mem": {
        "total": "31.9gb",
        "total_in_bytes": 34359271424,
        "free": "26gb",
        "free_in_bytes": 27950968832,
        "used": "5.9gb",
        "used_in_bytes": 6408302592,
        "free_percent": 81,
        "used_percent": 19
      }
    },
    "process": {
      "cpu": {
        "percent": 18
      },
      "open_file_descriptors": {
        "min": -1,
        "max": -1,
        "avg": 0
      }
    },
    "jvm": {
      "max_uptime": "38.4m",
      "max_uptime_in_millis": 2304563,
      "versions": [
        {
          "version": "1.8.0_131",
          "vm_name": "Java HotSpot(TM) Server VM",
          "vm_version": "25.131-b11",
          "vm_vendor": "Oracle Corporation",
          "count": 1
        }
      ],
      "mem": {
        "heap_used": "949.8mb",
        "heap_used_in_bytes": 995994536,
        "heap_max": "989.8mb",
        "heap_max_in_bytes": 1037959168
      },
      "threads": 81
    },
    "fs": {
      "total": "59.6gb",
      "total_in_bytes": 64055406592,
      "free": "10.6gb",
      "free_in_bytes": 11409387520,
      "available": "10.6gb",
      "available_in_bytes": 11409387520
    },
    "plugins": [],
    "network_types": {
      "transport_types": {
        "netty4": 1
      },
      "http_types": {
        "netty4": 1
      }
    }
  }
}

You have far too many shards given the size of the cluster (single node with only 1GB heap) and volume of data. Please read this blog post and then alter your sharding strategy and try to reduce the shard count significantly.
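For indices that already exist, one possible approach (just a sketch; the index name logstash-2018.03.01 below is a placeholder for whatever your indices are actually called) is the _shrink API, available since 5.0: mark the source index read-only, then copy it into a new single-shard index.

# make the source index read-only so it can be shrunk
PUT logstash-2018.03.01/_settings
{
  "index.blocks.write": true
}

# shrink the multi-shard index into a new single-shard index
POST logstash-2018.03.01/_shrink/logstash-2018.03.01-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}

Once the shrunken index is healthy you can delete the original and, if needed, point an alias with the old name at the new index.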

Hi @Slop,

Reading the stats, your cluster seems "fine": no cache pressure, no out-of-memory errors (26 GB free!). The filesystem has 10 GB of space remaining (maybe not enough?).

Maybe, over time, your data range has grown bigger than it was at the beginning (more docs over time), so your searches take longer to complete.

You can monitor your CPU usage while querying the day's dashboard, to see if any anomaly shows up.
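If you want to see what the node is actually spending its CPU on while the dashboard is loading, the hot threads API is a quick way to get a snapshot:

GET _nodes/hot_threads

Run it a few times while the dashboard query is in flight and look for search or merge threads dominating the output.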

Your cluster is yellow; I don't know why, but it could be part of the issue. Check it out.
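To find out why it is yellow, a couple of read-only requests (run from Kibana Dev Tools or with curl against port 9200) will show which index is responsible: the first reports health per index, the second lists any UNASSIGNED shards together with the reason.

GET _cluster/health?level=indices

GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason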

Let me add some other resources about sizing which might help:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

The first thing you MUST do is give Elasticsearch more JVM heap space to work with. You can do this by editing the jvm.options file for Elasticsearch, which is usually found at /etc/elasticsearch/jvm.options.

In this file the first section allows you to set the initial and max JVM Heap Size. It will probably look like this on your node...

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g

Since you have plenty of RAM available, change these two settings to 8 GB (it could be 12 or even 16 GB, but start at 8).

-Xms8g
-Xmx8g

Restart Elasticsearch and you will probably be fine.
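After the restart it is worth double-checking that the new heap size actually took effect (and, as a rule of thumb, keep the heap at no more than about half of the machine's RAM):

GET _cat/nodes?v&h=name,heap.current,heap.percent,heap.max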

It is still recommended that you watch heap utilization and make sure you don't have too many indices and shards. You really don't have very much data, but it could probably be organized better. For example, if you are currently writing daily indices, you may want to change to monthly.
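As a sketch of that reorganization (assuming your indices follow the default logstash-* naming), an index template like the one below makes every new index get a single primary shard instead of the default five. Combined with changing the Logstash elasticsearch output from index => "logstash-%{+YYYY.MM.dd}" to index => "logstash-%{+YYYY.MM}", you would end up with one single-shard index per month instead of five shards per day.

# order 1 so these settings win over the default template Logstash installs (normally order 0)
PUT _template/logstash-single-shard
{
  "template": "logstash-*",
  "order": 1,
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}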

Thanks everybody for your answers, I was away for a while.
I'll try to reduce the number of shards and increase the JVM heap size.
I've reset everything and will try again with a new config.

Kind regards,
Nicolas
