Elasticsearch performance is slow

Hi Team

I have a custom cluster built with 3 nodes, and all of them are master-eligible and data nodes (at any point only one can be the active master).
We usually see very quick data pulls and pushes, with response times around 150ms to 250ms, but sometimes a request takes a much larger response time, like 1.5 to 2 seconds.

Is this related to a cluster issue, and how do we monitor what is causing these performance issues?

Or do we need to add more masters to the cluster? What else could be the issue here?

Please assist. Thank you.

Welcome to our community! :smiley:

There's not enough information here for us to effectively help you. You'll need to share things like your use case, Elasticsearch version, what this data pull is, the output from the `_cluster/stats?pretty&human` API, and possibly more.
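
As a rough example (assuming the cluster is reachable on localhost:9200), that output can be fetched with:

```
curl -s 'http://localhost:9200/_cluster/stats?pretty&human'
```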

Hi warkolm

Thanks for your reply. We have 3 nodes with the stats below:

```
{
  "timestamp": 1652846048038,
  "status": "green",
  "indices": {
    "count": 23,
    "shards": {
      "total": 222,
      "primaries": 111,
      "replication": 1.0,
      "index": {
        "shards": {
          "min": 2,
          "max": 10,
          "avg": 9.652173913043478
        },
        "primaries": {
          "min": 1,
          "max": 5,
          "avg": 4.826086956521739
        },
        "replication": {
          "min": 1.0,
          "max": 1.0,
          "avg": 1.0
        }
      }
    },
    "docs": {
      "count": 5662128,
      "deleted": 1087260
    },
    "store": {
      "size": "21.8gb",
      "size_in_bytes": 23453624948,
      "throttle_time": "0s",
      "throttle_time_in_millis": 0
    },
    "fielddata": {
      "memory_size": "0b",
      "memory_size_in_bytes": 0,
      "evictions": 0
    },
    "query_cache": {
      "memory_size": "9.5mb",
      "memory_size_in_bytes": 10006937,
      "total_count": 1211379,
      "hit_count": 789060,
      "miss_count": 422319,
      "cache_size": 9520,
      "cache_count": 9520,
      "evictions": 0
    },
    "completion": {
      "size": "0b",
      "size_in_bytes": 0
    },
    "segments": {
      "count": 1768,
      "memory": "345.4mb",
      "memory_in_bytes": 362212322,
      "terms_memory": "176.5mb",
      "terms_memory_in_bytes": 185168612,
      "stored_fields_memory": "3.6mb",
      "stored_fields_memory_in_bytes": 3805520,
      "term_vectors_memory": "0b",
      "term_vectors_memory_in_bytes": 0,
      "norms_memory": "21.5mb",
      "norms_memory_in_bytes": 22625792,
      "points_memory": "2.7mb",
      "points_memory_in_bytes": 2837646,
      "doc_values_memory": "140.9mb",
      "doc_values_memory_in_bytes": 147774752,
      "index_writer_memory": "0b",
      "index_writer_memory_in_bytes": 0,
      "version_map_memory": "0b",
      "version_map_memory_in_bytes": 0,
      "fixed_bit_set": "1.6mb",
      "fixed_bit_set_memory_in_bytes": 1764432,
      "max_unsafe_auto_id_timestamp": -1,
      "file_sizes": {
        
      }
    }
  },
  "nodes": {
    "count": {
      "total": 3,
      "data": 3,
      "coordinating_only": 0,
      "master": 3,
      "ingest": 3
    },
    "versions": [
      "5.6.3"
    ],
    "os": {
      "available_processors": 24,
      "allocated_processors": 24,
      "names": [
        {
          "name": "Linux",
          "count": 3
        }
      ],
      "mem": {
        "total": "93gb",
        "total_in_bytes": 99918176256,
        "free": "11.1gb",
        "free_in_bytes": 12006846464,
        "used": "81.8gb",
        "used_in_bytes": 87911329792,
        "free_percent": 12,
        "used_percent": 88
      }
    },
    "process": {
      "cpu": {
        "percent": 0
      },
      "open_file_descriptors": {
        "min": 487,
        "max": 487,
        "avg": 487
      }
    },
    "jvm": {
      "max_uptime": "43.1m",
      "max_uptime_in_millis": 2587638,
      "versions": [
        {
          "version": "1.8.0_312",
          "vm_name": "OpenJDK 64-Bit Server VM",
          "vm_version": "25.312-b07",
          "vm_vendor": "Private Build",
          "count": 3
        }
      ],
      "mem": {
        "heap_used": "4gb",
        "heap_used_in_bytes": 4296081856,
        "heap_max": "23.8gb",
        "heap_max_in_bytes": 25560612864
      },
      "threads": 275
    },
    "fs": {
      "total": "145.2gb",
      "total_in_bytes": 155930431488,
      "free": "82.1gb",
      "free_in_bytes": 88170692608,
      "available": "82gb",
      "available_in_bytes": 88120360960
    },
    "plugins": [
      {
        "name": "discovery-ec2",
        "version": "5.6.3",
        "description": "The EC2 discovery plugin allows to use AWS API for the unicast discovery mechanism.",
        "classname": "org.elasticsearch.discovery.ec2.Ec2DiscoveryPlugin",
        "has_native_controller": false
      }
    ],
    "network_types": {
      "transport_types": {
        "netty4": 3
      },
      "http_types": {
        "netty4": 3
      }
    }
  }
}
```
Could you suggest how to increase the performance of queries here?

I need responses within a few milliseconds. I have 3 nodes, all master-eligible, but at any point in time one is the active master and the other 2 are workers, while all 3 are data nodes.

I hope this data is good enough to give some clue.

The JVM heap was increased from 2 GB to 8 GB.

Also, please let me know how to enable logging of incoming queries in full; the slowlog is giving just the time but not the actual queries.

Thanks

5.X is very, very old and well past EOL. You need to upgrade.

Hi warkolm

I understand that 5.6 is very old, and we are in the process of upgrading to 8.2.
But in the meantime, is my configuration okay to continue with for some time until we get upgraded?
Or do you suggest any other options or tweaking in the current ES?
Please assist.

You have less than 6 million documents and 21.8GB of total storage over 222 shards. It looks like you are using the default for that version, which is 5 primary shards with 1 replica. This is very inefficient. All of that data could easily fit into a single index with a single primary shard. If you wanted to parallelise a bit you could use a few primary shards for the index, or a few indices with a single primary shard each. I would recommend you reindex your data and address this to see how much that improves performance.

You should also optimise your mappings and queries if you have not already done so.

Hi @Christian_Dahlqvist

Could you explain what a single primary shard means and how to configure it?
We created a custom cluster with the above configuration; we have 12-15 indices and all of them run on this 3-node cluster.

When you say inefficient, could you explain more about the configuration and how it can be made more efficient?

Also, we did a fresh index a few days ago, and fresh indexing will run frequently. Please assist.

Create an index template and set the number of primary shards to 1. This will apply to all newly created indices.
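
As a rough sketch for 5.x (the template name and the catch-all `*` pattern here are just placeholders; adjust them to your index naming):

```
curl -s -H 'Content-Type: application/json' -X PUT 'http://localhost:9200/_template/one_primary_shard' -d '
{
  "template": "*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'
```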

To reduce the shard count for existing indices you can use the shrink index API. I would recommend doing this for all existing indices.
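
A minimal sketch of that, assuming a hypothetical index `my_index`, a target `my_index_shrunk`, and a node named `node-1` (shrinking requires the index to be made read-only and a copy of every shard to sit on one node first):

```
# relocate a copy of every shard to one node and block writes
curl -s -H 'Content-Type: application/json' -X PUT 'http://localhost:9200/my_index/_settings' -d '
{
  "index.routing.allocation.require._name": "node-1",
  "index.blocks.write": true
}'

# once relocation has finished, shrink into a new single-shard index
curl -s -H 'Content-Type: application/json' -X POST 'http://localhost:9200/my_index/_shrink/my_index_shrunk' -d '
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}'
```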

I would also recommend you look at this section about performance tuning.

Hi @Christian_Dahlqvist
Thanks for your reply.

If I set the primary shard count to 1, will that work for a huge set of data (like mine above)?
Are you asking me to change the below?
"primaries": {
"min": 1,
"max": 5,
"avg": 4.826086956521739
}

Could you give a little idea of how a primary shard works with the other shards, theoretically? That would help in understanding the core concept.

Also, we are getting this error: Bulk index operation failed 1 times in index {index} for type category. Error (illegal_argument_exception): Limit of total fields [1000] in index [{index}] has been exceeded. Failed doc ids sample

What is the max value that can be set for total fields? I have set it to 15k; is it okay to have 15k or more?

Please assist.

From an Elasticsearch perspective, your data set (21GB) is small. It is actually around the recommended limit for a single shard. I do not understand why you have so many indices or shards, but I would recommend you reduce this, either through the shrink index API or by reindexing the data.
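
For the reindexing route, a rough sketch (index names here are just placeholders) would be to create a new single-shard index and copy the data across with the reindex API:

```
# create the target index with a single primary shard
curl -s -H 'Content-Type: application/json' -X PUT 'http://localhost:9200/my_index_v2' -d '
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}'

# copy the documents across
curl -s -H 'Content-Type: application/json' -X POST 'http://localhost:9200/_reindex' -d '
{
  "source": { "index": "my_index" },
  "dest": { "index": "my_index_v2" }
}'
```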

I would recommend having a look at this section in Elasticsearch: The Definitive Guide. It is very old (just a bit older than the version you are running) but still a good overview.

Why do you have so many fields? Are you using multiple document types?

I have seen it set to similar levels in the past but it can have an impact on performance and stability, especially if you are using dynamic mappings and these mappings need to be updated frequently.
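
For reference, that limit is a per-index setting; a rough example of raising it (the index name and value here are just placeholders) would be:

```
curl -s -H 'Content-Type: application/json' -X PUT 'http://localhost:9200/my_index/_settings' -d '
{
  "index.mapping.total_fields.limit": 2000
}'
```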

Hi @Christian_Dahlqvist
Thanks for your reply.

- 21GB is small; what is the max size of data a single primary shard can hold?

- Sure, will take a look.

- Yes, we have multiple document types in ES5, all within single indices, and I have 15 such indices. The fields are dynamic, based on the data.

- Dynamic mappings are created from database attribute values, and the attributes keep growing as the data increases. I think the fields are nothing but _source attributes, so this keeps growing.

If the primary shard count is limited to 1, can I have 'n' number of other (replica) shards? If I set the primary shard count to 1, is it okay for handling a huge set of data? And is the primary shard setting applicable per index or per node?

Also, I wanted to ask: how do we invalidate the ES cache based on data changes? If nothing has changed in the data, I would want ES to send the cached data, and if something has changed, send only the changed fields while the other fields are picked from the cache. Is this possible?

Please explain. Thanks.

I would initially reduce the number of primary shards to 1 per index you have and see what effect that has.

Ever-growing mappings can become a problem, so it is generally recommended to try to control mapping growth.
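
One common way to do that is to turn off dynamic mapping for a type and only map the fields you actually query; a minimal sketch for 5.x (the index, type, and field names are placeholders):

```
curl -s -H 'Content-Type: application/json' -X PUT 'http://localhost:9200/my_index' -d '
{
  "mappings": {
    "my_type": {
      "dynamic": false,
      "properties": {
        "title": { "type": "text" }
      }
    }
  }
}'
```

With `"dynamic": false`, unmapped fields are still kept in `_source` but are not indexed or searchable; `"strict"` would instead reject documents containing unmapped fields.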

What is your definition of huge data? The recommended shard size is generally tens of GB and Elasticsearch nodes often end up holding terabytes of data, although this depends on use case and usage patterns.

This is all handled automatically.

@Christian_Dahlqvist

Huge data is more than 50 or 100GB. You have answered that it can handle terabytes.

One confusion: when you say the primary count should be 1, does that mean like this below?

"primaries": {
"min": 1,
"max": 1,
},

Also, if I want to have 2 masters to handle the cluster, how can we have 2 masters active at a time, and with how many data nodes (including the master nodes)?

Like, I have 1 master and 2 other nodes = 3 nodes (all data nodes as well).

Please explain.

Thanks

You should always have at least 3 master-eligible nodes in a cluster if you are looking for high availability. One of these will be the active master at any time, never more.
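
On 5.x specifically (newer versions handle this automatically), with 3 master-eligible nodes you would also want the quorum set to 2 to avoid split brain, for example via the cluster settings API:

```
curl -s -H 'Content-Type: application/json' -X PUT 'http://localhost:9200/_cluster/settings' -d '
{
  "persistent": {
    "discovery.zen.minimum_master_nodes": 2
  }
}'
```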

@Christian_Dahlqvist
Thank you.

What I meant was: can I have 2 masters running at a time, load-balancing a few data nodes?
Please explain.

I think 1 master may not be able to handle so many incoming requests, including indexing data, on the same master.

thanks

The master node only manages the cluster state and is not involved in request processing, and is therefore not a bottleneck.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.