18MB of data enough to destroy ES performance?

Hi,

I'm having some performance issues. I have an index with some netflow data in it, and even though the total size is only 18MB, once I try to visualize more than a week of data in Kibana (a couple of aggregations) performance tanks, the browser crashes, and a ton of heap space is required.

I'm running on the AWS Elasticsearch Service (t2.medium, 2GB heap), so unfortunately there is very little I can provide as far as logs go.

How can I check what is causing these issues? It doesn't look like the queries themselves are that heavy, nor do I have that much data in my indices. I also don't get why I see the heap growing to nearly 2GB even though the total data set is only 18MB.
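
For reference, the stats below came from a request along these lines (the endpoint and index name are placeholders for my setup), and the second call is what I've been using to watch heap usage:

curl -s 'https://my-domain.es.amazonaws.com/netflow/_stats?pretty'
curl -s 'https://my-domain.es.amazonaws.com/_nodes/stats/jvm?pretty'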

{
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "_all": {
    "primaries": {
      "docs": {
        "count": 160495,
        "deleted": 0
      },
      "store": {
        "size_in_bytes": 18537477
      },
      "indexing": {
        "index_total": 149,
        "index_time_in_millis": 166,
        "index_current": 0,
        "index_failed": 0,
        "delete_total": 0,
        "delete_time_in_millis": 0,
        "delete_current": 0,
        "noop_update_total": 0,
        "is_throttled": false,
        "throttle_time_in_millis": 0
      },
      "get": {
        "total": 0,
        "time_in_millis": 0,
        "exists_total": 0,
        "exists_time_in_millis": 0,
        "missing_total": 0,
        "missing_time_in_millis": 0,
        "current": 0
      },
      "search": {
        "open_contexts": 0,
        "query_total": 424,
        "query_time_in_millis": 97876,
        "query_current": 0,
        "fetch_total": 0,
        "fetch_time_in_millis": 0,
        "fetch_current": 0,
        "scroll_total": 0,
        "scroll_time_in_millis": 0,
        "scroll_current": 0,
        "suggest_total": 0,
        "suggest_time_in_millis": 0,
        "suggest_current": 0
      },
      "merges": {
        "current": 0,
        "current_docs": 0,
        "current_size_in_bytes": 0,
        "total": 0,
        "total_time_in_millis": 0,
        "total_docs": 0,
        "total_size_in_bytes": 0,
        "total_stopped_time_in_millis": 0,
        "total_throttled_time_in_millis": 0,
        "total_auto_throttle_in_bytes": 41943040
      },
      "refresh": {
        "total": 12,
        "total_time_in_millis": 114,
        "listeners": 0
      },
      "flush": {
        "total": 2,
        "total_time_in_millis": 20
      },
      "warmer": {
        "current": 0,
        "total": 7,
        "total_time_in_millis": 0
      },
      "query_cache": {
        "memory_size_in_bytes": 407376,
        "total_count": 846,
        "hit_count": 410,
        "miss_count": 436,
        "cache_size": 42,
        "cache_count": 42,
        "evictions": 0
      },
      "fielddata": {
        "memory_size_in_bytes": 6552,
        "evictions": 0
      },
      "completion": {
        "size_in_bytes": 0
      },
      "segments": {
        "count": 9,
        "memory_in_bytes": 101356,
        "terms_memory_in_bytes": 64698,
        "stored_fields_memory_in_bytes": 14160,
        "term_vectors_memory_in_bytes": 0,
        "norms_memory_in_bytes": 4992,
        "points_memory_in_bytes": 6150,
        "doc_values_memory_in_bytes": 11356,
        "index_writer_memory_in_bytes": 0,
        "version_map_memory_in_bytes": 0,
        "fixed_bit_set_memory_in_bytes": 0,
        "max_unsafe_auto_id_timestamp": 1518673926047,
        "file_sizes": {}
      },
      "translog": {
        "operations": 3180,
        "size_in_bytes": 1444518,
        "uncommitted_operations": 0,
        "uncommitted_size_in_bytes": 86
      },
      "request_cache": {
        "memory_size_in_bytes": 545570,
        "evictions": 0,
        "hit_count": 88,
        "miss_count": 248
      },
      "recovery": {
        "current_as_source": 0,
        "current_as_target": 0,
        "throttle_time_in_millis": 114
      }
    }
  }
}

t2 instances have burstable CPU and are, in my opinion, not very well suited for Elasticsearch (apart possibly from very light search use cases). Aggregations can use a fair amount of CPU in a short amount of time, which may deplete your CPU credits and cause performance problems. I would recommend using an m4/m5 instance instead.
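
If you want to confirm that credits are the problem, you can watch the CPUCreditBalance metric for the domain in CloudWatch, for example with the AWS CLI (the domain name, account id and time window below are placeholders):

aws cloudwatch get-metric-statistics \
  --namespace AWS/ES \
  --metric-name CPUCreditBalance \
  --dimensions Name=DomainName,Value=my-domain Name=ClientId,Value=123456789012 \
  --statistics Minimum --period 300 \
  --start-time 2018-02-14T00:00:00Z --end-time 2018-02-15T00:00:00Z

If the balance drops to zero while you are running your visualizations, the instance is being throttled, and a non-burstable instance type should help.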


BTW, you should give cloud.elastic.co a try :slight_smile: It's also available from the AWS Marketplace.


@dadoonet: I initially wanted to go with Elastic Cloud. It would also have given us the benefit of free X-Pack, at least for a while. Unfortunately, the powers that be decided otherwise. I will give it another try, but as we already use various AWS services, we will probably end up with an EC2 instance for ES.

BTW, does Elastic Cloud give you full access to log files and configuration files in the same way as running your own installation?

@Christian_Dahlqvist: I will give it a try. Though I was running a very similar setup with much more data on an Intel NUC i3 inside a VM with only 2GB heap, and that could actually search a lot more. I had to set some very large timeouts etc., but it worked.

Is it normal that only 18MB of data can require 2GB of heap? Also, I don't suppose more indices are going to make any difference unless I add more nodes? I'm now running everything off one index (no replicas).

No, that is not normal. Having lots of small shards, or using deeply nested aggregations that create a lot of buckets, could however drive up heap usage, so it would help if you could share the full output of the cluster stats API as well as details about the aggregations you are running.
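
For example, a request shaped like the one below (the field names are just an illustration, not your actual mapping) creates one bucket per minute per source address per destination port, and all of those buckets are held on the heap while the response is built, which is how a few MB of data can still exhaust a 2GB heap:

curl -s -H 'Content-Type: application/json' 'localhost:9200/netflow/_search?size=0&pretty' -d '
{
  "aggs": {
    "per_minute": {
      "date_histogram": { "field": "@timestamp", "interval": "1m" },
      "aggs": {
        "by_source": {
          "terms": { "field": "src_addr", "size": 100 },
          "aggs": {
            "by_dest_port": { "terms": { "field": "dst_port", "size": 100 } }
          }
        }
      }
    }
  }
}'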

The data is spread over only two indices, so I can't see that being the problem. I've removed the Elasticsearch Service instance and am now running an EC2 m5 instance. Once I get it running (for some reason the mapping I used before isn't working now...) I'll post the shards and aggregations I'm running.
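
I'll pull the shard layout and the cluster stats with something like this (localhost here; adjust the host to wherever the node is listening) and paste the output:

curl -s 'localhost:9200/_cat/shards?v'
curl -s 'localhost:9200/_cluster/stats?human&pretty'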
