Data too large indices:data/read/search[phase/query]

Hi,
How can I increase this limit, to avoid disruption when reading data in Kibana?

    [parent] Data too large, data for [indices:data/read/search[phase/query]] would be [4093997030/3.8gb], which is larger than the limit of [4080218931/3.7gb], real usage: [4093995640/3.8gb], new bytes reserved: [1390/1.3kb], usages [inflight_requests=1390/1.3kb, model_inference=0/0b, eql_sequence=0/0b, fielddata=36900747/35.1mb, request=0/0b]

What does this show?

GET _nodes/stats/jvm/

Under the JVM section for your hot nodes....

     "jvm": {
        "timestamp": 1674514330238,
        "uptime_in_millis": 3982594145,
        "mem": {
          "heap_used_in_bytes": 1267712272,
          "heap_used_percent": 30,
          "heap_committed_in_bytes": 4202692608,
          "heap_max_in_bytes": 4202692608,
          "non_heap_used_in_bytes": 387954520,
          "non_heap_committed_in_bytes": 400424960,

Stats for the above query.

I'd rather say that I have a problem with the warm nodes when reading data in Kibana.

I need to understand what I'm doing wrong. It's natural that the hot tier has much more memory than the warm tier. So after the ILM policy moves this data to warm, should we do something with it (shrink / change the shard count)? What should be done during this process in ILM, or how can I limit queries from Kibana? (It was just a simple review from the Discover tab.)

On the hot tier nodes:

    "wgCiK7OqTbG6XUPTtv-_gg" : {
      "timestamp" : 1674516876515,
      "name" : "es_data_ssd_2_2",
      "transport_address" : "10.0.9.14:9300",
      "host" : "10.0.9.14",
      "ip" : "10.0.9.14:9300",
      "roles" : [
        "data_content",
        "data_hot"
      ],
      "attributes" : {
        "rack_id" : "rack_two",
        "xpack.installed" : "true"
      },
      "breakers" : {
        "request" : {
          "limit_size_in_bytes" : 8160437862,
          "limit_size" : "7.5gb",
          "estimated_size_in_bytes" : 1310720,
          "estimated_size" : "1.2mb",
          "overhead" : 1.0,
          "tripped" : 0
        },
        "inflight_requests" : {
          "limit_size_in_bytes" : 8589934592,
          "limit_size" : "8gb",
          "estimated_size_in_bytes" : 376915,
          "estimated_size" : "368kb",
          "overhead" : 2.0,
          "tripped" : 0
        },
        "model_inference" : {
          "limit_size_in_bytes" : 4294967296,
          "limit_size" : "4gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.0,
          "tripped" : 0
        },
        "eql_sequence" : {
          "limit_size_in_bytes" : 4294967296,
          "limit_size" : "4gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.0,
          "tripped" : 0
        },
        "fielddata" : {
          "limit_size_in_bytes" : 3435973836,
          "limit_size" : "3.1gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.03,
          "tripped" : 0
        },
        "parent" : {
          "limit_size_in_bytes" : 8160437862,
          "limit_size" : "7.5gb",
          "estimated_size_in_bytes" : 4106377312,
          "estimated_size" : "3.8gb",
          "overhead" : 1.0,
          "tripped" : 0
        }
      }

On the warm tier nodes:

    "HU8aioFzTcmPVppkZUdxlw" : {
      "timestamp" : 1674516876515,
      "name" : "es_data_hdd_3_3",
      "transport_address" : "10.0.9.31:9300",
      "host" : "10.0.9.31",
      "ip" : "10.0.9.31:9300",
      "roles" : [
        "data_warm"
      ],
      "attributes" : {
        "rack_id" : "rack_three",
        "xpack.installed" : "true"
      },
      "breakers" : {
        "request" : {
          "limit_size_in_bytes" : 4080218931,
          "limit_size" : "3.7gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.0,
          "tripped" : 0
        },
        "fielddata" : {
          "limit_size_in_bytes" : 1717986918,
          "limit_size" : "1.5gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.03,
          "tripped" : 0
        },
        "eql_sequence" : {
          "limit_size_in_bytes" : 2147483648,
          "limit_size" : "2gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.0,
          "tripped" : 0
        },
        "model_inference" : {
          "limit_size_in_bytes" : 2147483648,
          "limit_size" : "2gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.0,
          "tripped" : 0
        },
        "inflight_requests" : {
          "limit_size_in_bytes" : 4294967296,
          "limit_size" : "4gb",
          "estimated_size_in_bytes" : 8408,
          "estimated_size" : "8.2kb",
          "overhead" : 2.0,
          "tripped" : 0
        },
        "parent" : {
          "limit_size_in_bytes" : 4080218931,
          "limit_size" : "3.7gb",
          "estimated_size_in_bytes" : 1778009752,
          "estimated_size" : "1.6gb",
          "overhead" : 1.0,
          "tripped" : 0
        }
      }
    },

Hi @INS

Apologies, what version are you on?

How do you know? But OK, let's assume.

There can be a few reasons ... but in short you are running over your heap.
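
To back that up, here is a small Python sketch that pulls the byte counts out of the error message from the first post. The function name `parse_breaker_message` and the regular expressions are my own, for illustration only, and they assume the message keeps this wording.

```python
import re

# Error text copied from the first post (the trailing "usages" list is omitted).
MESSAGE = (
    "[parent] Data too large, data for [indices:data/read/search[phase/query]] "
    "would be [4093997030/3.8gb], which is larger than the limit of "
    "[4080218931/3.7gb], real usage: [4093995640/3.8gb], "
    "new bytes reserved: [1390/1.3kb]"
)


def parse_breaker_message(message: str) -> dict:
    """Extract the raw byte counts from a parent circuit-breaker message."""
    patterns = {
        "would_be": r"would be \[(\d+)/",
        "limit": r"limit of \[(\d+)/",
        "real_usage": r"real usage: \[(\d+)/",
        "new_bytes": r"new bytes reserved: \[(\d+)/",
    }
    return {
        name: int(re.search(pattern, message).group(1))
        for name, pattern in patterns.items()
    }


parsed = parse_breaker_message(MESSAGE)

# True when the heap was over the limit before this request asked for anything.
already_over = parsed["real_usage"] > parsed["limit"]

print(parsed)
print("heap above the limit before this request:", already_over)
```

The search itself only tried to reserve about 1.3kb. The real heap usage was already past the parent limit, so almost any request sent to that node at that moment would have been rejected, not just this Kibana query.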

First, I see that the warm node has 4GB of heap, which is not very large, especially if you have a lot of fields, indices, shards... That can cause you to run out of heap during read or write operations.
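
As a sanity check on where the 3.7gb limit comes from, this sketch assumes the default breaker settings: a parent limit of 95% of heap (the default when `indices.breaker.total.use_real_memory` is enabled) and a fielddata limit of 40% of heap. If those settings were changed on your cluster, the numbers will differ. The helper name `breaker_limit` is just for illustration.

```python
def breaker_limit(heap_max_bytes: int, percent: int) -> int:
    """Return a breaker limit as a whole-number percentage of the heap, rounded down."""
    return heap_max_bytes * percent // 100


# heap_max_in_bytes reported by the warm node (4GB).
warm_heap_max = 4294967296

# Assumed defaults: parent breaker 95% of heap, fielddata breaker 40% of heap.
parent_limit = breaker_limit(warm_heap_max, 95)
fielddata_limit = breaker_limit(warm_heap_max, 40)

print("parent limit:   ", parent_limit)
print("fielddata limit:", fielddata_limit)
```

Both results match the `limit_size_in_bytes` values the warm node reports for its `parent` and `fielddata` breakers, so the limit is simply following the 4GB heap. Raising the percentage would leave almost no headroom; the heap size is the number that matters.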

"DyMOMyeWQSSiMYHiY00GSg" : {
      "timestamp" : 1674515278756,
      "name" : "es_data_hdd_1_1",
      "transport_address" : "10.0.9.115:9300",
      "host" : "10.0.9.115",
      "ip" : "10.0.9.115:9300",
      "roles" : [
        "data_warm"
      ],
      "attributes" : {
        "rack_id" : "rack_one",
        "xpack.installed" : "true"
      },
      "jvm" : {
        "timestamp" : 1674515278756,
        "uptime_in_millis" : 660382257,
        "mem" : {
          "heap_used_in_bytes" : 3991299504,
          "heap_used_percent" : 92, <!--- Running Very Hot
          "heap_committed_in_bytes" : 4294967296,  <!--- 4GB 
          "heap_max_in_bytes" : 4294967296,
          "non_heap_used_in_bytes" : 238926808,
          "non_heap_committed_in_bytes" : 245956608,

So let's take a closer look at that node and see what we find. Please share the results of this command, plus the version of the stack.

GET /_nodes/DyMOMyeWQSSiMYHiY00GSg/stats?metric=indices,jvm,breaker

Please hold on. Yesterday I redeployed this cluster from scratch, so I need to wait for the issue to come back. I think it will come back soon.

@stephenb OK, I caught the same case on another node, and now I have the stats.

{
  "_nodes" : {
    "total" : 1,
    "successful" : 0,
    "failed" : 1,
    "failures" : [
      {
        "type" : "failed_node_exception",
        "reason" : "Failed node [MZ-C2rmjTgC4C-WLXA_KFQ]",
        "node_id" : "MZ-C2rmjTgC4C-WLXA_KFQ",
        "caused_by" : {
          "type" : "circuit_breaking_exception",
          "reason" : "[parent] Data too large, data for [cluster:monitor/nodes/stats[n]] would be [4106389772/3.8gb], which is larger than the limit of [4080218931/3.7gb], real usage: [4106389280/3.8gb], new bytes reserved: [492/492b], usages [model_inference=0/0b, eql_sequence=0/0b, fielddata=103148105/98.3mb, request=0/0b, inflight_requests=492/492b]",
          "bytes_wanted" : 4106389772,
          "bytes_limit" : 4080218931,
          "durability" : "PERMANENT"
        }
      }
    ]
  },
  "cluster_name" : "elk_cluster",
  "nodes" : { }
}
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elk_cluster",
  "nodes" : {
    "MZ-C2rmjTgC4C-WLXA_KFQ" : {
      "timestamp" : 1675155284377,
      "name" : "es_data_hdd_5_1",
      "transport_address" : "10.0.9.214:9300",
      "host" : "10.0.9.214",
      "ip" : "10.0.9.214:9300",
      "roles" : [
        "data_warm"
      ],
      "attributes" : {
        "rack_id" : "rack_one",
        "xpack.installed" : "true"
      },
      "indices" : {
        "docs" : {
          "count" : 4408698196,
          "deleted" : 0
        },
        "shard_stats" : {
          "total_count" : 528
        },
        "store" : {
          "size_in_bytes" : 1606382858178,
          "total_data_set_size_in_bytes" : 1606382858178,
          "reserved_in_bytes" : 0
        },
        "indexing" : {
          "index_total" : 8085108,
          "index_time_in_millis" : 209702,
          "index_current" : 0,
          "index_failed" : 0,
          "delete_total" : 0,
          "delete_time_in_millis" : 0,
          "delete_current" : 0,
          "noop_update_total" : 0,
          "is_throttled" : false,
          "throttle_time_in_millis" : 0
        },
        "get" : {
          "total" : 0,
          "time_in_millis" : 0,
          "exists_total" : 0,
          "exists_time_in_millis" : 0,
          "missing_total" : 0,
          "missing_time_in_millis" : 0,
          "current" : 0
        },
        "search" : {
          "open_contexts" : 0,
          "query_total" : 7597,
          "query_time_in_millis" : 20782869,
          "query_current" : 0,
          "fetch_total" : 282,
          "fetch_time_in_millis" : 23437,
          "fetch_current" : 0,
          "scroll_total" : 75,
          "scroll_time_in_millis" : 2277788,
          "scroll_current" : 0,
          "suggest_total" : 0,
          "suggest_time_in_millis" : 0,
          "suggest_current" : 0
        },
        "merges" : {
          "current" : 0,
          "current_docs" : 0,
          "current_size_in_bytes" : 0,
          "total" : 20,
          "total_time_in_millis" : 88605,
          "total_docs" : 13483958,
          "total_size_in_bytes" : 748275470,
          "total_stopped_time_in_millis" : 0,
          "total_throttled_time_in_millis" : 19118,
          "total_auto_throttle_in_bytes" : 12811692218
        },
        "refresh" : {
          "total" : 1604,
          "total_time_in_millis" : 41281,
          "external_total" : 1337,
          "external_total_time_in_millis" : 41673,
          "listeners" : 0
        },
        "flush" : {
          "total" : 616,
          "periodic" : 616,
          "total_time_in_millis" : 2059
        },
        "warmer" : {
          "current" : 0,
          "total" : 651,
          "total_time_in_millis" : 133
        },
        "query_cache" : {
          "memory_size_in_bytes" : 54523743,
          "total_count" : 37784,
          "hit_count" : 12062,
          "miss_count" : 25722,
          "cache_size" : 503,
          "cache_count" : 503,
          "evictions" : 0
        },
        "fielddata" : {
          "memory_size_in_bytes" : 100143792,
          "evictions" : 0
        },
        "completion" : {
          "size_in_bytes" : 0
        },
        "segments" : {
          "count" : 5770,
          "memory_in_bytes" : 0,
          "terms_memory_in_bytes" : 0,
          "stored_fields_memory_in_bytes" : 0,
          "term_vectors_memory_in_bytes" : 0,
          "norms_memory_in_bytes" : 0,
          "points_memory_in_bytes" : 0,
          "doc_values_memory_in_bytes" : 0,
          "index_writer_memory_in_bytes" : 0,
          "version_map_memory_in_bytes" : 0,
          "fixed_bit_set_memory_in_bytes" : 0,
          "max_unsafe_auto_id_timestamp" : 1674774061516,
          "file_sizes" : { }
        },
        "translog" : {
          "operations" : 0,
          "size_in_bytes" : 29040,
          "uncommitted_operations" : 0,
          "uncommitted_size_in_bytes" : 29040,
          "earliest_last_modified_age" : 7494558
        },
        "request_cache" : {
          "memory_size_in_bytes" : 1765752,
          "evictions" : 0,
          "hit_count" : 3428,
          "miss_count" : 1026
        },
        "recovery" : {
          "current_as_source" : 0,
          "current_as_target" : 0,
          "throttle_time_in_millis" : 13637052
        },
        "bulk" : {
          "total_operations" : 1626,
          "total_time_in_millis" : 216141,
          "total_size_in_bytes" : 2692338928,
          "avg_time_in_millis" : 131,
          "avg_size_in_bytes" : 1623956
        }
      },
      "jvm" : {
        "timestamp" : 1675155283586,
        "uptime_in_millis" : 638731679,
        "mem" : {
          "heap_used_in_bytes" : 4052139984,
          "heap_used_percent" : 94,
          "heap_committed_in_bytes" : 4294967296,
          "heap_max_in_bytes" : 4294967296,
          "non_heap_used_in_bytes" : 237456848,
          "non_heap_committed_in_bytes" : 244252672,
          "pools" : {
            "young" : {
              "used_in_bytes" : 8388608,
              "max_in_bytes" : 0,
              "peak_used_in_bytes" : 2529165312,
              "peak_max_in_bytes" : 0
            },
            "old" : {
              "used_in_bytes" : 4043128320,
              "max_in_bytes" : 4294967296,
              "peak_used_in_bytes" : 4245606400,
              "peak_max_in_bytes" : 4294967296
            },
            "survivor" : {
              "used_in_bytes" : 623056,
              "max_in_bytes" : 0,
              "peak_used_in_bytes" : 322961408,
              "peak_max_in_bytes" : 0
            }
          }
        },
        "threads" : {
          "count" : 326,
          "peak_count" : 326
        },
        "gc" : {
          "collectors" : {
            "young" : {
              "collection_count" : 52432,
              "collection_time_in_millis" : 428935
            },
            "old" : {
              "collection_count" : 0,
              "collection_time_in_millis" : 0
            }
          }
        },
        "buffer_pools" : {
          "mapped" : {
            "count" : 13400,
            "used_in_bytes" : 776426767983,
            "total_capacity_in_bytes" : 776426767983
          },
          "direct" : {
            "count" : 368,
            "used_in_bytes" : 73475831,
            "total_capacity_in_bytes" : 73475829
          },
          "mapped - 'non-volatile memory'" : {
            "count" : 0,
            "used_in_bytes" : 0,
            "total_capacity_in_bytes" : 0
          }
        },
        "classes" : {
          "current_loaded_count" : 26176,
          "total_loaded_count" : 26699,
          "total_unloaded_count" : 523
        }
      },
      "breakers" : {
        "model_inference" : {
          "limit_size_in_bytes" : 2147483648,
          "limit_size" : "2gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.0,
          "tripped" : 0
        },
        "eql_sequence" : {
          "limit_size_in_bytes" : 2147483648,
          "limit_size" : "2gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.0,
          "tripped" : 0
        },
        "fielddata" : {
          "limit_size_in_bytes" : 1717986918,
          "limit_size" : "1.5gb",
          "estimated_size_in_bytes" : 100143792,
          "estimated_size" : "95.5mb",
          "overhead" : 1.03,
          "tripped" : 0
        },
        "request" : {
          "limit_size_in_bytes" : 4080218931,
          "limit_size" : "3.7gb",
          "estimated_size_in_bytes" : 0,
          "estimated_size" : "0b",
          "overhead" : 1.0,
          "tripped" : 0
        },
        "inflight_requests" : {
          "limit_size_in_bytes" : 4294967296,
          "limit_size" : "4gb",
          "estimated_size_in_bytes" : 246,
          "estimated_size" : "246b",
          "overhead" : 2.0,
          "tripped" : 0
        },
        "parent" : {
          "limit_size_in_bytes" : 4080218931,
          "limit_size" : "3.7gb",
          "estimated_size_in_bytes" : 4064722896,
          "estimated_size" : "3.7gb",
          "overhead" : 1.0,
          "tripped" : 424233
        }
      }
    }
  }
}
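
For reading a response like this one quickly, a small helper along these lines can pick out the fields that matter here. The function name `summarize_node` and the trimmed `sample` dict are illustrative only; the values are copied from the output above.

```python
def summarize_node(node: dict) -> dict:
    """Condense one node's stats entry into the heap-related numbers of interest."""
    parent = node["breakers"]["parent"]
    fill = 100 * parent["estimated_size_in_bytes"] / parent["limit_size_in_bytes"]
    return {
        "name": node["name"],
        "heap_used_percent": node["jvm"]["mem"]["heap_used_percent"],
        "shards": node["indices"]["shard_stats"]["total_count"],
        "parent_breaker_fill_percent": round(fill, 1),
        "parent_breaker_tripped": parent["tripped"],
    }


# Trimmed copy of the entry for es_data_hdd_5_1 from the response above.
sample = {
    "name": "es_data_hdd_5_1",
    "indices": {"shard_stats": {"total_count": 528}},
    "jvm": {"mem": {"heap_used_percent": 94}},
    "breakers": {
        "parent": {
            "limit_size_in_bytes": 4080218931,
            "estimated_size_in_bytes": 4064722896,
            "tripped": 424233,
        }
    },
}

summary = summarize_node(sample)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The things to notice are the heap sitting at 94%, the parent breaker estimate being within a fraction of a percent of its limit, and a `tripped` count in the hundreds of thousands on a node with 528 shards.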

Hi @INS

You still have not provided the version you are on, which is important.

In short, it looks like you are running out of JVM heap.

You have a LOT of shards (500+) for a 4GB heap...
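
To put rough numbers on that, here is a quick calculation using the values from the node stats you posted. Nothing here is measured beyond what is in that output; it is only arithmetic on it.

```python
GIB = 1024 ** 3

# Values copied from the node stats output for es_data_hdd_5_1.
shard_count = 528
store_bytes = 1606382858178
heap_max_bytes = 4294967296

# Shard density relative to heap, and the average on-disk size of one shard.
shards_per_gib_heap = shard_count / (heap_max_bytes / GIB)
avg_shard_gib = store_bytes / shard_count / GIB

print(f"shards per GiB of heap: {shards_per_gib_heap:.0f}")
print(f"average shard size: {avg_shard_gib:.2f} GiB")
```

That density is far above the rule of thumb of roughly 20 shards per GB of heap from older Elastic sizing guidance (newer versions changed how this is estimated, which is one reason the version matters). The shards also average under 3 GiB each, so there looks to be room to consolidate them.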

There are a number of factors that consume heap...

The number of field mappings, the number of shards, etc.

Give us the version and perhaps we can help, but a short-term fix would be to increase the JVM heap size or clean up your indices, shards, mappings...
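
On the cleanup side, one option that fits the ILM question asked earlier is to reduce the shard count when indices move to the warm tier. This is only a sketch of what the warm phase of such a policy could look like, written as a Python dict so it can be printed as JSON. The `min_age` value and the target of one shard per index are assumptions, not values from this cluster, and whether shrinking is appropriate depends on your index sizes and version.

```python
import json

# Hypothetical warm phase only; an existing policy would keep its own hot phase.
# "7d" and the single-shard target are placeholder assumptions.
warm_phase_policy = {
    "policy": {
        "phases": {
            "warm": {
                "min_age": "7d",
                "actions": {
                    # Shrink each index to fewer primary shards.
                    "shrink": {"number_of_shards": 1},
                    # Merge segments down, since warm indices are no longer written to.
                    "forcemerge": {"max_num_segments": 1},
                },
            }
        }
    }
}

# Body that could be sent with a PUT to _ilm/policy/<your-policy-name>.
body = json.dumps(warm_phase_policy, indent=2)
print(body)
```

Fewer, larger shards on the warm nodes should mean less per-shard overhead on a 4GB heap, but test it on a non-critical index first.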
