Elasticsearch 7.10 durability permanent

Hello team,

My Elasticsearch cluster v7.10 has the following error:

reason" : "[parent] Data too large, data for [<http_request>] would be [17110531128/15.9gb], which is larger than the limit of [16320875724/15.1gb], real usage: [17110531128/15.9gb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=0/0b, in_flight_requests=0/0b, model_inference=0/0b, accounting=669569972/638.5mb]",
    "bytes_wanted" : 17110531128,
    "bytes_limit" : 16320875724,
    "durability" : "PERMANENT"
  },
  "status" : 429
}

I haven't been able to execute a single query. This server has 8 cores and 32 GB of RAM, and the heap size is set to 16g. Can anyone provide any hints on this, please? Thanks in advance.

Elasticsearch 7.10 is EOL and no longer supported. Please upgrade ASAP.

(This is an automated response from your friendly Elastic bot. Please report this post if you have any suggestions or concerns :elasticheart: )

It looks like you do not have enough heap space. What is the full output of the cluster stats API?
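For reference, the `bytes_limit` in the error lines up with the parent circuit breaker (`indices.breaker.total.limit`), which in 7.x defaults to 95% of the JVM heap when the real-memory breaker is enabled. A quick sanity check in Python, using the 16g heap from your jvm.options:

```python
# Parent circuit breaker default in ES 7.x with the real-memory
# breaker enabled: 95% of the configured JVM heap.
heap_bytes = 16 * 1024**3        # -Xmx16g -> 17179869184 bytes
limit = int(heap_bytes * 0.95)   # parent breaker threshold

print(heap_bytes)  # 17179869184 (matches heap_max_in_bytes)
print(limit)       # 16320875724 (matches bytes_limit in the error)
```

So the breaker is behaving as configured: real heap usage (15.9gb) is already above 95% of the 16g heap, which is why every request is rejected with 429.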

Thanks @Christian_Dahlqvist for your response:

Here's the heap size:

heap.current heap.percent heap.max
      15.2gb           95     16gb

Also, the server has 32 GB of RAM, and the heap size in jvm.options is -Xms16g -Xmx16g.

The cluster has more than 200 active shards, and this is a single node with no replicas. Does that have an influence?

Here's the output of the stats API:

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "qis-ep",
  "cluster_uuid" : "JZF8_1tGRe2mN9JubP1u_A",
  "timestamp" : 1663788253589,
  "status" : "red",
  "indices" : {
    "count" : 18,
    "shards" : {
      "total" : 18,
      "primaries" : 18,
      "replication" : 0.0,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "primaries" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "replication" : {
          "min" : 0.0,
          "max" : 0.0,
          "avg" : 0.0
        }
      }
    },
    "docs" : {
      "count" : 32073266,
      "deleted" : 2571
    },
    "store" : {
      "size_in_bytes" : 12300364879,
      "reserved_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size_in_bytes" : 0,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 0,
      "total_count" : 0,
      "hit_count" : 0,
      "miss_count" : 0,
      "cache_size" : 0,
      "cache_count" : 0,
      "evictions" : 0
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 253,
      "memory_in_bytes" : 207333756,
      "terms_memory_in_bytes" : 174175296,
      "stored_fields_memory_in_bytes" : 130136,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 25046272,
      "points_memory_in_bytes" : 0,
      "doc_values_memory_in_bytes" : 7982052,
      "index_writer_memory_in_bytes" : 0,
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set_memory_in_bytes" : 296,
      "max_unsafe_auto_id_timestamp" : 1663778615179,
      "file_sizes" : { }
    },
    "mappings" : {
      "field_types" : [
        {
          "name" : "binary",
          "count" : 8,
          "index_count" : 1
        },
        {
          "name" : "boolean",
          "count" : 59,
          "index_count" : 18
        },
        {
          "name" : "date",
          "count" : 2008,
          "index_count" : 273
        },
        {
          "name" : "flattened",
          "count" : 9,
          "index_count" : 1
        },
        {
          "name" : "float",
          "count" : 2728991,
          "index_count" : 13
        },
        {
          "name" : "integer",
          "count" : 22,
          "index_count" : 5
        },
        {
          "name" : "ip",
          "count" : 10,
          "index_count" : 10
        },
        {
          "name" : "keyword",
          "count" : 372311,
          "index_count" : 277
        },
        {
          "name" : "long",
          "count" : 6966,
          "index_count" : 32
        },
        {
          "name" : "nested",
          "count" : 11,
          "index_count" : 6
        },
        {
          "name" : "object",
          "count" : 275449,
          "index_count" : 276
        },
        {
          "name" : "text",
          "count" : 372090,
          "index_count" : 277
        },
        {
          "name" : "unsigned_long",
          "count" : 1764,
          "index_count" : 12
        }
      ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [ ],
      "analyzer_types" : [ ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [ ],
      "built_in_filters" : [ ],
      "built_in_analyzers" : [ ]
    }
  },
  "nodes" : {
    "count" : {
      "total" : 1,
      "coordinating_only" : 0,
      "data" : 1,
      "data_cold" : 1,
      "data_content" : 1,
      "data_hot" : 1,
      "data_warm" : 1,
      "ingest" : 1,
      "master" : 1,
      "ml" : 1,
      "remote_cluster_client" : 1,
      "transform" : 1,
      "voting_only" : 0
    },
    "versions" : [
      "7.10.2"
    ],
    "os" : {
      "available_processors" : 8,
      "allocated_processors" : 8,
      "names" : [
        {
          "name" : "Linux",
          "count" : 1
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "Debian GNU/Linux 10 (buster)",
          "count" : 1
        }
      ],
      "mem" : {
        "total_in_bytes" : 32893620224,
        "free_in_bytes" : 3265601536,
        "used_in_bytes" : 29628018688,
        "free_percent" : 10,
        "used_percent" : 90
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 66
      },
      "open_file_descriptors" : {
        "min" : 537,
        "max" : 537,
        "avg" : 537
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 49787,
      "versions" : [
        {
          "version" : "11.0.14",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "11.0.14+9-post-Debian-1deb10u1",
          "vm_vendor" : "Debian",
          "bundled_jdk" : true,
          "using_bundled_jdk" : false,
          "count" : 1
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 13734673448,
        "heap_max_in_bytes" : 17179869184
      },
      "threads" : 72
    },
    "fs" : {
      "total_in_bytes" : 1055813427200,
      "free_in_bytes" : 971787694080,
      "available_in_bytes" : 918083932160
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "security4" : 1
      },
      "http_types" : {
        "security4" : 1
      }
    },
    "discovery_types" : {
      "single-node" : 1
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "deb",
        "count" : 1
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 1,
      "processor_stats" : {
        "gsub" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        },
        "script" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        }
      }
    }
  }
}

Also, sometimes when I restart Elasticsearch I'm able to execute some queries for a couple of seconds, then the heap fills up very fast and I get the 429 PERMANENT error again.

It looks like you have 18 indices, each with a single primary shard - not 200 active shards.

These seem to take up a limited amount of memory (about 207 MB). Nothing here that seems to be causing problems.

One thing that stands out is the mappings, which seem quite large for that number of indices.

What type of data is this? How are you querying it?
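If it helps to narrow down the mapping size, a filtered cluster stats call along these lines (Dev Tools syntax) returns just the per-field-type counts shown in the output above:

```
GET _cluster/stats?filter_path=indices.mappings.field_types&human
```

A field count in the millions (the `float` type above reports 2,728,991 fields across 13 indices) usually points at a mapping explosion from dynamically mapped nested JSON, which by itself can keep the heap under constant pressure.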

@Christian_Dahlqvist

I know for sure there are 200+ shards. That output was from the stats API, executed right after a service restart, before the heap was consumed and queries stopped working. I ran the command immediately so I could at least get some info to show you. This is log data, by the way.

@Christian_Dahlqvist

This is system information data from endpoints. The format is nested JSON. The backend server application is written in Python.

There are other indices in the Elasticsearch server. There are logs, alerts, commands, batches, and devices (i.e. system info). I think the devices index is what's causing the problem, due to its large documents (3-4 MB each).

I don't know where to start troubleshooting this cluster. The physical RAM is 32 GB and the heap size is 16g, but whenever I start Elasticsearch the heap fills up in less than a minute and I get the 429 durability PERMANENT error.

I'm not sure if this will solve what you are seeing, but I know that a lot of optimizations have been done since 7.10. So you should upgrade to the latest 8.x :wink:

@dadoonet Thanks so much. I know that upgrading to 8.x is a good path to solving a lot of issues from 7.x, but that's not an option for me right now. Is there anything else I could do instead of upgrading? Any workarounds that you know of? Thanks in advance.

I would stop all queries and indexing to the cluster and start it up. That way you should be able to get complete output from the cluster stats API. That will give us more accurate information about the state of the cluster.
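Something along these lines (Dev Tools syntax) should capture what we need right after startup; the second is the same _cat command used earlier in the thread:

```
GET _cluster/stats?human&pretty
GET _cat/nodes?v&h=heap.current,heap.percent,heap.max
```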

@Christian_Dahlqvist Thanks for the reply. I will do that now. Are there any specific stats you would like to see?

At least upgrade to the latest 7.17.

Hey @dadoonet, I will explore that option with the team. Also, to provide more context: before entering this state, the cluster was at least operational. I changed the number of replicas on all the indices from 1 to 0 to resolve the YELLOW state, and then all of this began to happen.
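For reference, the replica change described above can be applied with a settings update along these lines (Dev Tools syntax; `*` targets all indices):

```
PUT */_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}
```

Note that on a single node this only changes shard accounting and cluster health; it doesn't free heap, since replica shards were never allocated anywhere to begin with.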