Filebeat doesn't send the logs to Elastic Cloud (circuit_breaking_exception)

Hi!

I don't know why I get this error (Filebeat agents running in a Kubernetes cluster). We increased the size of the cluster so it would have more capacity, and changed the indices from 1 shard and 1 replica to 2 shards and 1 replica expecting that to be more optimal, but now it works worse than before.

This is the error I see in the Filebeat pod:

2020-07-29T12:37:35.671Z	ERROR	[elasticsearch]	elasticsearch/client.go:223	failed to perform any bulk index operations: 429 Too Many Requests: {"error":{"root_cause":[{"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [2041914692/1.9gb], which is larger than the limit of [2040109465/1.8gb], real usage: [2041909248/1.9gb], new bytes reserved: [5444/5.3kb], usages [request=0/0b, fielddata=180233/176kb, in_flight_requests=5444/5.3kb, accounting=27779116/26.4mb]","bytes_wanted":2041914692,"bytes_limit":2040109465,"durability":"PERMANENT"}],"type":"circuit_breaking_exception","reason":"[parent] Data too large, data for [<http_request>] would be [2041914692/1.9gb], which is larger than the limit of [2040109465/1.8gb], real usage: [2041909248/1.9gb], new bytes reserved: [5444/5.3kb], usages [request=0/0b, fielddata=180233/176kb, in_flight_requests=5444/5.3kb, accounting=27779116/26.4mb]","bytes_wanted":2041914692,"bytes_limit":2040109465,"durability":"PERMANENT"},"status":429}

Any suggestion?

Thank you very much

Hi

Is there any reason why this happens? :frowning:

Thank you

How many Beats are indexing into the cluster? How many indices and shards are you actively indexing into? How much data do you have in the cluster?

If you could provide the full output of the cluster stats API we would get a better idea about the state of the cluster.
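For reference, the output can be retrieved with the cluster stats API. A minimal example, assuming an Elastic Cloud deployment (the endpoint and credentials below are placeholders, not real values):

```shell
# Hypothetical endpoint and user; substitute your own deployment's values.
curl -u elastic:<password> \
  "https://<your-deployment-endpoint>:9243/_cluster/stats?human&pretty"
```

The `human` flag renders byte counts in readable units, which makes the heap and store figures easier to discuss here.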

Does this help?

As I said, I have 2 shards and 1 replica for almost all my indices.

Version: 7.8.0

Nodes: 3
Disk Available
67.07%
162.3 GB / 242.0 GB

JVM Heap
62.49%
2.9 GB / 4.6 GB

Indices: 76
Documents: 54,448,810
Disk Usage: 72.8 GB
Primary Shards: 116
Replica Shards: 116

Thank you

Can you please provide the full output of the cluster stats API?

{
  "_nodes" : {
    "total" : 3,
    "successful" : 3,
    "failed" : 0
  },
  "cluster_name" : "8a3af794e5b7464c9389dd64dee07860",
  "cluster_uuid" : "Ii2dPs_ITa-FL1ZDfTZKMA",
  "timestamp" : 1596184190552,
  "status" : "green",
  "indices" : {
    "count" : 69,
    "shards" : {
      "total" : 204,
      "primaries" : 102,
      "replication" : 1.0,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 4,
          "avg" : 2.9565217391304346
        },
        "primaries" : {
          "min" : 1,
          "max" : 2,
          "avg" : 1.4782608695652173
        },
        "replication" : {
          "min" : 1.0,
          "max" : 1.0,
          "avg" : 1.0
        }
      }
    },
    "docs" : {
      "count" : 48798886,
      "deleted" : 694768
    },
    "store" : {
      "size_in_bytes" : 69728558111
    },
    "fielddata" : {
      "memory_size_in_bytes" : 249840,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 89281611,
      "total_count" : 19698157,
      "hit_count" : 13508225,
      "miss_count" : 6189932,
      "cache_size" : 2295,
      "cache_count" : 4105,
      "evictions" : 1810
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 2261,
      "memory_in_bytes" : 49403640,
      "terms_memory_in_bytes" : 40103368,
      "stored_fields_memory_in_bytes" : 2331448,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 69504,
      "points_memory_in_bytes" : 0,
      "doc_values_memory_in_bytes" : 6899320,
      "index_writer_memory_in_bytes" : 135718268,
      "version_map_memory_in_bytes" : 1679732,
      "fixed_bit_set_memory_in_bytes" : 11188400,
      "max_unsafe_auto_id_timestamp" : 1596182513602,
      "file_sizes" : { }
    },
    "mappings" : {
      "field_types" : [
        {
          "name" : "alias",
          "count" : 1122,
          "index_count" : 33
        },
        {
          "name" : "binary",
          "count" : 11,
          "index_count" : 4
        },
        {
          "name" : "boolean",
          "count" : 3385,
          "index_count" : 54
        },
        {
          "name" : "date",
          "count" : 3314,
          "index_count" : 66
        },
        {
          "name" : "double",
          "count" : 843,
          "index_count" : 36
        },
        {
          "name" : "flattened",
          "count" : 2,
          "index_count" : 2
        },
        {
          "name" : "float",
          "count" : 896,
          "index_count" : 38
        },
        {
          "name" : "geo_point",
          "count" : 264,
          "index_count" : 33
        },
        {
          "name" : "geo_shape",
          "count" : 3,
          "index_count" : 3
        },
        {
          "name" : "half_float",
          "count" : 24,
          "index_count" : 6
        },
        {
          "name" : "integer",
          "count" : 145,
          "index_count" : 13
        },
        {
          "name" : "ip",
          "count" : 3399,
          "index_count" : 33
        },
        {
          "name" : "keyword",
          "count" : 77709,
          "index_count" : 68
        },
        {
          "name" : "long",
          "count" : 29587,
          "index_count" : 57
        },
        {
          "name" : "nested",
          "count" : 78,
          "index_count" : 46
        },
        {
          "name" : "object",
          "count" : 20845,
          "index_count" : 67
        },
        {
          "name" : "scaled_float",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "short",
          "count" : 3334,
          "index_count" : 34
        },
        {
          "name" : "text",
          "count" : 3493,
          "index_count" : 60
        }
      ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [
        {
          "name" : "pattern_capture",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "analyzer_types" : [
        {
          "name" : "custom",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [
        {
          "name" : "uax_url_email",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_filters" : [
        {
          "name" : "lowercase",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "unique",
          "count" : 1,
          "index_count" : 1
        }
      ],
      "built_in_analyzers" : [ ]
    }
  },
  "nodes" : {
    "count" : {
      "total" : 3,
      "coordinating_only" : 0,
      "data" : 2,
      "ingest" : 2,
      "master" : 3,
      "ml" : 0,
      "remote_cluster_client" : 3,
      "transform" : 2,
      "voting_only" : 1
    },
    "versions" : [
      "7.8.0"
    ],
    "os" : {
      "available_processors" : 54,
      "allocated_processors" : 6,
      "names" : [
        {
          "name" : "Linux",
          "count" : 3
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "CentOS Linux 7 (Core)",
          "count" : 3
        }
      ],
      "mem" : {
        "total_in_bytes" : 9663676416,
        "free_in_bytes" : 3768320,
        "used_in_bytes" : 9659908096,
        "free_percent" : 0,
        "used_percent" : 100
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 6
      },
      "open_file_descriptors" : {
        "min" : 375,
        "max" : 1558,
        "avg" : 1140
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 2163566242,
      "versions" : [
        {
          "version" : "14.0.1",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "14.0.1+7",
          "vm_vendor" : "AdoptOpenJDK",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 3
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 2774695512,
        "heap_max_in_bytes" : 4938792960
      },
      "threads" : 188
    },
    "fs" : {
      "total_in_bytes" : 259845521408,
      "free_in_bytes" : 185039581184,
      "available_in_bytes" : 185039581184
    },
    "plugins" : [
      {
        "name" : "repository-s3",
        "version" : "7.8.0",
        "elasticsearch_version" : "7.8.0",
        "java_version" : "1.8",
        "description" : "The S3 repository plugin adds S3 repositories",
        "classname" : "org.elasticsearch.repositories.s3.S3RepositoryPlugin",
        "extended_plugins" : [ ],
        "has_native_controller" : false
      },
      {
        "name" : "repository-gcs",
        "version" : "7.8.0",
        "elasticsearch_version" : "7.8.0",
        "java_version" : "1.8",
        "description" : "The GCS repository plugin adds Google Cloud Storage support for repositories.",
        "classname" : "org.elasticsearch.repositories.gcs.GoogleCloudStoragePlugin",
        "extended_plugins" : [ ],
        "has_native_controller" : false
      }
    ],
    "network_types" : {
      "transport_types" : {
        "security4" : 3
      },
      "http_types" : {
        "security4" : 3
      }
    },
    "discovery_types" : {
      "zen" : 3
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "docker",
        "count" : 3
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 16,
      "processor_stats" : {
        "append" : {
          "count" : 15662626,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 2252
        },
        "conditional" : {
          "count" : 31630406,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 21143
        },
        "date" : {
          "count" : 7913266,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 10522
        },
        "geoip" : {
          "count" : 15826532,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 35533
        },
        "grok" : {
          "count" : 23750528,
          "failed" : 51795,
          "current" : 0,
          "time_in_millis" : 47344
        },
        "gsub" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        },
        "pipeline" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        },
        "remove" : {
          "count" : 23745827,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 3818
        },
        "rename" : {
          "count" : 23745827,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 4370
        },
        "script" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        },
        "set" : {
          "count" : 7831313,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 812
        },
        "split" : {
          "count" : 15826532,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 6064
        },
        "user_agent" : {
          "count" : 7913266,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 2732
        }
      }
    }
  }
}

That seems like a lot of shards given the data size and the size of the cluster. You are also using ingest pipelines, and I am not sure how much memory those consume. I would start by reducing the number of indices and shards: set the number of primary shards to 1 for all indices, and look into using ILM to get a larger average shard size if you are not doing so already.
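As a sketch of the first step, assuming the Filebeat indices are matched by a legacy index template (the template name and pattern here are examples, not your actual configuration), the primary shard count can be set back to 1 for newly created indices like this on 7.8:

```json
PUT _template/filebeat-shards-override
{
  "index_patterns": ["filebeat-*"],
  "order": 10,
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}
```

The higher `order` makes these settings win over a lower-order template matching the same pattern; existing indices keep their current shard count, so the change only applies from the next rollover or daily index onward.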

If I understood correctly, the problem is that I have gone from 1 shard / 1 replica to 2 shards / 1 replica, and this is not good in my case because the shards are very small?

Regarding "using ILM to get a larger average shard size if you are not already": I have an ILM policy that rotates the indices every day, but some indices end up with less than 2 GB.

Correct. You could also switch from daily to e.g. monthly indices.
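One way to get larger shards without fixed monthly boundaries is to let ILM roll over on size as well as age. A sketch of such a policy (the policy name and thresholds are illustrative, not tuned recommendations for this cluster):

```json
PUT _ilm/policy/filebeat-size-rollover
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "30gb",
            "max_age": "30d"
          }
        }
      }
    }
  }
}
```

With this, an index rolls over when it reaches 30 GB total or 30 days old, whichever comes first, so low-volume indices accumulate data instead of producing a new small index every day.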

And these field-parsing errors, are they related to the logs not being sent?