Elasticsearch fails to load after log4j2.properties update

I was trying to drop the logging level of our cluster, and after reading through the Logging Configuration page I thought I had a good idea of how to drop the level down to warn. After going through the documentation and then our log4j2.properties file, I found that the name of the logging hierarchy is org.elasticsearch.action (see the excerpt at the very bottom). With that in mind, in Dev Tools I ran:

PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.action": "warn"
  }
}  

This was acknowledged as true by the cluster. After a restart of the node, ES fails to load. After some troubleshooting I realized that I may have forgotten to add .level after action, which could be throwing log4j2.properties off, but I'm not even sure that's correct since that's not part of the command listed in the documentation.

Also, when I run

GET /_cluster/settings?include_defaults=true

I can see that the above command updated the settings, and a little farther down there's logger.level: INFO as well, which is probably where this problem is coming from:

{
  "persistent" : {
    "xpack" : {
      "monitoring" : {
        "collection" : {
          "enabled" : "true"
        }
      }
    }
  },
  "transient" : {
    "logger" : {
      "org" : {
        "elasticsearch" : {
          "action" : "warn"
        }
      }
    }
  },
  "defaults" : {
    "cluster" : {
      "routing" : {
        "use_adaptive_replica_selection" : "false",
        "rebalance" : {
          "enable" : "all"
        },
        "allocation" : {
          "node_concurrent_incoming_recoveries" : "2",
          "node_initial_primaries_recoveries" : "4",
          "same_shard" : {
            "host" : "false"
          },
          "total_shards_per_node" : "-1",
          "type" : "balanced",
          "disk" : {
            "threshold_enabled" : "true",
            "watermark" : {
              "low" : "85%",
              "flood_stage" : "95%",
              "high" : "90%"
            },
            "include_relocations" : "true",
            "reroute_interval" : "60s"
          },
          "awareness" : {
            "attributes" : [ ]
          },
          "balance" : {
            "index" : "0.55",
            "threshold" : "1.0",
            "shard" : "0.45"
          },
          "enable" : "all",
          "node_concurrent_outgoing_recoveries" : "2",
          "allow_rebalance" : "indices_all_active",
          "cluster_concurrent_rebalance" : "2",
          "node_concurrent_recoveries" : "2"
        }
      },
      "indices" : {
        "tombstones" : {
          "size" : "500"
        },
        "close" : {
          "enable" : "true"
        }
      },
      "nodes" : {
        "reconnect_interval" : "10s"
      },
      "persistent_tasks" : {
        "allocation" : {
          "enable" : "all"
        }
      },
      "blocks" : {
        "read_only_allow_delete" : "false",
        "read_only" : "false"
      },
      "service" : {
        "slow_task_logging_threshold" : "30s"
      },
      "name" : "infradata_np",
      "max_shards_per_node" : "1000",
      "remote" : {
        "node" : {
          "attr" : ""
        },
        "initial_connect_timeout" : "30s",
        "connect" : "true",
        "connections_per_cluster" : "3"
      },
      "info" : {
        "update" : {
          "interval" : "30s",
          "timeout" : "15s"
        }
      }
    },
    "no" : {
      "model" : {
        "state" : {
          "persist" : "false"
        }
      }
    },
    "logger" : {
      "level" : "INFO"
    },
  **There's more than this, but this is what's relevant for now**
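Side note: the same check is a bit easier to read with the flat_settings query parameter (assuming it's supported on our version), since the transient logger entry then shows up as a single dotted key instead of the nested objects above:

GET /_cluster/settings?include_defaults=true&flat_settings=true

With that, I'd expect the transient block to show "logger.org.elasticsearch.action" : "warn" on one line.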

What's the best way to fix this? I don't see an extra line that's been created in the log4j2.properties file or anything. Should the command have used "logger.org.elasticsearch.action.level": "warn" to make this work? I'm pretty sure this setting is what's causing the issue. We do have a copy of log4j2.properties called log4j2.properties_original that might be useful. I just don't want to go changing things right now that could make this worse.

Here is the part of the log4j2.properties file that I got the name of the logging hierarchy from:

status = error

# log action execution errors for easier debugging
logger.action.name = org.elasticsearch.action
logger.action.level = debug

appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %m%n

appender.rolling.type = RollingFile
appender.rolling.name = rolling
appender.rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}.log
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %.-10000m%n
appender.rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}-%d{yyyy-MM-dd}-%i.log.gz
**Note: the file continues; I just included what I thought was relevant to this thread**
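For what it's worth, if the goal is only to quiet that logger, I'm assuming the same change could also be made directly in log4j2.properties by editing the existing logger.action.level line (not tested, just a sketch based on the excerpt above):

# log action execution errors for easier debugging
logger.action.name = org.elasticsearch.action
# dropped from debug to warn so only warnings and errors from this logger are written
logger.action.level = warn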

Is the best thing to do here to just run:

PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.action": null
  }
}

just to clear this value so that it's no longer read by ES?
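I'm assuming I could then confirm the reset took effect with a plain settings call and check that the transient block is empty again:

GET /_cluster/settings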

You may also have to comment out the logger.action.name = org.elasticsearch.action and the logger.action.level = debug lines within the log4j2.properties file in order to get things working again.
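For reference, commenting them out in a properties file is just a leading #, so something like:

# logger.action.name = org.elasticsearch.action
# logger.action.level = debug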

The order of operations I used was to comment out the logger lines above, save the file, and then run the PUT command with null.

I'm not sure whether the original PUT command I applied changed the read/write permissions of a bunch of files on our system, but I had to go into the node that wasn't starting and change a lot of file permissions back to read/write (drw) because many of them had been changed to write-only, which was stopping the files from loading. Once I updated the appropriate files, ES was able to start back up. I also tested whether I could uncomment the log4j2.properties lines, which I did, and I was able to restart the service with no problems.
