::: {ip-10-16-13-67-master}{MStdCN0_TSihH2muPN7NcQ}{zDNK_-UIQhaJy6tLBwCAqg}{10.240.53.104}{10.240.53.104:9300}{ml.machine_memory=32905895936, node_type=hot, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
Hot threads at 2020-04-02T18:11:54.911Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
86.6% (433ms out of 500ms) cpu usage by thread 'elasticsearch[ip-10-16-13-67-master][clusterApplierService#updateTask][T#1]'
9/10 snapshots sharing following 21 elements
org.apache.logging.log4j.core.Logger$PrivateConfig.<init>(Logger.java:383)
org.apache.logging.log4j.core.Logger.updateConfiguration(Logger.java:365)
org.apache.logging.log4j.core.LoggerContext.updateLoggers(LoggerContext.java:654)
org.apache.logging.log4j.core.LoggerContext.updateLoggers(LoggerContext.java:643)
org.apache.logging.log4j.core.config.Configurator.setLevel(Configurator.java:296)
org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:129)
org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:111)
org.elasticsearch.common.settings.ClusterSettings$LoggingSettingUpdater.apply(ClusterSettings.java:163)
org.elasticsearch.common.settings.ClusterSettings$LoggingSettingUpdater.apply(ClusterSettings.java:121)
org.elasticsearch.common.settings.AbstractScopedSettings$SettingUpdater.lambda$updater$0(AbstractScopedSettings.java:609)
org.elasticsearch.common.settings.AbstractScopedSettings$SettingUpdater$$Lambda$4307/2017848792.run(Unknown Source)
org.elasticsearch.common.settings.AbstractScopedSettings.applySettings(AbstractScopedSettings.java:190)
org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:461)
org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:419)
org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:163)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681)
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
unique snapshot
org.apache.logging.log4j.core.LoggerContext.updateLoggers(LoggerContext.java:655)
org.apache.logging.log4j.core.LoggerContext.updateLoggers(LoggerContext.java:643)
org.apache.logging.log4j.core.config.Configurator.setLevel(Configurator.java:296)
org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:129)
org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:111)
org.elasticsearch.common.settings.ClusterSettings$LoggingSettingUpdater.apply(ClusterSettings.java:163)
org.elasticsearch.common.settings.ClusterSettings$LoggingSettingUpdater.apply(ClusterSettings.java:121)
org.elasticsearch.common.settings.AbstractScopedSettings$SettingUpdater.lambda$updater$0(AbstractScopedSettings.java:609)
org.elasticsearch.common.settings.AbstractScopedSettings$SettingUpdater$$Lambda$4307/2017848792.run(Unknown Source)
org.elasticsearch.common.settings.AbstractScopedSettings.applySettings(AbstractScopedSettings.java:190)
org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:461)
org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:419)
org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:163)
org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681)
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
Your master is spending all its time configuring loggers? I don't think I've ever seen that before. Do you have an unusually complicated logging configuration?
Nothing complex. One thing I did notice was a filebeat index that was repeatedly trying to be created and failing due to a "doc" type, which appeared to be caused by an old filebeat* template. After removing that template the index was created fine. I'm wondering if that was the underlying issue.
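For reference, I inspected and removed the old template with roughly the following; the exact template name will vary:

GET _template/filebeat*

DELETE _template/filebeat-<old-version>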
I also mentioned in that channel that I'm seeing a lot of "authorization is not allowed" messages. Now, I know that without DEBUG these aren't actually being logged, but that doesn't mean the context switching isn't still happening. How do you stop this "noise" when using the FOSS model?
create-index runs at priority URGENT whereas mapping updates are HIGH and the ILM tasks are NORMAL, so if you spam your cluster hard enough with create-index requests (including bulks that need to create indices) then that can cause starvation for other tasks.
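If you want to see what's queued up on the master, including each task's priority, the pending tasks API shows it; the entry below is just illustrative:

GET _cluster/pending_tasks

{
  "tasks": [
    {
      "insert_order": 101,
      "priority": "URGENT",
      "source": "create-index [filebeat-2020.04.02], cause [auto(bulk api)]",
      "time_in_queue_millis": 86,
      "time_in_queue": "86ms"
    }
  ]
}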
I'm still surprised that we saw the master spending most of a 500ms sampling interval updating loggers. It looks like you've set logger._root dynamically -- it might be worth removing that setting.
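If it was applied as a transient cluster setting, clearing it should look something like this (use "persistent" instead if that's where it's set):

PUT _cluster/settings
{
  "transient": {
    "logger._root": null
  }
}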
The message "not opting out of the query cache; authorization is not allowed" doesn't appear in the OSS distribution. I don't know what context-switching you are asking about.
Yes, when debug logging for _root was enabled I was flooded with these messages, and the entire setup is OSS, so I'm not sure how it "doesn't appear in OSS"?
[2020-04-02T13:51:46,726][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.reporting-2019.05.12] not opting out of the query cache; authorization is not allowed
[2020-04-02T13:51:46,726][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.kibana_task_manager] not opting out of the query cache; authorization is not allowed
2019.05.05] not opting out of the query cache; authorization is not allowed
[2020-04-02T13:51:46,727][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.reporting-2019.05.12] not opting out of the query cache; authorization is not allowed
[2020-04-02T13:51:46,727][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.reporting-2019.05.12] not opting out of the query cache; authorization is not allowed
[2020-04-02T13:54:25,558][DEBUG][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] [ip-10-16-13-67-master] global-ordinals [shard.index][106] took [573micros]
[2020-04-02T13:54:25,558][DEBUG][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] [ip-10-16-13-67-master] global-ordinals [shard.state][3] took [163.4micros]
[2020-04-02T13:54:25,558][DEBUG][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] [ip-10-16-13-67-master] global-ordinals [shard.node][1] took [70.5micros]
[2020-04-02T13:54:25,558][DEBUG][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] [ip-10-16-13-67-master] global-ordinals [source_node.name][1] took [77.2micros]
[2020-04-02T13:54:25,559][DEBUG][o.e.i.f.p.SortedSetDVOrdinalsIndexFieldData] [ip-10-16-13-67-master] global-ordinals [source_node.uuid][1] took [76.7micros]
[2020-04-02T13:54:25,559][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.monitoring-es-6-2020.04.02] not opting out of the query cache; authorization is not allowed
[2020-04-02T13:54:23,612][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.kibana_2] not opting out of the query cache; authorization is not allowed
[2020-04-02T13:54:23,612][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.kibana_2] not opting out of the query cache; authorization is not allowed
[2020-04-02T13:54:23,612][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.kibana_2] not opting out of the query cache; authorization is not allowed
[2020-04-02T13:54:23,612][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.kibana_2] not opting out of the query cache; authorization is not allowed
[2020-04-02T13:54:23,612][DEBUG][o.e.x.s.a.a.OptOutQueryCache] [ip-10-16-13-67-master] [.kibana_2] not opting out of the query cache; authorization is not allowed
This is the default distribution, not the OSS distribution. The default distribution includes all the free features including those that are not purely OSS.
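If you want to double-check which distribution a node is running, the root endpoint reports the build flavour; the version number below is just a placeholder:

GET /

{
  "version": {
    "number": "6.8.0",
    "build_flavor": "default",
    ...
  },
  ...
}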
Coming back to the question of context switching, what exactly is your concern there? It sounds like you are worried about the cost of a DEBUG-level log when DEBUG-level logging is disabled. I don't think that involves any context-switching, it should be very cheap.
Yes, I'm leveraging the free features of X-Pack, but shouldn't there be something to enable in elasticsearch.yml to state that I'm using the "free" license and circumvent any of this code?
The sheer volume of these messages that spits out when DEBUG is enabled is what I'm concerned about. It has to add up. Disabling DEBUG doesn't mean the code isn't run; it just means the line logger.debug("not opting out of the query cache; authorization is not allowed"); isn't executed, while licenseState.isAuthAllowed() is still run unnecessarily.
If I didn't see it so frequently in the log for the .kibana_2, .reporting, etc. indices I would agree it should be cheap, but at that frequency it has to have some performance hit?
Unless you have evidence from a profiler that we're spending an excessive amount of time in licenseState.isAuthAllowed(), I don't think this is cause for concern, no. This code is part of the security functionality, which is included in the free basic license and enabled by default, which is why it's being called.
It looks like you can completely disable security by setting xpack.security.enabled: false which should make this logging go away, but I don't expect that to have much impact on anything.
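It's a static setting, so as far as I know it needs to go into elasticsearch.yml on every node and only takes effect after a restart:

# elasticsearch.yml (every node): disable the security module entirely
xpack.security.enabled: false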
Setting logger._root: DEBUG is rarely useful and not recommended. DEBUG-level logging is often only meaningful when read alongside the code itself, and even then you normally want to focus on a small number of packages. There's no reason to expect this level of logging to have a manageable volume or any actionable content.
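If you do need DEBUG logs for a specific investigation, it's better to enable them for a narrow package and remove the setting again afterwards. For example (the package name here is only illustrative):

PUT _cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.cluster.service": "DEBUG"
  }
}

and reset it once you're done:

PUT _cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.cluster.service": null
  }
}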
Thank You!
If I switch to the OSS packages I lose the X-Pack monitoring, correct?
I will look at the xpack.security.enabled: false setting to see if that helps, but to your point my original issue was probably the filebeat* template and the URGENT create-index tasks causing issues.