Kibana requestTimeout and shardTimeout are not respected

Hi,

Here's our Kibana configuration in Kubernetes:

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  annotations:
    common.k8s.elastic.co/controller-version: 1.0.1
  creationTimestamp: "2020-04-09T20:01:31Z"
  generation: 5
  labels:
    app.kubernetes.io/instance: kibana
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kibana
    app.kubernetes.io/version: 7.6.1
    helm.sh/chart: kibana-1.0.1
  name: kibana
  namespace: elastic-system
  resourceVersion: "7317456"
  selfLink: /apis/kibana.k8s.elastic.co/v1/namespaces/elastic-system/kibanas/kibana
  uid: f9e74876-1c3f-4e2d-b37a-387209ef0f6e
spec:
  config:
    elasticsearch.hosts:
    - http://elasticsearch-data.elastic-system.svc.cluster.local:9200/
    elasticsearch.password: "SECRET_PASSWORD_HERE"
    elasticsearch.requestHeadersWhitelist:
    - es-security-runas-user
    - authorization
    elasticsearch.requestTimeout: 120000
    elasticsearch.shardTimeout: 120000
    elasticsearch.ssl.verificationMode: none
    elasticsearch.username: kibana
    server.host: 0.0.0.0
    xpack.monitoring.elasticsearch.requestHeadersWhitelist:
    - es-security-runas-user
    - authorization
    xpack.monitoring.enabled: true
    xpack.reporting.enabled: true
    xpack.security.enabled: true
  count: 2
  elasticsearchRef:
    name: ""
  http:
    service:
      spec:
        type: ClusterIP
  podTemplate:
    spec:
      containers:
      - name: kibana
        resources:
          limits:
            cpu: "1"
            memory: 2Gi
          requests:
            cpu: "1"
            memory: 2Gi
  version: 7.6.1

Our indexing rate ranges from 3,000 to 8,000 docs per second, for a total of 250 million docs per day.
We have 10 data nodes, each with the following capacity, holding around 4.5 billion documents in total at the moment:

resources:
  limits:
    cpu: "2"
    memory: 5Gi
  requests:
    cpu: "2"
    memory: 5Gi
javaOpts: "-Xms2500m -Xmx2500m"
persistenceStorage: "1000Gi"

Here's our heap usage over the past week (max heap is 25 GB):

When I run a long-running query to get all the data, I get timeout errors after exactly 60 seconds, even though the timeout is set to 120 seconds.
Sometimes the query completes in under 60 seconds and sometimes it fails. Please let us know why the timeout config is not working.

Second part of the question:

Even though the official docs insist on setting the heap to 50% of available RAM, it seems we are not using most of the available heap. Do you think search would improve if I decreased the heap size to 1.5 GB?
Any recommendations to improve search performance are highly appreciated.

Have you been able to verify the timeout is occurring in the Kibana server? You should see a log message about a timeout in the Kibana server logs if you configure Kibana with logging.verbose: true.
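For an ECK-managed Kibana like the one in the question, that setting would presumably go under spec.config alongside the existing timeout settings. A minimal sketch, showing only the added key (the rest of the spec is assumed unchanged):

```yaml
# Fragment of the Kibana resource from the question; only the added line is shown.
spec:
  config:
    # Verbose server logging, so a timeout raised by Kibana itself shows up in the pod logs.
    logging.verbose: true
```

Verbose logging is noisy, so it is probably worth removing again once the source of the timeout is identified.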

If you don't see those logs, it's likely that you have another proxy or load balancer that is enforcing a timeout. If that is the case, you may see some logs in the Kibana server about a socket hangup or client disconnect.

Regarding your Elasticsearch questions, I would recommend making a separate post in the #elasticsearch forum.

Thank you, we were using nginx-ingress-controller which has a default timeout of 60 seconds.
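For anyone hitting the same limit, a sketch of raising it, assuming the community kubernetes/ingress-nginx controller (other NGINX-based controllers use different annotation names). The Ingress name below is hypothetical; only the annotations matter:

```yaml
# Fragment of the Ingress that fronts Kibana (hypothetical resource name).
metadata:
  name: kibana
  namespace: elastic-system
  annotations:
    # Values are in seconds and must be quoted strings; both default to 60.
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "120"
```

Setting these to match elasticsearch.requestTimeout (120000 ms in the configuration above) keeps the proxy from cutting the connection before Kibana's own timeout applies.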

Great! That same thing has caught me too many times :)
