Elasticsearch Crashing OOM

My local Enterprise Search development environment keeps crashing while indexing documents.
Here is my docker-compose.yml:

version: '3'

services:
  setup:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.6.1
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
    user: '0'
    command: >
      bash -c '
        if [ ! -f certs/ca.zip ]; then
          echo "Creating CA";
          bin/elasticsearch-certutil ca --silent --pem -out config/certs/ca.zip;
          unzip config/certs/ca.zip -d config/certs;
        fi;
        if [ ! -f certs/certs.zip ]; then
          echo "Creating certs";
          echo -ne \
          "instances:\n"\
          "  - name: elasticsearch\n"\
          "    dns:\n"\
          "      - elasticsearch\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          > config/certs/instances.yml;
          bin/elasticsearch-certutil cert --silent --pem -out config/certs/certs.zip --in config/certs/instances.yml --ca-cert config/certs/ca/ca.crt --ca-key config/certs/ca/ca.key;
          unzip config/certs/certs.zip -d config/certs;
        fi;
        echo "Setting file permissions"
        chown -R root:root config/certs;
        find . -type d -exec chmod 750 \{\} \;;
        find . -type f -exec chmod 640 \{\} \;;
        echo "Waiting for Elasticsearch availability";
        until curl -s --cacert config/certs/ca/ca.crt https://elasticsearch:9200 | grep -q "missing authentication credentials"; do sleep 30; done;
        echo "Setting kibana_system password";
        until curl -s -X POST --cacert config/certs/ca/ca.crt -u elastic:elastic -H "Content-Type: application/json" https://elasticsearch:9200/_security/user/kibana_system/_password -d "{\"password\":\"kibana\"}" | grep -q "^{}"; do sleep 10; done;
        echo "All done!";
      '
    healthcheck:
      test: ['CMD-SHELL', '[ -f config/certs/elasticsearch/elasticsearch.crt ]']
      interval: 1s
      timeout: 5s
      retries: 120

  elasticsearch:
    depends_on:
      setup:
        condition: service_healthy
    image: docker.elastic.co/elasticsearch/elasticsearch:8.6.1
    volumes:
      - certs:/usr/share/elasticsearch/config/certs
      - esdata01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    environment:
      ES_JAVA_OPTS: -Xms1024m -Xmx1024m
      bootstrap.memory_lock: true
      node.name: elasticsearch
      cluster.name: es-cluster
      cluster.initial_master_nodes: elasticsearch
      ELASTIC_PASSWORD: elastic
      xpack.security.enabled: true
      xpack.security.http.ssl.enabled: true
      xpack.security.http.ssl.key: certs/elasticsearch/elasticsearch.key
      xpack.security.http.ssl.certificate: certs/elasticsearch/elasticsearch.crt
      xpack.security.http.ssl.certificate_authorities: certs/ca/ca.crt
      xpack.security.http.ssl.verification_mode: certificate
      xpack.security.transport.ssl.enabled: true
      xpack.security.transport.ssl.key: certs/elasticsearch/elasticsearch.key
      xpack.security.transport.ssl.certificate: certs/elasticsearch/elasticsearch.crt
      xpack.security.transport.ssl.certificate_authorities: certs/ca/ca.crt
      xpack.security.transport.ssl.verification_mode: certificate
      xpack.license.self_generated.type: basic
    mem_limit: 1500000000
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test:
        [
          'CMD-SHELL',
          "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'"
        ]
      interval: 10s
      timeout: 10s
      retries: 120

  kibana:
    depends_on:
      elasticsearch:
        condition: service_healthy
    image: docker.elastic.co/kibana/kibana:8.6.1
    volumes:
      - certs:/usr/share/kibana/config/certs
      - kibanadata:/usr/share/kibana/data
    ports:
      - 5601:5601
    environment:
      - SERVERNAME=kibana
      - ELASTICSEARCH_HOSTS=https://elasticsearch:9200
      - ELASTICSEARCH_USERNAME=kibana_system
      - ELASTICSEARCH_PASSWORD=kibana
      - ELASTICSEARCH_SSL_CERTIFICATEAUTHORITIES=config/certs/ca/ca.crt
      - ENTERPRISESEARCH_HOST=http://enterprisesearch:3002
    mem_limit: 1073741824
    healthcheck:
      test:
        [
          'CMD-SHELL',
          "curl -s -I http://localhost:5601 | grep -q 'HTTP/1.1 302 Found'"
        ]
      interval: 10s
      timeout: 10s
      retries: 120

  enterprisesearch:
    depends_on:
      elasticsearch:
        condition: service_healthy
      kibana:
        condition: service_healthy
    image: docker.elastic.co/enterprise-search/enterprise-search:8.6.1
    volumes:
      - certs:/usr/share/enterprise-search/config/certs
      - enterprisesearchdata:/usr/share/enterprise-search/config
    ports:
      - 3002:3002
    environment:
      ES_JAVA_OPTS: -Xms1024m -Xmx1024m
      SERVERNAME: enterprisesearch
      secret_management.encryption_keys: '[4a2cd3f81d39bf28738c10db0ca782095ffac07279561809eecc722e0c20eb09]'
      allow_es_settings_modification: true
      elasticsearch.host: https://elasticsearch:9200
      elasticsearch.username: elastic
      elasticsearch.password: elastic
      elasticsearch.ssl.enabled: true
      elasticsearch.ssl.certificate_authority: /usr/share/enterprise-search/config/certs/ca/ca.crt
      kibana.external_url: http://kibana:5601
    mem_limit: 1500000000
    healthcheck:
      test:
        [
          'CMD-SHELL',
          "curl -s -I http://localhost:3002 | grep -q 'HTTP/1.1 302 Found'"
        ]
      interval: 10s
      timeout: 10s
      retries: 120

volumes:
  certs:
    driver: local
  enterprisesearchdata:
    driver: local
  esdata01:
    driver: local
  kibanadata:
    driver: local

networks:
  stack: {}

This OOM error kills the Elasticsearch instance during the indexing process. It consistently crashes after somewhere between 600,000 and 800,000 documents have been indexed. I have tried raising the memory for the JVM and for Docker itself, and no matter how much I give it, it crashes. I am out of ideas for getting this working on my local machine.

How much memory did you increase for both Elasticsearch and Enterprise Search?

What do you have in the logs when it crashes? Please share logs.

I tried upping both to 2g. I am rerunning the indexing process now, expecting it to crash. I'll post those logs soon. Keep in mind this is just my local machine.

{"@timestamp":"2023-02-22T18:37:28.385Z", "log.level": "INFO", "message":"[.ent-search-engine-documents-feed-engine/qjtsbnvcSZ2TCRfFMjvQCg] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"O0KZ6A30QOqUwn3OT3_M9w","elasticsearch.node.id":"pLbONbJPRj2MJVvTubg7-g","elasticsearch.node.name":"elasticsearch","elasticsearch.cluster.name":"es-cluster"}

{"@timestamp":"2023-02-22T18:37:30.318Z", "log.level": "INFO", "message":"[.ent-search-engine-documents-feed-engine/qjtsbnvcSZ2TCRfFMjvQCg] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"O0KZ6A30QOqUwn3OT3_M9w","elasticsearch.node.id":"pLbONbJPRj2MJVvTubg7-g","elasticsearch.node.name":"elasticsearch","elasticsearch.cluster.name":"es-cluster"}

{"@timestamp":"2023-02-22T18:37:33.945Z", "log.level": "INFO", "message":"[.ent-search-engine-documents-feed-engine/qjtsbnvcSZ2TCRfFMjvQCg] update_mapping [_doc]", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[elasticsearch][masterService#updateTask][T#1]","log.logger":"org.elasticsearch.cluster.metadata.MetadataMappingService","elasticsearch.cluster.uuid":"O0KZ6A30QOqUwn3OT3_M9w","elasticsearch.node.id":"pLbONbJPRj2MJVvTubg7-g","elasticsearch.node.name":"elasticsearch","elasticsearch.cluster.name":"es-cluster"}


ERROR: Elasticsearch exited unexpectedly

Honestly, the errors aren't helpful. I don't know how to get anything better from the container logs.

All these services should be logging to stdout in their containers. You should be able to watch the logs with docker-compose logs -f. You could also open a shell in a given container (for example with docker exec) and tail the log files manually.
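When the only visible message is "Elasticsearch exited unexpectedly", the container's recorded state can show whether the kernel OOM killer ended it. A minimal sketch, assuming the service names from the compose file in this thread and the Compose v2 CLI (`docker compose`; with v1, substitute `docker-compose`). The helper names `was_oom_killed` and `tail_service_logs` are made up for illustration:

```shell
# Print "true" if Docker recorded the container as killed by the OOM killer.
# $1 is a container name or ID.
was_oom_killed() {
  docker inspect --format '{{.State.OOMKilled}}' "$1"
}

# Show the last N log lines (default 200) for one compose service.
# $1 is the service name, $2 the optional line count.
tail_service_logs() {
  docker compose logs --tail "${2:-200}" "$1"
}

# Example usage (not run here):
#   tail_service_logs elasticsearch 500
#   was_oom_killed "$(docker compose ps -a -q elasticsearch)"
```

A result of `true` points at the container's memory limit or Docker's total memory rather than at the JVM heap itself, which would more typically show up as a Java OutOfMemoryError in the Elasticsearch log.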

When we do local development on our laptops, we set Elasticsearch to "ES_JAVA_OPTS=-Xms512m -Xmx512m" and let Enterprise Search and Kibana use their default memory sizes. But I recommend making sure that your Docker settings allocate at least 4 GB of RAM to Docker. This is set differently depending on which OS you use. For me on macOS using Docker Desktop, I can find this under Docker -> Settings -> Resources -> Memory. There's a chance that your containers don't have enough off-heap memory available to them, so it can be about more than just increasing the Xmx.
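To make the off-heap point concrete, here is a rough sketch that subtracts the JVM heap from the container's memory limit. The numbers come from the compose file earlier in this thread (`mem_limit: 1500000000` and `-Xmx1024m` for the elasticsearch service); the function name `offheap_headroom_mib` is hypothetical, and the calculation ignores everything else the process needs, so treat it as an estimate only:

```shell
# Off-heap headroom in MiB: container memory limit (bytes) minus JVM heap (MiB).
# A small or negative result means non-heap memory has little room
# before the container reaches its limit.
offheap_headroom_mib() {
  limit_bytes=$1
  heap_mib=$2
  echo $(( limit_bytes / 1024 / 1024 - heap_mib ))
}

# Values from the compose file in this thread.
offheap_headroom_mib 1500000000 1024
```

Raising the heap without raising `mem_limit` shrinks this headroom, which is one way that "giving it more memory" can make an OOM kill more likely rather than less.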

I have 6 GB allocated in Docker Desktop. I upped it from 4 GB.

Ok, great!
And you're still hitting issues, and you're sure they're OOMEs? If so, you could try removing the mem_limit lines from your docker-compose.yml, as well as the Xmx bits, and see if that helps.

Have you gotten a stacktrace that you can share?

I have reduced ES memory to ES_JAVA_OPTS: -Xms512m -Xmx512m and removed all other memory limits in the docker compose. I am running my indexing job now watching the logs so I can get them if it crashes.

@Sean_Story Lowering the ES memory and removing all the other memory restrictions appears to have done the trick. My guess is that I had the ES memory too high, so all my containers combined forced Docker to kill one of them, and ES, being the highest consumer, was the one it killed. We also run local Redis and Postgres, which adds some extra consumption.
Thanks for your help.
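The arithmetic behind that guess can be sketched as follows. It adds up the three `mem_limit` values from the original compose file and compares them with a Docker Desktop allocation of 4 GB, assumed here to mean 4096 MiB. Note that `mem_limit` is a cap, not a reservation, so this is a worst case; the function name `total_mib` is hypothetical:

```shell
# Sum byte counts given as arguments and print the total in whole MiB.
total_mib() {
  total=0
  for bytes in "$@"; do
    total=$(( total + bytes ))
  done
  echo $(( total / 1024 / 1024 ))
}

# mem_limit values from the original compose file:
# elasticsearch, kibana, enterprisesearch.
reserved=$(total_mib 1500000000 1073741824 1500000000)

# Assumed Docker Desktop allocation of 4 GB, expressed in MiB.
docker_mib=4096

echo "limits add up to ${reserved} MiB; $(( docker_mib - reserved )) MiB left for everything else"
```

Whatever is left over has to cover any other containers sharing the same Docker memory, such as the Redis and Postgres instances mentioned above.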