Where does Elasticsearch store/read logs

Greetings, everyone!
I'm a newbie to Elasticsearch, as well as to programming in general, and I don't understand some things.
I have a little Django project for practice, and I added Elasticsearch to it. I'm also using Docker.
On its own everything is working: when I do something in my project, Elasticsearch gets some logs and I can see them in Kibana.
But I also have 20 GB of logs from another app in a .tar file, and I want to import them into my project to see what I can do and how I can manage logs from different apps, different time ranges, etc.

I'm using Ubuntu 20.04.
Here is my docker-compose.yml file:

version: '3.7'

services:
  django_gunicorn:
    build:
      context: .
    volumes:
      - static:/static
    ports:
      - "8000:8000"
    depends_on:
      - psql
  
  nginx:
    build: ./short_url/nginx
    volumes:
      - static:/static
    ports:
      - "80:80"
    depends_on:
      - django_gunicorn
  
  psql:
    image: postgres:latest
    container_name: psql
    volumes:
      - test:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres

  logger:
    build:
      context: ./filebeat
    container_name: filebeat_logger
    restart: always
    user: root
    volumes:
      - /var/lib/docker:/var/lib/docker:ro
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - "django_gunicorn"
    labels:
      co.elastic.logs/enabled: "false"
    logging:
      driver: "json-file"
      options:
        max-size: "100M"
        max-file: "10"

  elasticsearch:
    container_name: elasticsearch
    restart: always
    image: "docker.elastic.co/elasticsearch/elasticsearch:7.13.4"
    environment:
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
      - "discovery.type=single-node"
      - "xpack.license.self_generated.type=basic"
    ports:
      - "9200:9200"
    volumes:
      - elasticsearch_data:/usr/share/elasticsearch/data
    labels:
      co.elastic.logs/enabled: "false"
    logging:
      driver: "json-file"
      options:
        max-size: "100M"
        max-file: "10"

  kibana:
    container_name: kibana
    restart: always
    image: "docker.elastic.co/kibana/kibana:7.13.4"
    ports:
      - "5601:5601"
    labels:
      co.elastic.logs/enabled: "false"
    logging:
      driver: "json-file"
      options:
        max-size: "100M"
        max-file: "10"

volumes:
  static:
  test:
  elasticsearch_data:

I tried to import the logs with this command in the terminal:
docker run --rm --volumes-from elasticsearch -v $PWD:/backup-dir bash -c "cd /usr/share/elasticsearch/data && tar xvf /backup-dir/mybackup.tar"
where mybackup.tar is the archive with the logs.

But Elasticsearch doesn't see those logs.
Checking the output of docker volume ls and looking into the /var/lib/docker/volumes directory, I can see that those 20 GB of logs were definitely extracted into the short_elasticsearch_data volume.

I think maybe Elasticsearch only sees the logs that are in elasticsearch_elasticsearch_data, but that volume is not connected to any existing container, and I don't know how to get the logs into a specific volume that is not connected to a container.
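
I guess something like the command below would extract the archive into a named volume by mounting it directly into a temporary container (ubuntu here is just an arbitrary image that has bash and tar, and the volume name is taken from my volume list), but I'm not sure this is even the right approach:

docker run --rm -v elasticsearch_elasticsearch_data:/data -v "$PWD":/backup-dir ubuntu \
    bash -c "cd /data && tar xvf /backup-dir/mybackup.tar"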

But then I deleted the whole short_elasticsearch_data volume and rebuilt the entire container, and all my logs were gone, even the ones that actually came from my own project.

So at the moment I have a couple of questions:

  1. Where are the logs that Elasticsearch sees actually stored?
  2. Is it possible to import logs in a way that lets me manage them with Elasticsearch? And if it is possible, how do I do that?

Thanks in advance!
P.S.: English is not my native language, so I apologize in advance for any misunderstanding, and I will try to clarify anything that might be confusing.

Welcome to the community @Ibicf and thank you for posting your query.

As I understand it, you are pushing your application logs directly to Elasticsearch, which is running in a separate container, and that is OK.
In a real-world scenario, you generally just write the logs to a log file of your application (also running as a container) and configure a log shipper like Filebeat to read all the log files from that server (or node, in Elastic terms). Filebeat essentially reads all the log files of all the applications and pushes them to Elasticsearch.
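
For example, a minimal filebeat.yml for shipping Docker container logs could look roughly like this (just a sketch, not your actual config: the hosts value assumes Filebeat can reach the Compose service name elasticsearch, and the path assumes Docker's default json-file logging driver):

filebeat.inputs:
  # read the JSON log files Docker writes for every container
  - type: container
    paths:
      - /var/lib/docker/containers/*/*.log

processors:
  # enrich each event with container name, image and labels
  - add_docker_metadata: ~

output.elasticsearch:
  # assumes the Compose service name "elasticsearch" resolves inside the network
  hosts: ["elasticsearch:9200"]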

About your queries:

  1. Logs stored in Elasticsearch are indexed as documents and persisted on disk as Lucene segment files, not as plain log files. You cannot read or write those files directly unless you work with the native Apache Lucene library that Elasticsearch uses internally. The path to the data directory (where the segment files live) is configured via path.data in the elasticsearch.yml configuration file; in the official Docker image it defaults to /usr/share/elasticsearch/data, which is why your compose file mounts the elasticsearch_data volume at that path.
  2. For getting the logs into Elasticsearch, you have a few options, depending on whether your logs are time series data, i.e. still being written to continuously - for that you would use Filebeat. If you just want to load a static log dump that is not being written to anymore, you can use either Filebeat or Logstash. In both cases, you first need to untar/unzip your log files into a directory and configure Beats/Logstash to read all the log files from that directory (see the sketch below).
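
As a rough illustration (the path /import/old-logs is just a made-up example; you would mount or copy your extracted 20 GB of logs there), a Filebeat input for such a static dump could look like:

filebeat.inputs:
  # plain log input pointing at the directory with the extracted archive
  - type: log
    paths:
      - /import/old-logs/*.log
    # tag the events so the imported logs are easy to filter in Kibana
    tags: ["imported", "old-app"]

output.elasticsearch:
  hosts: ["elasticsearch:9200"]

Filtering on those tags in Kibana then lets you keep the imported logs apart from the ones your Django project produces.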

For more information, please read the Elastic documentation carefully and follow the steps to configure your log agents.

