Index data size is too big

I deployed an ELK system on Ubuntu and use Filebeat to collect logs. But the index size is huge, and I can't figure out why...

This is my Logstash setting:

input {
  beats {
    port => 8903
  }
}

output {
    elasticsearch {
        hosts => localhost
        manage_template => false
        index => "huopu_tool-%{+YYYY.MM.dd}"
    }
}

This is my Filebeat setting:

filebeat.prospectors:
- input_type: log
  paths:
    - /var/log/nginx/access.log*
  exclude_files: [".gz$"]
  document_type: nginx_access

- input_type: log
  paths:
    - /var/log/nginx/error.log*
  exclude_files: [".gz$"]
  document_type: nginx_error

- input_type: log
  paths:
    - /home/deploy/projects/site/shared/log/production.log
  document_type: rails_production

- input_type: log
  paths:
    - /home/deploy/projects/site/shared/log/puma_access.log
  document_type: puma_access

- input_type: log
  paths:
    - /home/deploy/projects/site/shared/log/puma_error.log
  document_type: puma_error

- input_type: log
  paths:
    - /home/deploy/projects/site/shared/log/sidekiq.log
  document_type: sidekiq

output.logstash:
  hosts: ["localhost:8903"]

And these are my Elasticsearch index settings, mostly the defaults:

"settings" : {
  "index" : {
    "creation_date" : "1505887670966",
    "number_of_shards" : "5",
    "number_of_replicas" : "1",
    "uuid" : "h5EuSxuJTOaMU9MRFxMvOg",
    "version" : {
      "created" : "5060099"
    },
    "provided_name" : "huopu_tool-2017.09.20"
  }
}

FYI we’ve renamed ELK to the Elastic Stack, otherwise Beats feels left out :wink:

Why do you say it's too big?

How much space your data takes up on disk depends on a number of things, e.g. the amount of data added through enrichment and the mappings you are using. I wrote a blog post discussing this that may be useful and give you some ideas about how to optimize your mappings.
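
As a rough sketch of what that could look like for the setup above: the first request compares document count with on-disk size, and the second installs an index template with leaner mappings. The template name, the `ignore_above` value, and the choice to keep only `message` as full-text are assumptions for illustration, not settings taken from this thread. The syntax targets Elasticsearch 5.x (the version shown in the index settings); later versions use different template keys. Because the Logstash output has `manage_template => false`, the template would have to be installed by hand, e.g. from the Kibana Dev Tools console.

```
# Compare document count with on-disk size for each daily index
GET _cat/indices/huopu_tool-*?v&h=index,docs.count,store.size,pri.store.size

# Hypothetical template; the name "huopu_tool" and the field choices are assumptions
PUT _template/huopu_tool
{
  "template": "huopu_tool-*",
  "settings": {
    "index.codec": "best_compression"
  },
  "mappings": {
    "_default_": {
      "_all": { "enabled": false },
      "dynamic_templates": [
        {
          "strings_as_keyword": {
            "match_mapping_type": "string",
            "mapping": { "type": "keyword", "ignore_above": 1024 }
          }
        }
      ],
      "properties": {
        "message": { "type": "text", "norms": false }
      }
    }
  }
}
```

A template only applies to indices created after it is installed, so with daily indices the effect would show up from the next day's index onward. Mapping strings as `keyword` only saves space but gives up full-text search on those fields, which is why `message` is kept as `text` here.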

My bad :stuck_out_tongue:, I'm a newbie to the Elastic Stack world.

It's embarrassing, but I thought the Document Count of 41.1m was the size of the received log files.

You reminded me to check my log files. I found that I was collecting the wrong log, and it is 7 GB in size...

Thank you very much! Your article helped me a lot. I had searched many articles but still had no idea how to save disk space. I will follow your article and give it a try.
