My setup is like this: I have a single instance of Elasticsearch with one node, 5 shards and 1 replica. I also use curator for maintaining the cluster. My problem is that the ES API reports lower disk usage for the indices than is actually used. For example, going to Elasticsearch's data directory and running du -hs there shows 10 GB of disk usage, while at the same time the API reports that only ~5 GB are used. This is actually quite a dangerous situation, because there is a risk that my server will run out of disk space, since curator won't remove old logs. I don't have any closed indices.

Another thing I would like to know is whether there is any way to set the number of replicas permanently from the beginning. Before Elasticsearch 5 it was possible in the .yml file, but now setting any index-level option there throws an exception on startup and points me to the API for such operations. I can use curator to manage replicas, but it would be very convenient to set it once at the beginning.
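For the replicas question, my understanding is that since index-level settings were removed from elasticsearch.yml, the intended way to apply a default to all future indices is an index template. A minimal sketch (the template name and the catch-all `*` pattern here are just placeholders; in 6.x+ the `template` field is renamed `index_patterns`):

```
PUT _template/default_replicas
{
  "template": "*",
  "order": 0,
  "settings": {
    "index.number_of_replicas": 1
  }
}
```

This only affects indices created after the template exists; existing indices would still need their replica count changed via the settings API or curator.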
But the main problem is the incorrect disk-usage detection. Curator gets the information from the API that ~5 GB is used. That's not curator's fault, because I can see the same usage with the standard diagnostic API commands. There may be a problem with my config, and something unnecessary may be growing inside the data folder, so I'll post my elasticsearch.yml here:
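For example, these are the kinds of diagnostic calls I compare against du (assuming the default localhost:9200 endpoint from my config; `?v` just adds column headers):

```
curl 'localhost:9200/_cat/allocation?v'
curl 'localhost:9200/_cat/indices?v&bytes=mb'
curl 'localhost:9200/_stats/store?pretty'
```

All of them agree with each other at ~5 GB, which is why I suspect the discrepancy is in what the store stats count rather than in curator.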
```yaml
bootstrap.memory_lock: true
index.codec: best_compression
indices.fielddata.cache.size: 40%
network.host: localhost
http.port: 9200
http.compression: true
```
If there is any more info you'd like to see, I can produce more output.