I have a three-node cluster, each node with 114 GB of disk capacity. I am pushing syslog events from around 190 devices to this cluster. The events flow at a rate of 100k within 15 minutes. The index has 5 shards and uses the best_compression method. In spite of this, the disk fills up quickly, so I was forced to remove the replica for this index. The index size is 170 GB and each shard is 34.1 GB. If I get additional disk space and reindex this data into a new index with 3 shards and a replica, will it save disk space?
The size on disk will depend on how much you enrich your data and what mappings you use. This blog post discusses how you can optimise mappings in order to save space. Reducing the number of shards may help, especially if they are small, but I would still recommend optimising your mappings if you have not already.
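For a rough sense of the numbers in the question, here is a minimal back-of-the-envelope sketch in Python. It assumes that reindexing into 3 shards leaves the primary data at roughly 170 GB (the shard count mainly changes how the data is split, not how much there is), that each replica is a full copy of the primaries, and that the low disk watermark is at its default of 85%. The function and variable names are made up for illustration.

```python
def disk_footprint_gb(primary_gb, replicas):
    """Total on-disk size: the primaries plus `replicas` full copies of them."""
    return primary_gb * (1 + replicas)


NODES = 3
DISK_PER_NODE_GB = 114   # from the question
PRIMARY_GB = 170         # from the question; assumed unchanged by reindexing
LOW_WATERMARK = 0.85     # assumption: default low disk watermark

capacity_gb = NODES * DISK_PER_NODE_GB        # raw capacity of the cluster
usable_gb = capacity_gb * LOW_WATERMARK       # space before the low watermark

for replicas in (0, 1):
    needed_gb = disk_footprint_gb(PRIMARY_GB, replicas)
    verdict = "fits" if needed_gb <= usable_gb else "does not fit"
    print(f"replicas={replicas}: need {needed_gb} GB, "
          f"usable {usable_gb:.1f} GB of {capacity_gb} GB -> {verdict}")
```

Under these assumptions, one replica doubles the footprint to about 340 GB, which is essentially the whole 342 GB of raw capacity. Changing the shard count alone is unlikely to make room for a replica, so the savings would have to come from leaner mappings or from more disk.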
Thanks, the blog information was really helpful.