I've been trying to optimize Elasticsearch for Logstash (threads, indexing memory, number of shards, compression, etc.) and I'm stuck on disk usage. As a test I indexed 4GB of IIS logs into a two-shard index on a two-node cluster, with one replica per shard. I'm seeing that each primary shard is 4GB and each replica is 4GB (fine, that's a given; it's a replica).
I'm really confused as to why each shard is 4GB. Shouldn't the two shards together add up to 4GB, not 8GB? At this point, 4GB of logs is taking up 16GB of storage across the two nodes.
routing: {
    state: STARTED
    primary: true
    node: eF-H3zhSTI6piq_f6ukjtA
    relocating_node: null
    shard: 0
    index: logstash-2013.05.27
}
state: STARTED
index: {
    size: 3.8gb
    size_in_bytes: 4157721500
}

routing: {
    state: STARTED
    primary: true
    node: eF-H3zhSTI6piq_f6ukjtA
    relocating_node: null
    shard: 1
    index: logstash-2013.05.27
}
state: STARTED
index: {
    size: 3.9gb
    size_in_bytes: 4191907329
}
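For what it's worth, the totals above do add up to the ~16GB I'm seeing once replicas are counted. A quick sanity check (sizes taken from the status output above; one replica per shard as configured):

```python
# Per-shard primary sizes from the status output above (bytes).
shard_sizes = [4157721500, 4191907329]  # shard 0, shard 1

replicas = 1            # one replica per shard
copies = 1 + replicas   # primary + replicas

# Each copy of a shard takes the full shard size on disk,
# so total cluster usage is primaries plus all replicas.
total_bytes = sum(shard_sizes) * copies
print(f"total on disk: {total_bytes / 2**30:.1f} GiB")
```

So the surprising part isn't the 16GB total, it's that the two primaries together hold roughly 8GB for 4GB of raw logs.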
-confused
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.