Hi,
I am very new to the fantastic world of Elastic and the whole ELK Stack, so sorry for any dumb question that might pop up in this post.
I am currently working on a project where we are looking into replacing our existing central log environment, which consists of syslogd on FreeBSD with ZFS using gzip-9 compression.
It does the job, plain and simple, but working with the logs is a big hassle.
We would like to replace that solution with a shiny new ELK Stack.
Current environment:
Daily logs: approx. 13.6 GB of compressed logs on ZFS with a compression ratio of about 15x using gzip-9 (13.6 GB × 15 ≈ 204 GB), which gives us a rough estimate of 200 GB/day of uncompressed logs.
Target:
Hot storage: 7 days (approx. 1.5 TB of SSD storage)
"Archive": logs >7 days and <366 days (approx. 71 TB of HDD storage); a rough aging-out sketch follows below
Design:
2x NGINX Loadbalancers with VRRP (Keepalived)
2x Logstash
1x Master node
2x Master / Data nodes, hot: 64 GB RAM / 8 cores, 31 GB heap
2x Data nodes, warm (archive): 64 GB RAM / 8 cores, 31 GB heap (see the allocation sketch below)
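Here is the allocation sketch I mentioned: tag the nodes so indices can be steered between hot and warm. Again this assumes Elasticsearch 5.x, and box_type is just a name I picked, nothing official:

# On the two hot master/data nodes:
cat >> /etc/elasticsearch/elasticsearch.yml <<'EOF'
node.attr.box_type: hot
EOF

# On the two warm data nodes:
cat >> /etc/elasticsearch/elasticsearch.yml <<'EOF'
node.attr.box_type: warm
EOF

# Template so new daily indices start life on the hot nodes
# (template name and pattern are examples):
curl -XPUT 'localhost:9200/_template/logstash_hot' \
  -H 'Content-Type: application/json' \
  -d '{"template": "logstash-*", "settings": {"index.routing.allocation.require.box_type": "hot"}}'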
Question 1:
Scenario: Logstash processes an example Cisco ASA log file with 823,458 rows and a raw file size of 136 MB. The resulting Elasticsearch index size becomes 364 MB.
Example JSON from a document = 1,007 bytes
RAW message = 141 bytes
Actual index size on disk = 174 MB × 2 for the two index shards = 348 MB, which is fine. Then we have 315 MB + 319 MB of translog, so the total disk space used is 980 MB, about 7x the raw log file size.
Question: Sorry for the stupid question here. Do I need to account for the translog on old indices (the previous days' indices, for example), or does it only exist while changes are being made to the index?
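For reference, this is how I have been poking at the translog while testing (the index name is just an example). My understanding is that a flush commits segments and trims the translog, so the numbers should shrink after a flush, but please correct me if that's wrong:

# Per-shard translog stats for one index
curl -XGET 'localhost:9200/logstash-2017.01.01/_stats/translog?pretty'

# Force a flush and check the stats again
curl -XPOST 'localhost:9200/logstash-2017.01.01/_flush'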
Question 2:
Archive design.
I have been testing and thinking about how to lower the storage requirements for our archive. Since we have no requirement for fast searches on logs older than 7 days, transparent compression on the filesystem seems like a natural fit here.
Does anyone have any experience running Elasticsearch on ZFS with gzip-9?
If these numbers don't lie, it does look like we could get fairly substantial savings by doing this.
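What I had in mind is simply putting the Elasticsearch data path on the warm nodes onto its own gzip-9 dataset, something like this (pool and dataset names are made up):

# Dedicated compressed dataset for the archive nodes' data path
zfs create -o compression=gzip-9 -o atime=off tank/es-archive
zfs set mountpoint=/elasticsearch/data tank/es-archive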
Test Index on XFS:
root@xxxxx:/elasticsearch/data/nodes/0/indices/TdnOInHMSa-DtCD3FN2mvA# du -h
174M ./0/index
4.0K ./0/_state
315M ./0/translog
488M ./0
4.0K ./_state
174M ./1/index
4.0K ./1/_state
319M ./1/translog
492M ./1
980M .
Same Index on ZFS (data compression gzip-9)
root@xxxxxxx:/data/TdnOInHMSa-DtCD3FN2mvA# du -h
87M ./0/translog
1.5K ./0/_state
75M ./0/index
162M ./0
88M ./1/translog
1.5K ./1/_state
76M ./1/index
163M ./1
2.0K ./_state
325M .
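To cross-check the du numbers, ZFS also tracks the ratio itself (same made-up dataset name as above):

# ZFS maintains compressratio per dataset
zfs get compression,compressratio tank/es-archive

980 MB on XFS versus 325 MB on ZFS works out to roughly 3x less disk for this test index.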