Hello, I am planning on storing 1.5 TB of text files in Elasticsearch, but I do not have much disk space to spare beyond the size of the text files themselves. I have been reading up on Elasticsearch and how to minimize storage, and I have already made the following changes (sketched below):
Removed unnecessary fields
Disabled _source
Enabled best_compression
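For reference, this is roughly how I applied the last two changes at index creation (my-index is just a placeholder name; as far as I know, best_compression can only be set when the index is created or while it is closed):

PUT my-index
{
  "settings": {
    "index.codec": "best_compression"
  },
  "mappings": {
    "_source": {
      "enabled": false
    }
  }
}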
This has decreased the disk space used by Elasticsearch from about 7 times the size of the text files down to about 3 times, but I was hoping there is a way to compress this even further.
The path field holds the location of the original text file and is required. The message field is currently filled with close-to-random data and very little text that is reused, but it must be word-searchable. I have thought about splitting the message field, but I run into grok running after mutate in my Logstash pipeline: if I remove message in a mutate filter, grok can no longer parse the fields out of it (that is a separate problem, though).
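If I understand the Logstash docs correctly, one way around that ordering issue might be to let grok itself drop the field, since common options like remove_field are only applied after the filter matches successfully. A rough sketch (the pattern here is a placeholder for whatever actually parses my lines):

filter {
  grok {
    # Placeholder pattern; the real one would split message into its parts.
    match => { "message" => "%{GREEDYDATA:messageline}" }
    # remove_field runs only if the match above succeeds, so grok
    # still sees message before it is removed.
    remove_field => ["message"]
  }
}

For reference, this is my current mapping: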
{
  "mapping": {
    "properties": {
      "message": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "path": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
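One idea I have been wondering about, in case it matters for suggestions: since message only needs to be searchable by word and path lookups could be exact matches, the mapping could probably be slimmed down to something like this (assuming I never aggregate or sort on message; index_options of docs keeps single-word search but gives up phrase queries, and norms off drops the length normalization used for scoring):

{
  "mappings": {
    "properties": {
      "message": {
        "type": "text",
        "norms": false,
        "index_options": "docs"
      },
      "path": {
        "type": "keyword"
      }
    }
  }
}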
Any suggestions are welcome.