Hello, I am planning on storing 1.5 TB of text files in Elasticsearch, but I do not have much disk space to spare beyond the size of the text files themselves. I have been reading up on Elasticsearch and how to minimize storage, and I have already made the following changes (sketched below):
Removed unnecessary fields
Disabled _source
Enabled best_compression
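For reference, this is roughly how I applied the last two changes at index creation (my-index is just a placeholder name; as far as I know, best_compression can only be set when the index is created or while it is closed):

PUT my-index
{
  "settings": {
    "index.codec": "best_compression"
  },
  "mappings": {
    "_source": {
      "enabled": false
    }
  }
}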
This has decreased the disk space used by Elasticsearch from about 7 times the size of the text files down to about 3 times, but I was hoping there is a way to compress this even further.
The path field holds the location of the original text file and is required. The message field is currently filled with close-to-random data and very little text that is reused, but it must be word-searchable. I have thought about splitting the message field, but I run into grok running after mutate in my Logstash pipeline: if I remove message in a mutate filter, grok can no longer parse the fields out of it (that is a separate problem, though).
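If I understand the Logstash docs correctly, one way around that ordering issue might be to let grok itself drop the field, since common options like remove_field are only applied after the filter matches successfully. A rough sketch (the pattern here is a placeholder for whatever actually parses my lines):

filter {
  grok {
    # Placeholder pattern; the real one would split message into its parts.
    match => { "message" => "%{GREEDYDATA:messageline}" }
    # remove_field runs only if the match above succeeds, so grok
    # still sees message before it is removed.
    remove_field => ["message"]
  }
}

For reference, this is my current mapping: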
{
  "mapping": {
    "properties": {
      "message": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "path": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
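One idea I have been wondering about, in case it matters for suggestions: since message only needs to be searchable by word and path lookups could be exact matches, the mapping could probably be slimmed down to something like this (assuming I never aggregate or sort on message; index_options of docs keeps single-word search but gives up phrase queries, and norms off drops the length normalization used for scoring):

{
  "mappings": {
    "properties": {
      "message": {
        "type": "text",
        "norms": false,
        "index_options": "docs"
      },
      "path": {
        "type": "keyword"
      }
    }
  }
}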
Any suggestions are welcome.