How to Reduce Received Logs Size in ELK Stack?

Hi,

I would like to ask: is there any way we can compress or deduplicate the logs that we receive in the ELK stack to reduce their size?

Thanks.

Is there anyone who can give some advice on this?

Have you checked this part of the documentation about tuning for disk usage?

Those are some of the things you can do to help reduce the size of your indices.
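
If it helps, you can also first check where the disk space is actually going; the cat indices API lists the store size per index, for example:

GET _cat/indices?v&h=index,pri,rep,docs.count,store.size&s=store.size:desc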

Hi,
currently my indices have 1 primary shard and 1 replica.
I can consider shrinking and force merging them after 60 days, in the warm phase.

If I set the shrink action to 1 shard, will my indices go from 1 primary and 1 replica to just 1 primary shard?

And how about force merge?

Hope to get some clarification from you. I'm still new to Elasticsearch, so I appreciate your help.
Thanks.

Just a heads up that while we endeavour to reply to every thread, we don't provide SLAs on requests :slight_smile:

Shrinking an index reduces the number of primary shards, so it will not have any effect in your case. You can reduce the number of replicas, but that will reduce availability and resiliency. Force merging may reduce the size if you apply best_compression at that point, but otherwise it will not necessarily shrink the size much. I would recommend looking at the link provided and making sure your mappings are optimized.
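
For illustration only, a warm phase that applies best_compression as part of the force merge could look roughly like this (just a sketch; the 60-day age comes from your earlier post, and any existing hot/delete phases in your policy stay as they are):

"warm": {
  "min_age": "60d",
  "actions": {
    "forcemerge": {
      "max_num_segments": 1,
      "index_codec": "best_compression"
    }
  }
}

The index_codec option of the force merge action is what rewrites the segments with best_compression, so the savings only show up once the merge has run.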

Are you using compression? How do you send logs to ELK?
Are you using Filebeat? If yes, you can exclude some unnecessary fields from each event with the help of the drop_fields processor in the Filebeat configuration, as sketched below.
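
For reference, a minimal Filebeat snippet using drop_fields might look like this (the two field names are only placeholders; list whichever fields you do not need):

processors:
  - drop_fields:
      fields: ["agent.ephemeral_id", "host.architecture"]
      ignore_missing: true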

Hi,

I'm sending syslog to Logstash.
Not using any Beats yet...

What about compression?

I'm using default compression.

Please change it to the best_compression codec.

Hi d.silwon,

Sorry, I'm new to Elasticsearch. May I know which config file I should edit to set index.codec to best_compression, or how to change it from default to best_compression?

Thanks.

It can be changed in the template of your index. Here is an example:

{
  "index": {
    "codec": "best_compression"
  }
}

This is my current index template:

{
  "index_templates" : [
    {
      "name" : "logstash",
      "index_template" : {
        "index_patterns" : [
          "logstash-*"
        ],
        "template" : {
          "settings" : {
            "index" : {
              "lifecycle" : {
                "name" : "logstash-policy",
                "rollover_alias" : "logstash"
              },
              "number_of_shards" : "1",
              "refresh_interval" : "5s"
            }
          },
          "mappings" : {
            "dynamic_templates" : [
              {
                "message_field" : {
                  "path_match" : "message",
                  "mapping" : {
                    "norms" : false,
                    "type" : "text"
                  },
                  "match_mapping_type" : "string"
                }
              },
              {
                "string_fields" : {
                  "mapping" : {
                    "norms" : false,
                    "type" : "text",
                    "fields" : {
                      "keyword" : {
                        "ignore_above" : 256,
                        "type" : "keyword"
                      }
                    }
                  },
                  "match_mapping_type" : "string",
                  "match" : "*"
                }
              }
            ],
            "properties" : {
              "@timestamp" : {
                "type" : "date"
              },
              "geoip" : {
                "dynamic" : true,
                "properties" : {
                  "ip" : {
                    "type" : "ip"
                  },
                  "latitude" : {
                    "type" : "half_float"
                  },
                  "location" : {
                    "type" : "geo_point"
                  },
                  "longitude" : {
                    "type" : "half_float"
                  }
                }
              },
              "@version" : {
                "type" : "keyword"
              }
            }
          }
        },
        "composed_of" : [ ],
        "priority" : 200,
        "version" : 80001,
        "_meta" : {
          "description" : "index template for logstash-output-elasticsearch"
        }
      }
    }
  ]
}

I'm getting this error when I run:

PUT _index_template/logstash
{
  "index": {
    "codec": "best_compression"
  }
}

{
  "error" : {
    "root_cause" : [
      {
        "type" : "x_content_parse_exception",
        "reason" : "[2:3] [index_template] unknown field [index]"
      }
    ],
    "type" : "x_content_parse_exception",
    "reason" : "[2:3] [index_template] unknown field [index]"
  },
  "status" : 400
}

Can I set index.codec: best_compression in elasticsearch.yml?

Here is an example of the index template API, in this case with my example template called applications:

PUT _index_template/applications
{
  "index_patterns": ["applications-*"],
  "template": {
    "settings": {
      "codec": "best_compression"
    }
  }
}

Please adjust it to your needs.

After the change, it looks like this:

GET _index_template/logstash

{
  "index_templates" : [
    {
      "name" : "logstash",
      "index_template" : {
        "index_patterns" : [
          "logstash-*"
        ],
        "template" : {
          "settings" : {
            "index" : {
              "lifecycle" : {
                "name" : "logstash-policy",
                "rollover_alias" : "logstash"
              },
              "codec" : "best_compression",
              "refresh_interval" : "5s",
              "number_of_shards" : "1"
            }
          },
          "mappings" : {
            "dynamic_templates" : [
              {
                "message_field" : {
                  "path_match" : "message",
                  "mapping" : {
                    "norms" : false,
                    "type" : "text"
                  },
                  "match_mapping_type" : "string"
                }
              },
              {
                "string_fields" : {
                  "mapping" : {
                    "norms" : false,
                    "type" : "text",
                    "fields" : {
                      "keyword" : {
                        "ignore_above" : 256,
                        "type" : "keyword"
                      }
                    }
                  },
                  "match_mapping_type" : "string",
                  "match" : "*"
                }
              }
            ],
            "properties" : {
              "@timestamp" : {
                "type" : "date"
              },
              "geoip" : {
                "dynamic" : true,
                "properties" : {
                  "ip" : {
                    "type" : "ip"
                  },
                  "latitude" : {
                    "type" : "half_float"
                  },
                  "location" : {
                    "type" : "geo_point"
                  },
                  "longitude" : {
                    "type" : "half_float"
                  }
                }
              },
              "@version" : {
                "type" : "keyword"
              }
            }
          }
        },
        "composed_of" : [ ],
        "priority" : 200,
        "version" : 80001,
        "_meta" : {
          "description" : "index template for logstash-output-elasticsearch"
        }
      }
    }
  ]
}

Does the config look OK?
When will the codec take effect?

This change will be applied to new indices when they are created. For existing indices, you would have to change the setting on each index separately.
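
Note that index.codec is a static setting, so an existing index has to be closed before it can be changed; a rough sketch, where logstash-000001 is only a placeholder for one of your existing indices:

POST logstash-000001/_close

PUT logstash-000001/_settings
{
  "index": {
    "codec": "best_compression"
  }
}

POST logstash-000001/_open

Even then, segments that were already written keep the old codec; only segments written afterwards (for example by a force merge) use best_compression.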

But I already changed the index template (logstash).
It will take effect for new indices, am I right?

You are right.
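
The setting only applies to indices created after the template change (for example at the next rollover). Once a new index exists, you can read the setting back to confirm, e.g.:

GET logstash-*/_settings/index.codec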