Log size issue

Hello All,

I am using the ELK stack. I parse my logs with Logstash, forward them to Elasticsearch, and view them on a Kibana dashboard.

My log files for a 24-hour period total 3 GB on the backend. But when I check in Elasticsearch, pri.store.size shows around 7 GB, which is more than double the actual file size.

I am using 2 primary shards and 1 replica, so the total store size is around 15 GB. That part seems fine: the size doubles because I have one replica.

Please refer to the output below:

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   filebeat-2018.08.07 Ta44e3HNQ4uAs4_m5q0IgA   2   1   12714187            0     15.2gb          7.6gb
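As a sanity check on these numbers, here is a minimal Python sketch of the arithmetic. It assumes the "gb" unit in the output means 1024^3 bytes and that the 3 GB raw log figure uses the same unit; the function names are only illustrative.

```python
# Figures copied from the _cat/indices output above.
PRI_STORE_GB = 7.6      # pri.store.size
TOTAL_STORE_GB = 15.2   # store.size
REPLICAS = 1            # rep
DOCS = 12_714_187       # docs.count
RAW_LOG_GB = 3.0        # approximate raw log size for 24 hours (from the post)

BYTES_PER_GB = 1024 ** 3  # assumption: "gb" means 1024^3 bytes


def expected_total_store(pri_gb, replicas):
    """Total store size: each replica is a full copy of the primaries."""
    return pri_gb * (1 + replicas)


def bytes_per_doc(size_gb, docs):
    """Average number of bytes per document for a given size."""
    return size_gb * BYTES_PER_GB / docs


if __name__ == "__main__":
    print("expected store.size (gb):", expected_total_store(PRI_STORE_GB, REPLICAS))
    print("raw bytes per log line:  ", round(bytes_per_doc(RAW_LOG_GB, DOCS)))
    print("indexed bytes per doc:   ", round(bytes_per_doc(PRI_STORE_GB, DOCS)))
    print("expansion factor:        ", round(PRI_STORE_GB / RAW_LOG_GB, 2))
```

The replica accounts for the jump from 7.6gb to 15.2gb; the open question is the per-document growth from the raw line to the indexed document on the primaries.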

Please help me understand the issue.

Thanks in advance.

Which version of Elasticsearch are you using? What does your data look like? What do your mappings look like for any custom fields that have been parsed?

This blog post discusses the impact of mappings and enrichment on storage size for Elasticsearch 5.x, but some of that also applies to Elasticsearch 6.x. How to tune your mappings to reduce the indexed size on disk is also covered in the documentation.
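To make the mapping advice concrete, here is a rough Python sketch that builds an index template body of the kind the disk-usage tuning documentation describes: string fields mapped as keyword only (instead of the default text plus keyword sub-field) and the best_compression codec. The index pattern, the "doc" mapping type, the template order, and the ignore_above value are assumptions for a 6.x setup, not values taken from this thread. Fields that need full-text search would still have to be mapped as text explicitly.

```python
import json

# Hypothetical index template body for Elasticsearch 6.x.
# Assumptions: indices match "filebeat-*" and use the "doc" mapping type.
template = {
    "index_patterns": ["filebeat-*"],
    "order": 1,
    "settings": {
        # Trades some CPU for smaller stored fields on disk.
        "index.codec": "best_compression",
    },
    "mappings": {
        "doc": {
            "dynamic_templates": [
                {
                    # Map newly seen string fields as keyword only,
                    # avoiding the default text + keyword double indexing.
                    "strings_as_keyword": {
                        "match_mapping_type": "string",
                        "mapping": {"type": "keyword", "ignore_above": 1024},
                    }
                }
            ]
        }
    },
}

if __name__ == "__main__":
    # Print the JSON body; it could be sent with PUT _template/<name>.
    print(json.dumps(template, indent=2))
```

A template only affects indices created after it is loaded, so an existing daily index would keep its current mapping.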

Hello Christian,

I am using Elasticsearch 6.2.4. Below are the sample logs:

I, [2018-08-07T06:26:02.966222 #9981]  INFO -- : [4c21290c-bdc5-42c2-8327-6b691c62fe49] Completed 200 OK in 17ms (ActiveRecord: 4.3ms)
I, [2018-08-07T06:26:02.966327 #9974]  INFO -- : [3b63b133-354f-4c33-a620-05815e7d61c8] notify_device :: previous_changes: {} :: device id: 100159
I, [2018-08-07T06:26:02.967995 #9981]  INFO -- : [2056560c-935e-4efc-a0b6-1770d3fed736] Started POST "/api/v1/gps_audits.json" for 2600:1004:b11a:74bf:0:5a:5f78:3c01 at 2018-08-07 06:26:02 +0000
I, [2018-08-07T06:26:02.968362 #9974]  INFO -- : [3b63b133-354f-4c33-a620-05815e7d61c8] [active_model_serializers] Rendered ActiveModel::Serializer::Null with Hash (0.16ms)
I, [2018-08-07T06:26:02.969514 #9974]  INFO -- : [3b63b133-354f-4c33-a620-05815e7d61c8] Completed 200 OK in 20ms (Views: 0.8ms | ActiveRecord: 4.6ms)
I, [2018-08-07T06:26:02.970578 #10002]  INFO -- : [830f249e-22f4-4124-9480-aefbbca40abb] notify_eva :: Dropping as chat not enabled for device owner: 97077
I, [2018-08-07T06:26:02.970714 #10002]  INFO -- : [830f249e-22f4-4124-9480-aefbbca40abb] notify_device :: previous_changes: {} :: device id: 97077
I, [2018-08-07T06:26:02.971049 #9974]  INFO -- : [116335ed-dca5-4798-b9f5-100b30845dc5] Started PUT "/api/v1/devices/ping.json" for 2001:44c8:4141:e86:1:1:6687:8b57 at 2018-08-07 06:26:02 +0000
I, [2018-08-07T06:26:02.972398 #10002]  INFO -- : [830f249e-22f4-4124-9480-aefbbca40abb] [active_model_serializers] Rendered ActiveModel::Serializer::Null with Hash (0.08ms)
I, [2018-08-07T06:26:02.973289 #10002]  INFO -- : [830f249e-22f4-4124-9480-aefbbca40abb] Completed 200 OK in 20ms (Views: 0.6ms | ActiveRecord: 7.2ms)
I, [2018-08-07T06:26:02.973708 #9974]  INFO -- : [116335ed-dca5-4798-b9f5-100b30845dc5] Processing by Api::V1::DevicesController#ping as JSON

I have created some custom fields such as date-time, pid, verb, request-id, etc.

Please let me know how we can reduce the log size in Elasticsearch.

Thanks.

Did you look at the resources I linked to? What is the mapping for your index?
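For reference, the mapping can be retrieved from the index's _mapping endpoint. Below is a small Python sketch using only the standard library; the base URL http://localhost:9200 is an assumption (adjust host, port, and any authentication to your cluster), and the helper names are made up for illustration.

```python
import json
import urllib.request


def mapping_url(base_url, index):
    """Build the URL of the get-mapping endpoint for one index."""
    return f"{base_url.rstrip('/')}/{index}/_mapping"


def fetch_mapping(base_url, index):
    """Fetch and decode the mapping; requires a reachable cluster."""
    with urllib.request.urlopen(mapping_url(base_url, index)) as resp:
        return json.load(resp)


# Example usage (assumes an unsecured node on localhost):
#   mapping = fetch_mapping("http://localhost:9200", "filebeat-2018.08.07")
#   print(json.dumps(mapping, indent=2))
```

The output shows, for each field, whether it is indexed as text, keyword, or both, which is what determines much of the on-disk size.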
