Elasticsearch index size 2.2x bigger than my log file

Hello Experts,

My Elasticsearch index is about 2x the size of my log file, even with the best_compression codec enabled.

[root@localhost ~]# curl -XGET 'localhost:9200/_cat/indices/swift-proxy-log-2018.03.19-000001'
green open swift-proxy-log-2018.03.19-000001 vtIR6ul2Szuw6LVGedCbQA 5 0 4039457 0 2.1gb 2.1gb

[root@localhost ~]# curl -XGET 'localhost:9200/swift-proxy-log-2018.03.19-000001/_settings?pretty'
{
  "swift-proxy-log-2018.03.19-000001" : {
    "settings" : {
      "index" : {
        "codec" : "best_compression",
        "refresh_interval" : "5s",
        "number_of_shards" : "5",
        "provided_name" : "swift-proxy-log-2018.03.19-000001",
        "creation_date" : "1521495831280",
        "number_of_replicas" : "0",
        "uuid" : "vtIR6ul2Szuw6LVGedCbQA",
        "version" : {
          "created" : "6010199"
        }
      }
    }
  }
}

[root@localhost tmp]# ls -lh all.log.2
-rw-r--r-- 1 root root 1.1G Mar 19 2018 all.log.2

Is this expected behavior, or am I doing something wrong?

Thanks
Chandra

The size that data takes up on disk when indexed into Elasticsearch typically depends on a number of factors, e.g. the amount of data added through collection and enrichment, the mappings used when indexing the data, and the shard size. The version of Elasticsearch used can also play a part. This blog post walks through an example of how different choices regarding enrichment and mappings affect the indexed size on disk for web access logs.
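
As a rough sanity check on the numbers in the question, here is a minimal Python sketch that turns the reported sizes into an expansion ratio and a per-document cost. It assumes that every line of all.log.2 became exactly one document and that both `2.1gb` (from `_cat/indices`) and `1.1G` (from `ls -lh`) are 1024-based units. The variable names are illustrative only.

```python
# Figures copied from the question above.
INDEX_SIZE_GIB = 2.1     # store.size reported by _cat/indices
LOG_SIZE_GIB = 1.1       # size of all.log.2 reported by ls -lh
DOC_COUNT = 4_039_457    # docs.count reported by _cat/indices

GIB = 1024 ** 3          # bytes per GiB

index_bytes = INDEX_SIZE_GIB * GIB
log_bytes = LOG_SIZE_GIB * GIB

# How much larger the index is than the raw log file.
expansion_ratio = index_bytes / log_bytes

# Average on-disk cost of one indexed document.
index_bytes_per_doc = index_bytes / DOC_COUNT

# Assumption: one log line per indexed document.
raw_bytes_per_line = log_bytes / DOC_COUNT

print(f"index / raw size ratio:   {expansion_ratio:.2f}")
print(f"index bytes per document: {index_bytes_per_doc:.0f}")
print(f"raw bytes per log line:   {raw_bytes_per_line:.0f}")
```

Under those assumptions the ratio comes out at about 1.9x, so each document costs roughly twice the bytes of its original log line. The enrichment fields and mappings are the likely places to look for that overhead.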

Thanks @Christian_Dahlqvist for your response!

I don't think my settings are that complex.

Here is my data after parsing (copied from Kibana):
{
  "_index": "swift-proxy-log-2018.03.19-000001",
  "_type": "doc",
  "_id": "orhBQGIBIrkb2ftW1hxX",
  "_version": 1,
  "_score": null,
  "_source": {
    "source": "/tmp/all.log.2",
    "httpversion": 1,
    "method": "GET",
    "@timestamp": "2018-03-19T17:42:07.302Z",
    "response_time": 0.0135,
    "prospector": {
      "type": "log"
    },
    "date": "Mar 18 03:07:47",
    "program": [
      "proxy-server",
      "proxy-server"
    ],
    "message": "- - - 62591 - txcf44709a1e654837a14e9-005aae3a73, - 1521367667.206026077 1521367667.219532967 0",
    "offset": 1086650978,
    "client": "0-1",
    "request": "/v1/ACC_388/d26098c2-b57d-4d2e-b938-a1a0ebb593fe/1520615514637:27a29814-dd8f-43ee-b768-19af98bf1d07:61",
    "fields": {
      "app_id": "swiftproxylog"
    },
    "beat": {
      "name": "localhost.localdomain",
      "version": "6.1.3",
      "hostname": "localhost.localdomain"
    },
    "host": "localhost.localdomain",
    "tags": [
      "swift_proxy_test",
      "beats_input_codec_plain_applied",
      "swift_all_parsed"
    ],
    "@version": "1",
    "status_code": 200
  },
  "fields": {
    "@timestamp": [
      "2018-03-19T17:42:07.302Z"
    ]
  },
  "sort": [
    1521481327302
  ]
}
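
To get a rough feel for how much of each event is pipeline metadata rather than parsed log content, the Python sketch below measures the uncompressed JSON size of the `_source` above. The split in `METADATA_FIELDS` is an assumption: it lists the fields that appear to be added by Filebeat and Logstash rather than parsed from the log line. Uncompressed JSON size is only a proxy, since the on-disk size also depends on compression and on how each field is mapped.

```python
import json

# _source copied from the sample document above.
source = {
    "source": "/tmp/all.log.2",
    "httpversion": 1,
    "method": "GET",
    "@timestamp": "2018-03-19T17:42:07.302Z",
    "response_time": 0.0135,
    "prospector": {"type": "log"},
    "date": "Mar 18 03:07:47",
    "program": ["proxy-server", "proxy-server"],
    "message": "- - - 62591 - txcf44709a1e654837a14e9-005aae3a73, - "
               "1521367667.206026077 1521367667.219532967 0",
    "offset": 1086650978,
    "client": "0-1",
    "request": "/v1/ACC_388/d26098c2-b57d-4d2e-b938-a1a0ebb593fe/"
               "1520615514637:27a29814-dd8f-43ee-b768-19af98bf1d07:61",
    "fields": {"app_id": "swiftproxylog"},
    "beat": {
        "name": "localhost.localdomain",
        "version": "6.1.3",
        "hostname": "localhost.localdomain",
    },
    "host": "localhost.localdomain",
    "tags": [
        "swift_proxy_test",
        "beats_input_codec_plain_applied",
        "swift_all_parsed",
    ],
    "@version": "1",
    "status_code": 200,
}

# Assumption: these top-level fields are metadata added by the pipeline,
# not content parsed out of the original log line.
METADATA_FIELDS = {
    "source", "prospector", "offset", "fields",
    "beat", "host", "tags", "@version",
}


def json_bytes(obj):
    """Size in bytes of obj serialized as compact UTF-8 JSON."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


total_bytes = json_bytes(source)
metadata_bytes = json_bytes(
    {key: value for key, value in source.items() if key in METADATA_FIELDS}
)
metadata_share = metadata_bytes / total_bytes

print(f"total _source size:    {total_bytes} bytes")
print(f"pipeline metadata:     {metadata_bytes} bytes")
print(f"metadata share of doc: {metadata_share:.0%}")
```

If the metadata share turns out to be large, dropping the fields that are never queried before indexing is one option worth testing.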

My environment details:

elasticsearch-6.1.1
kibana-6.1.3-linux-x86_64
logstash-6.1.3
filebeat version 6.1.3

I am going through the link you provided.

Thanks
Chandra
