Hi All,
I have a small query about raw data size versus size on disk. I am using ES 5.7, and my observation is below.
Can someone please help me understand this size factor?
The raw data is 1.5 KB, but when I call the ES indices API it shows 10 KB.
So the question is: will the data on disk grow at a 1:10 ratio? In the mapping I used only the keyword and date data types.
Raw data:
"_source": {
"service_key": "eaa05d04-6171-11e7-8292-005056944c2b",
"api_req_cmplt_time": "2018-03-15T06:16:56.441Z",
"handler_req_time": "2018-03-15T06:16:59.609Z",
"channel_req_cmplt_time": "2018-03-15T06:17:00.276Z",
"secured_key": "eaa05d04-6171-11e7-8292-005056944c2b",
"trans_id": "ba867688-3e02-4e68-bbcb-2b7ca88ac643",
"type": "IMICONNECT_TRANS_TIME_TAKEN",
"source_tid": "ba867688-3e02-4e68-bbcb-2b7ca88ac643",
"request_source": "1",
"api_req_time": "2018-03-15T06:16:56.440Z",
"datetime": "2018-03-15T06:17:06.000Z",
"@timestamp": "2018-03-15T06:17:13.154Z",
"filename": "/logdata/mlogserverj/encrypted_files/__302007___24025990174359981.log_1.done",
"handler_req_cmplt_time": "2018-03-15T06:16:59.773Z",
"channel_req_time": "2018-03-15T06:16:59.849Z",
"rule_action_tid": "null",
"total_elapsed_time": 3836,
"channel_id": "1"
}
Regards,
Chhavi
dadoonet
(David Pilato)
March 16, 2018, 7:41am
2
You meant 5.6.7, I suppose.
Never mind, could you share your mapping as well?
FYI, in the 5.x series we still generate the _all field, which might not be useful for you. You may want to disable it.
We also store the _source JSON field.
So technically we generate more data at index time than the raw JSON document.
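As a rough illustration of the suggestion to disable _all, the sketch below builds a 5.x-style index template body with the field turned off. The helper name `build_template` and the `"*"` pattern are assumptions for illustration only; the `"_all": {"enabled": false}` setting is the 5.x mapping option, and it would go under whichever mapping type the index actually uses.

```python
import json


def build_template(pattern: str = "*") -> dict:
    """Return a 5.x-style index template body with the _all field disabled.

    Hypothetical helper: the pattern and shard count are placeholders.
    """
    return {
        "template": pattern,
        "order": 0,
        "settings": {"index": {"number_of_shards": 1}},
        "mappings": {
            # _default_ applies to every mapping type created in matching indices (5.x).
            "_default_": {
                # Stops Elasticsearch 5.x from indexing a copy of all field values into _all.
                "_all": {"enabled": False},
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the template.
    print(json.dumps(build_template(), indent=2))
```

The printed body is only the request payload; sending it to the cluster (for example with a PUT to the template endpoint) is left out so the sketch stays runnable without a cluster.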
Hi Dadoonet,
Please find the mapping here. Also, I would like to know what the _all field is.
{
"template" : "*",
"settings" : {
"index" : {
"number_of_shards":1
}
},
"order" : 0,
"mappings" : {
"_default_" : {
"dynamic_templates" : [ {
"message_field" : {
"mapping" : {
"index" : "not_analyzed",
"norms" : true,
"fielddata" : {
"format" : "disabled"
},
"type" : "keyword"
},
"match_mapping_type" : "string",
"match" : "message"
}
}, {
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "keyword", "index" : "not_analyzed", "norms" : true,
"fielddata" : { "format" : "disabled" }
}
}
}, {
"long_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "long"
},
"match_mapping_type" : "long",
"match" : "*"
}
}, {
"date_fields" : {
"mapping" : {
"doc_values" : true,
"type" : "date"
},
"match_mapping_type" : "date",
"match" : "*"
}
}],
"properties" : {
"request": { "type": "text" },
"response": { "type": "text" },
"description": { "type": "text" },
"message": { "type": "text" },
"filename": { "type": "text" },
"extraparams": { "type": "text" },
"@timestamp" : {
"doc_values" : true,
"type" : "date"
}
}
}
}
}
Regards,
Chhavi
dadoonet
(David Pilato)
March 16, 2018, 9:39am
4
You shared the template, not the mapping.
Could you share the mapping please?
The _all field doc: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/mapping-all-field.html
Hi dadoonet,
Sorry about that. Please find the mapping below.
{
"test_imiconnect_trans_time_taken2018-03-15" : {
"mappings" : {
"_default_" : {
"_all" : {
"enabled" : false
},
"dynamic_templates" : [
{
"message_field" : {
"match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"fielddata" : {
"format" : "disabled"
},
"index" : "not_analyzed",
"norms" : false,
"type" : "keyword"
}
}
},
{
"string_fields" : {
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"fielddata" : {
"format" : "disabled"
},
"index" : "not_analyzed",
"norms" : false,
"type" : "keyword"
}
}
},
{
"long_fields" : {
"match" : "*",
"match_mapping_type" : "long",
"mapping" : {
"doc_values" : true,
"type" : "long"
}
}
},
{
"date_fields" : {
"match" : "*",
"match_mapping_type" : "date",
"mapping" : {
"doc_values" : true,
"type" : "date"
}
}
}
],
"properties" : {
"@timestamp" : {
"type" : "date"
},
"description" : {
"type" : "text"
},
"extraparams" : {
"type" : "text"
},
"filename" : {
"type" : "text"
},
"message" : {
"type" : "text"
},
"request" : {
"type" : "text"
},
"response" : {
"type" : "text"
}
}
}
}
}
}
Regards,
Chhavi
dadoonet
(David Pilato)
March 16, 2018, 4:02pm
6
I can see a lot of text fields in your mapping.
They use the default analyzer, which might emit a lot of tokens.
That could explain it.
Hi Dadoonet,
The fields defined as text are not present in the raw data; I am only using keyword and datetime fields.
So my question is: is there any way to find out whether Elasticsearch will take this much space for keyword and datetime fields?
Regards,
Chhavi
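One way to approach the question above is to compare the compact JSON size of a document with the average on-disk bytes per document reported by the index stats API. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, `sample_stats` is a placeholder shaped like the response of `GET /<index>/_stats/docs,store`, and its numbers are illustrative, not measurements from this thread.

```python
import json


def raw_size_bytes(doc: dict) -> int:
    """Size of the document as compact UTF-8 JSON, roughly what is sent for indexing."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


def store_bytes_per_doc(stats: dict) -> float:
    """Average primary-shard bytes on disk per document, from an index stats response."""
    primaries = stats["_all"]["primaries"]
    return primaries["store"]["size_in_bytes"] / primaries["docs"]["count"]


# Placeholder stats response; the numbers are illustrative only.
sample_stats = {
    "_all": {
        "primaries": {
            "docs": {"count": 1000},
            "store": {"size_in_bytes": 2_500_000},
        }
    }
}

# Two fields taken from the sample document in the question, for brevity.
doc = {
    "service_key": "eaa05d04-6171-11e7-8292-005056944c2b",
    "total_elapsed_time": 3836,
}

if __name__ == "__main__":
    raw = raw_size_bytes(doc)
    on_disk = store_bytes_per_doc(sample_stats)
    print(f"raw={raw} B, on disk per doc={on_disk:.0f} B, ratio={on_disk / raw:.2f}")
```

Measuring across many documents is likely to give a more representative ratio than a single-document index, since fixed per-segment overhead is then spread over more data.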
dadoonet
(David Pilato)
March 19, 2018, 5:20am
8