We use Elasticsearch as a time-series database (TSDB), and our documents look like this:
{
  "metric": "metric_name",
  "@timestamp": "2020-01-01T01:01:01",
  "t": {
    "key1": "val1",
    "key2": "val2"
  },
  "f": {
    "name1": 1.2,
    "name2": 2.3
  }
}
But as the data grows (to billions of datapoints), the index files grow huge, even though we already roll indices by date:
Points (0.07): 1,657,438,615 bytes
DocValues (0.18): 3,876,527,301 bytes
Term Dictionary (0.20): 4,237,561,161 bytes
Field Data (0.44): 9,265,590,659 bytes
Frequencies (0.08): 1,733,107,687 bytes
The queries we run are simple: term filters (yes/no matches), prefix queries, and a few basic aggregations (terms and date_histogram, then sum/avg).
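As a sketch, the query shapes described above can be written as Elasticsearch query-DSL bodies. The field names come from the example documents; the exact filter values and the interval are hypothetical:

```python
import json

# A term filter plus a prefix filter, then a terms -> date_histogram
# aggregation with sum/avg metrics, matching the query pattern we use.
query_body = {
    "query": {
        "bool": {
            "filter": [
                {"term": {"metric": "metric_name"}},      # exact (yes/no) match
                {"prefix": {"t.key1": "val"}},            # prefix match on a tag
            ]
        }
    },
    "aggs": {
        "by_tag": {
            "terms": {"field": "t.key2"},
            "aggs": {
                "per_minute": {
                    "date_histogram": {
                        "field": "@timestamp",
                        "fixed_interval": "1m",
                    },
                    "aggs": {
                        "sum_name1": {"sum": {"field": "f.name1"}},
                        "avg_name1": {"avg": {"field": "f.name1"}},
                    },
                }
            },
        }
    },
}

print(json.dumps(query_body, indent=2))
```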
So my question is: can Elasticsearch translate keyword values into dictionary IDs first and store the mapped IDs instead of the original keyword content at the storage layer (transparently to users), to save space and I/O? For example, storing:
{
  "metric": "0",
  "@timestamp": "2020-01-01T01:01:01",
  "t": {
    "key1": "1",
    "key2": "2"
  },
  "f": {
    "name1": 1.2,
    "name2": 2.3
  }
}
instead of:
{
  "metric": "long_metric_name",
  "@timestamp": "2020-01-01T01:01:01",
  "t": {
    "key1": "long_val1",
    "key2": "long_val2"
  },
  "f": {
    "name1": 1.2,
    "name2": 2.3
  }
}
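To make the idea concrete, here is a minimal sketch of the transparent dictionary encoding I mean: keyword values are swapped for small integer IDs at write time and restored from the dictionary at read time. All names here are hypothetical; this is not an existing Elasticsearch feature, just an illustration of the requested behavior:

```python
import json

class KeywordDictionary:
    """Maps long keyword terms to short string IDs and back."""

    def __init__(self):
        self._to_id = {}
        self._to_term = []

    def encode(self, term: str) -> str:
        # Assign the next integer ID on first sight of a term.
        if term not in self._to_id:
            self._to_id[term] = str(len(self._to_term))
            self._to_term.append(term)
        return self._to_id[term]

    def decode(self, term_id: str) -> str:
        return self._to_term[int(term_id)]

def encode_doc(doc, d):
    """What the storage layer would persist: IDs instead of keywords."""
    out = dict(doc)
    out["metric"] = d.encode(doc["metric"])
    out["t"] = {k: d.encode(v) for k, v in doc["t"].items()}
    return out

def decode_doc(doc, d):
    """What the user would see: the original keywords, transparently."""
    out = dict(doc)
    out["metric"] = d.decode(doc["metric"])
    out["t"] = {k: d.decode(v) for k, v in doc["t"].items()}
    return out

d = KeywordDictionary()
original = {
    "metric": "long_metric_name",
    "@timestamp": "2020-01-01T01:01:01",
    "t": {"key1": "long_val1", "key2": "long_val2"},
    "f": {"name1": 1.2, "name2": 2.3},
}
encoded = encode_doc(original, d)

# Round-trips losslessly, and the stored form is smaller.
assert decode_doc(encoded, d) == original
assert len(json.dumps(encoded)) < len(json.dumps(original))
```

The longer and more repetitive the keyword values are, the larger the saving, which is why it seems attractive for tag-heavy TSDB workloads like ours.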