Can ES store a mapped numeric value instead of the original keyword content (to save space)?

We use ES as a TSDB, as described in my other question, and the data looks like:

{
	"metric": "metric_name",
	"@timestamp": "2020-01-01T01:01:01",
	"t": {
		"key1": "val1",
		"key2": "val2"
	},
	"f": {
		"name1": 1.2,
		"name2": 2.3
	}
}

But when the data grows (to billions of data points), these files grow to a huge size, as shown below (the indices already roll over by date):

('Points', '0.07')
size: 1,657,438,615 bytes

('DocValues', '0.18')
size: 3,876,527,301 bytes

('Term Dictionary', '0.20')
size: 4,237,561,161 bytes

('Field Data', '0.44')
size: 9,265,590,659 bytes

('Frequencies', '0.08'):
size: 1,733,107,687 bytes

The queries we use are just term (yes or no) and prefix queries, plus some simple aggregations (terms, date_histogram, then sum or avg).
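To make the query shape concrete, here is a minimal sketch of the kind of request body described above, built as a plain Python dict. The function name `build_query`, the aggregation names, and the one-hour interval are assumptions for illustration; the field names come from the sample document. It uses `fixed_interval`, which assumes Elasticsearch 7.2 or later (older versions use `interval`).

```python
from typing import Any, Dict


def build_query(metric: str, tag_prefix: str) -> Dict[str, Any]:
    """Build a search body: term + prefix filters, then terms -> date_histogram -> sum."""
    return {
        "size": 0,  # only aggregation results are needed, not the hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"metric": metric}},          # exact match on the metric name
                    {"prefix": {"t.key1": tag_prefix}},    # prefix match on one tag value
                ]
            }
        },
        "aggs": {
            "by_key2": {  # hypothetical aggregation name
                "terms": {"field": "t.key2"},
                "aggs": {
                    "over_time": {
                        "date_histogram": {
                            "field": "@timestamp",
                            "fixed_interval": "1h",  # assumed bucket size
                        },
                        "aggs": {"sum_name1": {"sum": {"field": "f.name1"}}},
                    }
                },
            }
        },
    }


body = build_query("metric_name", "val")
```

The resulting dict can be serialized with `json.dumps` and sent as the body of a `_search` request.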

So, my question is:

Can ES translate the keyword values first and then store the mapped IDs instead of the original keyword content at the storage layer (transparently to users), to save space and I/O?

Like:

{
	"metric": "0",
	"@timestamp": "2020-01-01T01:01:01",
	"t": {
		"key1": "1",
		"key2": "2"
	},
	"f": {
		"name1": 1.2,
		"name2": 2.3
	}
}

instead of

{
	"metric": "long_metric_name",
	"@timestamp": "2020-01-01T01:01:01",
	"t": {
		"key1": "long_val1",
		"key2": "long_val2"
	},
	"f": {
		"name1": 1.2,
		"name2": 2.3
	}
}
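For comparison, this is roughly what the translation would look like if done on the client side before indexing. It is only a sketch of the idea, not something ES provides: the class `KeywordDictionary` and the function `encode_doc` are hypothetical names, and the ID table is kept in memory, whereas a real setup would need to persist and share it. Note that prefix queries on the original strings would no longer work against the encoded values.

```python
from typing import Any, Dict, List


class KeywordDictionary:
    """Hypothetical client-side dictionary: keyword string <-> small integer ID."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}
        self._values: List[str] = []

    def encode(self, value: str) -> str:
        # Assign the next free ID the first time a value is seen.
        if value not in self._ids:
            self._ids[value] = len(self._values)
            self._values.append(value)
        return str(self._ids[value])

    def decode(self, encoded: str) -> str:
        # Look the original string back up by its ID.
        return self._values[int(encoded)]


def encode_doc(doc: Dict[str, Any], dictionary: KeywordDictionary) -> Dict[str, Any]:
    """Return a copy of doc with "metric" and the "t" tag values replaced by IDs."""
    out = dict(doc)
    out["metric"] = dictionary.encode(doc["metric"])
    out["t"] = {key: dictionary.encode(val) for key, val in doc["t"].items()}
    return out


dictionary = KeywordDictionary()
original = {
    "metric": "long_metric_name",
    "@timestamp": "2020-01-01T01:01:01",
    "t": {"key1": "long_val1", "key2": "long_val2"},
    "f": {"name1": 1.2, "name2": 2.3},
}
encoded = encode_doc(original, dictionary)
```

Query results would then have to be passed through `decode` before being shown to users, which is the part I would like ES to handle transparently.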

Waiting for a reply!

Read this and specifically the "Also be patient" part.

It's fine to answer on your own thread after 2 or 3 days (not including weekends) if you don't have an answer.

Sorry about that.
