Does elasticsearch reduce bloating?

Jason_Brooks · February 1, 2018, 10:30pm

Hello,

Does Elasticsearch deduplicate data? I mean, if a name/value combination is common to many events, does Elasticsearch do any storage optimization?

For example, I have a large number (>500,000) of events. Each event has an "EventCode" field, and there are perhaps 20 unique EventCodes. If this were SQL, I would break these event codes out into their own table and use a foreign key to significantly reduce disk storage and complexity.

If elasticsearch doesn't do this out of the box, are there ways to provide hints to do this?

Thank you for your information and time...

--jason

Jason_Brooks · February 1, 2018, 10:32pm

Oops: did I say >500,000 events? More like >8,305,270 events...

warkolm · February 1, 2018, 10:36pm

It does not deduplicate.
It does compress, and there are two levels you can choose between with the index.codec setting (default and best_compression): https://www.elastic.co/guide/en/elasticsearch/reference/6.1/index-modules.html#index-codec
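
For anyone landing here later, a rough sketch of how the higher compression level might be selected when creating an index. This is only an illustration, not an official recipe: the cluster URL (http://localhost:9200), the index name (events-compressed), and the helper names index_settings and create_index_request are all made up for the example. The index.codec setting and its two values, default and best_compression, are the ones described on the linked page. Note that index.codec is a static setting, so it has to be set at index creation time or on a closed index.

```python
import json
import urllib.request

# The two codec values documented for index.codec:
# "default" (LZ4) and "best_compression" (DEFLATE, smaller but slower stored fields).
VALID_CODECS = ("default", "best_compression")


def index_settings(codec="best_compression"):
    """Build the create-index request body that selects a compression codec."""
    if codec not in VALID_CODECS:
        raise ValueError(f"unknown codec: {codec!r}")
    return {"settings": {"index": {"codec": codec}}}


def create_index_request(base_url, index_name, codec="best_compression"):
    """Prepare (but do not send) a PUT request that creates the index.

    base_url and index_name are hypothetical values supplied by the caller.
    """
    body = json.dumps(index_settings(codec)).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Assumed local cluster address and index name, for illustration only.
    req = create_index_request("http://localhost:9200", "events-compressed")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually create the index against a running cluster:
    # urllib.request.urlopen(req)
```

With a low-cardinality field like EventCode repeated across millions of events, compression of stored fields is where the repeated values get squeezed, since there is no foreign-key style normalization. How much disk this saves depends on the data, so it is worth comparing index sizes on a sample before committing.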

