JSON & Index conversion factors Elastic sizing factors

manikandanid · February 20, 2019, 12:54pm

I am trying to come up with the cluster sizing for our centralised logging system. While reading about the factors that influence the size, I came across two terms:

JSON conversion factor
Index conversion factor

I am not sure what these are. I tried to find them on the Elastic blog but couldn't find anything useful. Can someone explain them in detail or point me to a blog post that does?

Christian_Dahlqvist · February 24, 2019, 8:10pm

This was discussed in this webinar and this blog post, and is basically a way to reason about how raw data transforms into indexed data on disk. Often we first convert the raw data into JSON documents. How much this changes the size depends on how we parse and structure the data, as well as how much enrichment data we add. The ratio of the JSON size to the raw size is what we often refer to as the JSON conversion factor, and it can vary a lot between different types of data.
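As a rough illustration, the JSON conversion factor can be measured on a sample of your own data by comparing byte counts before and after parsing. The sketch below is only an assumption about how one might do that: the log line, the `parse_line` helper, the field names and the `environment` enrichment field are all hypothetical, and a real pipeline (Logstash, Beats, an ingest pipeline) will produce different documents.

```python
import json


def parse_line(line):
    """Hypothetical parser: split a space-separated access log line into fields."""
    client_ip, method, path, status, size = line.split(" ")
    return {
        "client_ip": client_ip,
        "method": method,
        "path": path,
        "status": int(status),
        "bytes": int(size),
        # Example of enrichment data added during parsing (assumed field).
        "environment": "production",
    }


def json_conversion_factor(raw_lines, parse):
    """Return JSON bytes divided by raw bytes for a sample of raw lines."""
    raw_bytes = sum(len(line.encode("utf-8")) for line in raw_lines)
    json_bytes = sum(
        len(json.dumps(parse(line)).encode("utf-8")) for line in raw_lines
    )
    return json_bytes / raw_bytes


# Made-up sample data, for illustration only.
sample = [
    "203.0.113.7 GET /index.html 200 512",
    "203.0.113.9 POST /api/login 401 87",
]
factor = json_conversion_factor(sample, parse_line)
print(f"JSON conversion factor: {factor:.2f}")
```

Field names and enrichment add bytes, so the factor for this sample is above 1. Dropping fields during parsing would push it the other way.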

Once we index these documents into Elasticsearch, the size will change again. This typically depends on the data itself, the mappings used, the index settings and the shard size. The ratio of the size on disk to the JSON size is what we refer to as the Index conversion factor.

To estimate the size of the primary shards on disk, we take the raw data volume and multiply it by these two factors. We picked this approach because both factors are relatively easy to measure in tests and benchmarks.
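The calculation described above can be sketched in a few lines. All numbers below are hypothetical placeholders, not recommendations. You would replace them with factors measured on your own data, and the result covers primary shards only.

```python
def primary_size_on_disk(raw_volume_gb, json_factor, index_factor):
    """Estimate primary shard size: raw volume x JSON factor x index factor."""
    return raw_volume_gb * json_factor * index_factor


# Hypothetical inputs, for illustration only.
raw_gb_per_day = 100.0   # raw log volume ingested per day
json_factor = 1.5        # JSON size / raw size, measured on a sample
index_factor = 0.8       # primary size on disk / JSON size, from a benchmark

daily_primary_gb = primary_size_on_disk(raw_gb_per_day, json_factor, index_factor)
print(f"Estimated primary shard size per day: {daily_primary_gb:.1f} GB")
```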

