I was trying to understand how much disk space an index takes for a given
amount of input data.
Below are the scenario and my observations:
- For this test I picked a 10-column CSV with 1000 rows. The size of the
CSV is 111 KB.
- I created 2 fields (of type string) for each column: 1 analyzed, to run
searches, and 1 not analyzed, to run facets.
- The index was configured to create 5 shards.
- After indexing I found that the size of the index was 4.5 MB. (This
includes all 5 shards, the transaction log, etc.)
- That means the index is roughly 40 times larger than the original data,
which is a very significant increase.
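
For reference, the ratio can be checked with a few lines of Python. This is only a sketch of the arithmetic using the two sizes quoted above; it assumes 1 MB = 1024 KB, which the numbers above do not specify.

```python
# Sizes quoted in the list above.
csv_size_kb = 111      # size of the input CSV
index_size_mb = 4.5    # size of the index on disk, all shards included

# Assumption: 1 MB = 1024 KB (with 1 MB = 1000 KB the ratio is slightly lower).
index_size_kb = index_size_mb * 1024
ratio = index_size_kb / csv_size_kb

print(f"The index is about {ratio:.1f}x the size of the CSV")
```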
Then I tried disabling _source and found that the index size came down to
about 3 MB, a reduction of a quarter to a third. That is still significant.
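
To make the setup concrete, here is a rough sketch of what such a mapping might look like, built as a Python dict and printed as JSON. It assumes the older string-type mapping syntax with `"index": "analyzed"` / `"not_analyzed"` and a `fields` sub-field; the column names `col1`..`col10` and the sub-field name `raw` are hypothetical placeholders, not the names from my actual CSV.

```python
import json


def build_mapping(columns, store_source=True):
    """Build a mapping with two string fields per column.

    Each column gets an analyzed field (for search) plus a
    not_analyzed sub-field (for facets). The sub-field name "raw"
    is a hypothetical choice for this sketch.
    """
    properties = {}
    for name in columns:
        properties[name] = {
            "type": "string",
            "index": "analyzed",
            "fields": {
                "raw": {"type": "string", "index": "not_analyzed"},
            },
        }
    return {
        # Disabling _source drops the stored copy of the original document.
        "_source": {"enabled": store_source},
        "properties": properties,
    }


# Hypothetical column names standing in for the 10 CSV columns.
columns = [f"col{i}" for i in range(1, 11)]
mapping = build_mapping(columns, store_source=False)
print(json.dumps(mapping, indent=2))
```

With `store_source=False` the documents can still be searched and faceted, but the original JSON can no longer be returned from the index.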
Am I missing something? Or are there any other ways to reduce the size of