Elasticsearch index storage size

Hi guys, a while ago I loaded some data from an Oracle database into Elasticsearch via Logstash, and everything works fine. Recently I loaded the same data from a CSV file, also via Logstash, but there is a huge difference in index size: the index built from the Oracle database is about 800 MB, while the index built from the CSV file is 2.2 GB, even though the data is the same in both sources.
Any idea why this happened, and is there a way to reduce the size of the CSV index?

Logstash version 7.4.0 and Elasticsearch version 7.4.0.

Thank you.

haythem,

I would check the following things:

  • Compare the number of documents in the two indices to see whether one is missing docs or the other has duplicate docs.
  • Compare the mappings of the two indices to see whether the larger index has a lot more fields being analyzed.
  • Compare a few sample documents from each index to confirm they are the same, as you expect.
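
The first two checks can be scripted. Below is a rough Python sketch, not a definitive tool: it assumes an unsecured cluster at `http://localhost:9200` and two concrete index names, `oracle_data` and `csv_data`, which are placeholders and not taken from your setup. It reads the document count and primary store size from `_cat/indices`, then flattens each index's `_mapping` so the field types can be diffed side by side.

```python
import json
from urllib.request import urlopen

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication


def get_json(path):
    """GET a path on the cluster and decode the JSON body."""
    with urlopen(f"{ES_URL}{path}") as resp:
        return json.load(resp)


def index_stats(index):
    """Document counts and primary store size in bytes, from the _cat/indices API."""
    row = get_json(f"/_cat/indices/{index}?format=json&bytes=b")[0]
    return {
        "docs": int(row["docs.count"]),
        "deleted": int(row["docs.deleted"]),
        "pri_bytes": int(row["pri.store.size"]),
    }


def flatten_mapping(properties, prefix=""):
    """Turn a mapping's nested "properties" into {field path: type}."""
    fields = {}
    for name, spec in properties.items():
        path = prefix + name
        if "type" in spec:
            fields[path] = spec["type"]
        # Object fields nest under "properties", multi-fields under "fields".
        for key in ("properties", "fields"):
            fields.update(flatten_mapping(spec.get(key, {}), path + "."))
    return fields


def diff_mappings(a, b):
    """Fields whose type differs or that exist on one side only: {path: (type_a, type_b)}."""
    paths = sorted(set(a) | set(b))
    return {p: (a.get(p), b.get(p)) for p in paths if a.get(p) != b.get(p)}


def index_fields(index):
    """Fetch one index's mapping and flatten it."""
    body = get_json(f"/{index}/_mapping")
    mappings = next(iter(body.values()))["mappings"]
    return flatten_mapping(mappings.get("properties", {}))


def report(index_a, index_b):
    """Print stats for both indices, then every field whose mapping differs."""
    for name in (index_a, index_b):
        print(name, index_stats(name))
    diff = diff_mappings(index_fields(index_a), index_fields(index_b))
    for path, types in diff.items():
        print(path, types)
```

Calling `report("oracle_data", "csv_data")` prints the stats for both indices followed by the fields that differ. If the document counts match but the CSV index shows fields as `text` with a `keyword` sub-field where the Oracle index has `long` or `date`, that would be consistent with the larger size, since values read from a CSV file usually arrive as strings unless they are converted in the pipeline or mapped explicitly.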