I would expect it is because the data is being distributed among shards and nodes differently on each run, since you're using randomly generated IDs for your documents. Compression levels will always vary with the data: you could have 100 consecutive rows of almost identical data that compress well on one run, while on the next run those same rows could be scattered across segments and compress poorly.
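If you want to see how the documents ended up spread out (and how big each shard is) after a given run, the _cat shards API lists per-shard doc counts and store sizes; 'yourindexname' and the host '0:9200' are placeholders here:

    curl '0:9200/_cat/shards/yourindexname?v'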
Have you set shards = 1 / replicas = 0 for your index, to rule out differences
from how documents are routed to shards by ID, and did you run
curl '0:9200/yourindexname/_optimize?max_num_segments=1' after indexing to
merge all segments into one?
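For reference, a minimal sketch of that setup ('yourindexname' and the host '0:9200' are placeholders; add your own mappings to the settings body):

    # create the index with a single primary shard and no replicas
    curl -XPUT '0:9200/yourindexname' -d '{
      "settings": { "number_of_shards": 1, "number_of_replicas": 0 }
    }'

    # ... index your documents ...

    # merge everything into a single segment before comparing index sizes
    curl -XPOST '0:9200/yourindexname/_optimize?max_num_segments=1'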
Whenever I index the data, the index size is different even though I use the
same config and mapping.
What is happening in Elasticsearch? Does anybody know about this? Please give
me a hint...