The mappings are the same; the only difference is the source mode.
The settings are also essentially the same; again, the only difference is related to the source mode change.
The sizes, however, are very different. The indices do not have the same number of documents, but I think the counts are close enough for the two to be compared.
The storage response reports the size of the primaries, so the sizes are roughly 750 MB and 10 GB. There is a difference of 2 million events, but the size differs by more than a factor of 10.
Some fields have a similar size, others are 2 to 3 times bigger, and others are 10 or more times bigger.
The _source of the new index is responsible for almost all of the index size:
"_source": {
  "total": "8.8gb",
  "total_in_bytes": 9487965794,
  "inverted_index": {
    "total": "0b",
    "total_in_bytes": 0
  },
  "stored_fields": "8.8gb",
  "stored_fields_in_bytes": 9487965794,
  "doc_values": "0b",
  "doc_values_in_bytes": 0,
  "points": "0b",
  "points_in_bytes": 0,
  "norms": "0b",
  "norms_in_bytes": 0,
  "term_vectors": "0b",
  "term_vectors_in_bytes": 0,
  "knn_vectors": "0b",
  "knn_vectors_in_bytes": 0
}
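As a rough sanity check on those numbers, here is a small Python sketch. It assumes the two index totals are about 750 MB and 10 GB as estimated above, and that Elasticsearch's "mb"/"gb" are binary units (1024-based); only the _source byte count is an exact value from the response, so treat the results as ballpark figures.

```python
# Rough arithmetic on the figures quoted in this post.
# Assumption: the index totals are rounded estimates, and "mb"/"gb" are
# treated as binary units (1024-based). Only SOURCE_BYTES is an exact value.
OLD_INDEX_BYTES = 750 * 1024**2   # ~750 MB, approximate
NEW_INDEX_BYTES = 10 * 1024**3    # ~10 GB, approximate
SOURCE_BYTES = 9_487_965_794      # "_source" -> "total_in_bytes"

# How many times larger the new index is than the old one.
growth = NEW_INDEX_BYTES / OLD_INDEX_BYTES

# Fraction of the new index taken up by the stored _source.
source_share = SOURCE_BYTES / NEW_INDEX_BYTES

# The _source size converted to GiB, to compare with the "8.8gb" string.
source_gib = SOURCE_BYTES / 1024**3

print(f"index growth:  {growth:.1f}x")
print(f"_source share: {source_share:.0%}")
print(f"_source size:  {source_gib:.1f} GiB")
```

With these assumed totals, the stored _source alone accounts for the large majority of the new index, which matches the observation above.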
Unless there is a bug somewhere, I would consider this increase totally unexpected. I would expect the index to go from 750 MB to maybe 2 GB, or even 2.5 GB, more than doubling, but going from 750 MB to 10 GB is not expected.
Maybe @stephenb can get some internal insight on this, but the difference is so big that it could lead someone to ditch Elasticsearch and look for another tool.
Do you have logs data streams or only metrics? Can you share a similar comparison for logs data streams?