This is the actual (and, in my opinion, valid) comparison, using a different index with similar data.
Plain reindex, no mapping changes, both indices force-merged, and still a +25% size increase on _source.
Reindexing command
POST /_reindex?slices=auto&pretty
{
  "source": {
    "index": "srcindex"
  },
  "dest": {
    "index": "identmap"
  }
}

POST /identmap/_forcemerge?max_num_segments=1&pretty
{
  "_shards" : {
    "total" : 6,
    "successful" : 6,
    "failed" : 0
  }
}
Index settings+mappings diff
@@ -1,5 +1,5 @@
-GET /srcindex?pretty
+GET /identmap?pretty
{
- "srcindex" : {
+ "identmap" : {
"aliases" : { },
"mappings" : {
@@ -243,5 +243,5 @@
"lifecycle" : {
"name" : "SIEM-LOG",
- "rollover_alias" : ""
+ "origination_date" : "1640390409363"
},
"routing" : {
@@ -258,6 +258,6 @@
"read" : "false"
},
- "provided_name" : "srcindex",
- "creation_date" : "1640390409363",
+ "provided_name" : "identmap",
+ "creation_date" : "1651742088307",
"unassigned" : {
"node_left" : {
@@ -275,7 +275,7 @@
"priority" : "50",
"number_of_replicas" : "1",
- "uuid" : "xxxxxxxxxxxxxxxxxxxxxx",
+ "uuid" : "aaaaaaaaaaaaaaaaaaaaaa",
"version" : {
- "created" : "7150299"
+ "created" : "7160199"
}
}
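Apart from names, dates and UUIDs, the only substantive difference in the diff is "version.created". A minimal sketch for reading those ids, assuming the usual 7.x convention in which the id packs major, minor, revision and a two-digit build marker as decimal digits (this encoding is my assumption, it is not stated anywhere in this thread):

```python
def decode_version_id(version_id):
    """Split an id such as 7150299 into (major, minor, revision, build).

    Assumes the layout major*1_000_000 + minor*10_000 + revision*100 + build.
    """
    major, rest = divmod(version_id, 1_000_000)
    minor, rest = divmod(rest, 10_000)
    revision, build = divmod(rest, 100)
    return major, minor, revision, build


# "version.created" values copied from the diff above.
for name, created in (("srcindex", 7150299), ("identmap", 7160199)):
    major, minor, revision, build = decode_version_id(created)
    print(f"{name}: created on {major}.{minor}.{revision} (build marker {build})")
```

If the assumption holds, the two indices were created on different minor releases, which is worth keeping in mind when reading the numbers below.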
Disk usage
Comparison by field:
field      srcindex [MB]  identmap [MB]  increase [MB]
_seq_no             2149           1587           -562
<snip>                                        -2 to +2
timestamp           2899           2910             11
_id                 5505           5819            314
_source            33729          42095           8366
total              67077          75204           8127
Comparison by type:
type            srcindex [MB]  identmap [MB]  increase [MB]
points                   5409           4974           -435
doc_values              14817          14699           -118
inverted_index          11845          11842             -3
norms                       0              0              0
term_vectors                0              0              0
stored_fields           35005          43688           8683
total                   67077          75204           8127
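For reference, the "+25%" figure can be reproduced from the by-field numbers. This is only a small sketch of the arithmetic; the dictionary and the helper name relative_change are mine, the values are copied from the table above:

```python
# (srcindex MB, identmap MB) per field, copied from the "Comparison by field" table.
sizes_mb = {
    "_seq_no": (2149, 1587),
    "timestamp": (2899, 2910),
    "_id": (5505, 5819),
    "_source": (33729, 42095),
    "total": (67077, 75204),
}


def relative_change(before, after):
    """Return the change from before to after as a percentage of before."""
    return (after - before) / before * 100.0


for field, (src, dst) in sizes_mb.items():
    print(f"{field:10s} {dst - src:+6d} MB  {relative_change(src, dst):+6.1f}%")
```

The _source row comes out at roughly +24.8%, and the whole index at roughly +12%.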
_cat segments
srcindex (original):
/_cat/segments/srcindex?h=index,shard,prirep,segment,generation,docs.count,docs.deleted,size,size.memory,committed,searchable,version,compound&v
index    shard prirep segment generation docs.count docs.deleted size   size.memory committed searchable version compound
srcindex 0     p      _xxaa   341509     134835092  0            21.8gb 21404       true      true       8.9.0   false
srcindex 0     r      _xxaa   341509     134835092  0            21.8gb 21404       true      true       8.9.0   false
srcindex 1     p      _xxbb   342018     134812277  0            21.8gb 21404       true      true       8.9.0   false
srcindex 1     r      _xxbb   342018     134812277  0            21.8gb 21404       true      true       8.9.0   false
srcindex 2     p      _xxcc   341409     134841646  0            21.8gb 21404       true      true       8.9.0   false
srcindex 2     r      _xxcc   341409     134841646  0            21.8gb 21404       true      true       8.9.0   false
identmap (reindexed):
/_cat/segments/identmap?h=index,shard,prirep,segment,generation,docs.count,docs.deleted,size,size.memory,committed,searchable,version,compound&v
index    shard prirep segment generation docs.count docs.deleted size   size.memory committed searchable version compound
identmap 0     p      _aa     697        134835092  0            24.4gb 140556      true      true       8.11.1  false
identmap 0     r      _aa     697        134835092  0            24.4gb 140556      true      true       8.11.1  false
identmap 1     p      _bb     727        134812277  0            24.4gb 140556      true      true       8.11.1  false
identmap 1     r      _bb     727        134812277  0            24.4gb 140556      true      true       8.11.1  false
identmap 2     p      _cc     699        134841646  0            24.4gb 140556      true      true       8.11.1  false
identmap 2     r      _cc     699        134841646  0            24.4gb 140556      true      true       8.11.1  false
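Since docs.count is identical for each shard in both indices, the segment sizes can be turned into bytes per document. A rough sketch, assuming the _cat "gb" suffix means 1024**3 bytes and keeping in mind that the sizes are rounded to one decimal; the helper parse_size is hypothetical, not part of any client library:

```python
# Suffixes ordered longest first so that "gb" is matched before the bare "b".
UNITS = {"tb": 1024**4, "gb": 1024**3, "mb": 1024**2, "kb": 1024, "b": 1}


def parse_size(text):
    """Convert a _cat style size such as '21.8gb' to bytes."""
    text = text.strip().lower()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * factor
    raise ValueError(f"unrecognised size: {text!r}")


# (size, docs.count) of the shard 0 primary of each index, copied from the listings above.
segments = {
    "srcindex": ("21.8gb", 134835092),
    "identmap": ("24.4gb", 134835092),
}

bytes_per_doc = {
    name: parse_size(size) / docs for name, (size, docs) in segments.items()
}
for name, value in bytes_per_doc.items():
    print(f"{name}: {value:.1f} bytes per document")

growth = bytes_per_doc["identmap"] / bytes_per_doc["srcindex"] - 1
print(f"growth per document: {growth:+.1%}")
```

The per-document growth lands at about 12%, in line with the "total" row of the disk usage comparison.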
I am sorry for all the previous unnecessary messages. I should have done all this triage before posting.