While migrating from ES2 to ES6, the index is taking more space


(Manish mitra) #1

Hi,

We are upgrading our Elastic Stack from Elasticsearch 2.3.4 to Elasticsearch 6.2.

When we migrate an index from ES2 to ES6, the migrated index takes more space in ES6. We migrated expecting that ES6 would take less space. We need help confirming whether we are following the right path.

Steps: We take a snapshot in ES2 and restore it in ES5. After reindexing there, we take a snapshot in ES5 and restore it in ES6.
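For readers following along, the steps above can be sketched as a sequence of REST calls. This is only an outline under assumptions: the repository names (`es2_backup`, `es5_backup`), the snapshot name `snap_1` and the index names are hypothetical, each repository is assumed to be already registered on the clusters that use it, and the helper only builds the requests instead of sending them.

```python
def migration_requests(index, reindexed_index,
                       es2_repo="es2_backup", es5_repo="es5_backup"):
    """Return the REST calls for the ES2 -> ES5 -> ES6 path as
    (cluster, method, path, body) tuples. Nothing is sent anywhere;
    repository and snapshot names are placeholders."""
    return [
        # 1. Snapshot the index on the ES 2.x cluster.
        ("es2", "PUT",
         f"/_snapshot/{es2_repo}/snap_1?wait_for_completion=true",
         {"indices": index}),
        # 2. Restore that snapshot on the ES 5.x cluster.
        ("es5", "POST",
         f"/_snapshot/{es2_repo}/snap_1/_restore",
         {"indices": index}),
        # 3. Reindex on ES 5.x into a new index created by 5.x.
        ("es5", "POST", "/_reindex",
         {"source": {"index": index}, "dest": {"index": reindexed_index}}),
        # 4. Snapshot the reindexed index on ES 5.x.
        ("es5", "PUT",
         f"/_snapshot/{es5_repo}/snap_1?wait_for_completion=true",
         {"indices": reindexed_index}),
        # 5. Restore the reindexed index on the ES 6.x cluster.
        ("es6", "POST",
         f"/_snapshot/{es5_repo}/snap_1/_restore",
         {"indices": reindexed_index}),
    ]


# Hypothetical index names, only to show the shape of the output.
for cluster, method, path, body in migration_requests(
        "example-2018.04.20", "example-2018.04.20-v5"):
    print(cluster, method, path, body)
```

Note that the reindex in step 3 builds the destination index with whatever mapping applies on ES5 at that moment (an explicit mapping, a matching template, or dynamic mapping), so that is one place where the index layout can change.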

After reindexing in ES5, an index of 16 MB (in ES2) takes 18 MB in ES5. When we restore the reindexed index in ES6, it still takes 18 MB.

Can anyone help explain why it is taking more space, or whether I am doing something wrong? When we create a fresh index in ES6, it takes less space.


(David Pilato) #2

What is your mapping for both versions?


(Manish mitra) #3

You want to know the templates that I am using for both versions.

Template for ES 2.3.4

{
  "aliases": {},
  "mappings": {
    "default": {
      "_all": {
        "enabled": true,
        "omit_norms": true
      },
      "dynamic_templates": [
        {
          "string_fields": {
            "mapping": {
              "fielddata": { "format": "disabled" },
              "fields": {
                "raw": {
                  "index": "not_analyzed",
                  "type": "string"
                }
              },
              "index": "analyzed",
              "omit_norms": true,
              "type": "string"
            },
            "match": "*",
            "match_mapping_type": "string"
          }
        }
      ],
      "properties": {
        "@timestamp": { "type": "date" },
        "@version": {
          "index": "not_analyzed",
          "type": "string"
        },
        "geoip": {
          "dynamic": true,
          "properties": {
            "ip": { "type": "ip" },
            "latitude": { "type": "float" },
            "location": { "type": "geo_point" },
            "longitude": { "type": "float" }
          }
        }
      }
    }
  },
  "order": 0,
  "settings": {
    "index": {
      "refresh_interval": "5s"
    }
  },
  "template": "example-*"
}

Template for ES 6

{
  "order": 0,
  "version": 60001,
  "index_patterns": [
    "example-*"
  ],
  "settings": {
    "index": {
      "codec": "best_compression",
      "refresh_interval": "5s"
    },
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "doc": {
      "dynamic_templates": [
        {
          "message_field": {
            "path_match": "message",
            "match_mapping_type": "string",
            "mapping": {
              "type": "text",
              "norms": false
            }
          }
        },
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "text",
              "norms": false,
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },
        {
          "integers": {
            "match_mapping_type": "long",
            "mapping": { "type": "integer" }
          }
        }
      ],
      "properties": {
        "@timestamp": { "type": "date" },
        "@version": { "type": "keyword" },
        "geoip": {
          "dynamic": true,
          "properties": {
            "ip": { "type": "ip" },
            "location": { "type": "geo_point" },
            "latitude": { "type": "half_float" },
            "longitude": { "type": "half_float" }
          }
        }
      }
    }
  }
}


(David Pilato) #4

Please format your code, logs or configuration files using the </> icon as explained in this guide and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.

Could you share your mapping, not the templates please?
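As a rough illustration of why the two can differ: a template is only consulted when a new index is created and its name matches the template pattern, whereas the mapping (`GET /<index>/_mapping`) is what a given index actually ended up with. The sketch below checks whether a template pattern would apply to an index name. It assumes `*` is the only wildcard, and the index names are made up for the example.

```python
import re


def pattern_matches(pattern, index_name):
    """Return True if index_name matches a template pattern in which
    '*' is the only wildcard (an assumption of this sketch)."""
    # Escape the literal parts and let each '*' match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, index_name) is not None


def templates_applying(index_name, templates):
    """Names of the templates whose patterns match index_name.
    `templates` maps a template name to its list of patterns."""
    return [name for name, patterns in templates.items()
            if any(pattern_matches(p, index_name) for p in patterns)]


# Hypothetical template and index names.
templates = {"example": ["example-*"]}
print(templates_applying("example-2018.04.20", templates))
print(templates_applying("other-2018.04.20", templates))
```

An index whose name matches no template gets its mapping from whatever was supplied at creation time plus dynamic mapping, which is why the mapping itself is the thing to look at.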


(Manish mitra) #5
You want to know the templates that I am using for both versions.

Template for ES 2.3.4

{
  "aliases": {},
  "mappings": {
    "default": {
      "_all": {
        "enabled": true,
        "omit_norms": true
      },
      "dynamic_templates": [
        {
          "string_fields": {
            "mapping": {
              "fielddata": { "format": "disabled" },
              "fields": {
                "raw": {
                  "index": "not_analyzed",
                  "type": "string"
                }
              },
              "index": "analyzed",
              "omit_norms": true,
              "type": "string"
            },
            "match": "*",
            "match_mapping_type": "string"
          }
        }
      ],
      "properties": {
        "@timestamp": { "type": "date" },
        "@version": {
          "index": "not_analyzed",
          "type": "string"
        },
        "geoip": {
          "dynamic": true,
          "properties": {
            "ip": { "type": "ip" },
            "latitude": { "type": "float" },
            "location": { "type": "geo_point" },
            "longitude": { "type": "float" }
          }
        }
      }
    }
  },
  "order": 0,
  "settings": {
    "index": {
      "refresh_interval": "5s"
    }
  },
  "template": "example-*"
}

Template for ES 6

{
  "order": 0,
  "version": 60001,
  "index_patterns": [
    "example-*"
  ],
  "settings": {
    "index": {
      "codec": "best_compression",
      "refresh_interval": "5s"
    },
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "doc": {
      "dynamic_templates": [
        {
          "message_field": {
            "path_match": "message",
            "match_mapping_type": "string",
            "mapping": {
              "type": "text",
              "norms": false
            }
          }
        },
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "text",
              "norms": false,
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        },
        {
          "integers": {
            "match_mapping_type": "long",
            "mapping": { "type": "integer" }
          }
        }
      ],
      "properties": {
        "@timestamp": { "type": "date" },
        "@version": { "type": "keyword" },
        "geoip": {
          "dynamic": true,
          "properties": {
            "ip": { "type": "ip" },
            "location": { "type": "geo_point" },
            "latitude": { "type": "half_float" },
            "longitude": { "type": "half_float" }
          }
        }
      }
    }
  }
}


(Manish mitra) #6
Sorry, I attached the templates.

I am attaching the mappings.

// ***************** Mapping in ES 6 Start ****************

match => {"tmpmessage" => [ "^\s*<%{NUMBER:syslog_pri}>(?:%{NUMBER})?\s*%{GREEDYDATA:tmpmessage}",
"^\s*%{TIMESTAMP_ISO8601:eventtime}\s*%{GREEDYDATA:tmpmessage}",
"^\s*%{DATESTAMP:eventtime}\s*%{GREEDYDATA:tmpmessage}",
"^\s*%{SYSLOGTIMESTAMP:eventtime}\s*%{GREEDYDATA:tmpmessage}",
"^\s*"%{TIMESTAMP_ISO8601:eventtime}"\s*%{GREEDYDATA:tmpmessage}",
#"^\stype=%{WORD:audit_type}\s+msg=audit(%{NUMBER:audit_epoch}:%{NUMBER:audit_counter}):\s%{GRE$
"^\stype=%{WORD:audit_type}(?:[%{NUMBER}])?\s+msg=audit(%{NUMBER:audit_epoch}:%{NUMBER:audit_cou$
"^\s*%{TIME}\s*%{NOTSPACE}\s*<%{NUMBER:syslog_pri}>\s*%{GREEDYDATA:tmpmessage}",
"^\s*{"message":"%{TIMESTAMP_ISO8601:eventtime}\s+%{GREEDYDATA:tmpmessage}",
"^\s*{"message":"<%{NUMBER}>%{TIMESTAMP_ISO8601:eventtime}\s+%{IPORHOST}\s+%{DATA:processname}$
"^\s*{"message":"%{WORD}\s*%{TIME:eventtime}%{GREEDYDATA:tmpmessage}",
"^\s*{"message":"<%{NUMBER}>%{SYSLOGTIMESTAMP:eventtime}\s*%{GREEDYDATA:tmpmessage}",
"^\s*%{IPORHOST:ramoword}\s*%{GREEDYDATA:tmpmessage}",
"^\s*%{NUMBER:ramonumber}\s*%{GREEDYDATA:tmpmessage}"
]}
overwrite => [ "tmpmessage" ]

// ***************** Mapping in ES 6 End ****************

// ***************** Mapping in ES 2 Start ****************
match => {"tmpmessage" => [
# "\s*(?:%{TIMESTAMPA}|%{SYSLOGTIMESTAMP})s*%{PROG:processname}(?:[%{POSINT:procid}]:)?\s*(?:[%{NOTSPACE:procid}])?$
"^\s*%{SYSLOGTIMESTAMP}\s*%{IPORHOST}\s*%{NOTSPACE:ModuleName}\s*%{TIMESTAMP_ISO8601}\s*%{POSINT:procid}\s*%{LOGLEVEL:L$
"^\s*%{SYSLOGTIMESTAMP}\s*%{IPORHOST}\s*%{NOTSPACE:ModuleName}\s*%{TIMESTAMP_ISO8601}\s*%{NUMBER:hostname_Porta}\s*%{LO$
"^\s*%{SYSLOGTIMESTAMP}\s*%{IPORHOST}\s*%{NOTSPACE:ModuleName}\s*%{TIMESTAMP_ISO8601}\s*%{NUMBER:hostname_Porta}\s*%{LO$
"^\s*%{SYSLOGTIMESTAMP}\s*%{IPORHOST}\s*%{DATA:processname}[%{NUMBER:procid}]:\s+%{GREEDYDATA:msg}",
"^\s*(?:%{POSINT}) %{TIMESTAMP_ISO8601} %{DATA}\s*%{WORD:processname} %{POSINT:procid} %{WORD:event_name}(%{NUMBER:eve$
#RegEx for Necluster
"^\s*%{TIMESTAMP_ISO8601}\s*%{IPORHOST}\s*%{NOTSPACE:processname}[%{NUMBER:procid}](?::)?\s*%{TIMESTAMP_ISO8601}\s*%$
#RegEx for CEE
"^\s*%{TIMESTAMP_ISO8601}\s*%{IPORHOST}\s*%{NOTSPACE:processname}[%{NUMBER:procid}](?::)?\s*[%{NOTSPACE:ModuleName}$
"^\s*(?:%{TIMESTAMPA}|%{SYSLOGTIMESTAMP})\s*%{IPORHOST}\s*%{NOTSPACE:processname}[%{NUMBER:procid}](?::)?\s*(?:[)?%$
"^\s*%{IPORHOST}\s*%{DATA}:\s*%{SYSLOGTIMESTAMP}\s*:\s*%{NOTSPACE:processname}(?:[%{NUMBER:procid}]):\s*%{NOTSPACE}\s$
"^\s*%{GREEDYDATA}%{TIMESTAMP_ISO8601}\s+|\s+%{LOGLEVEL:LogLevel}\s+|\s+%{NOTSPACE:Thread}\s+|\s+%{NOTSPACE:processn$
"^\s*%{TIMESTAMPA}\s*%{NOTSPACE:uuid}\s*%{NOTSPACE:job}:\s*[\s*%{GREEDYDATA:msg}]",
"^%{MONTH}\s+%{MONTHDAY}\s+%{TIME}\s+%{IPORHOST:ip_client}:%{NUMBER:port_client}\s+%{GREEDYDATA:msg}"

// ***************** Mapping in ES 2 End ****************


(David Pilato) #7

This is not the mapping. This is GROK configuration.
And your code is still not formatted.


(Manish mitra) #8

Sorry for the delayed reply. I am attaching the mappings I got by running the curl command below.
curl -XGET 'localhost:9200/unicalog1-mt-2018.04.20/_mapping'

//**************************Mapping of index in ES2 Start ******************************//

{
  "unicalog-mt-2018.04.20": {
    "mappings": {
      "logs": {
        "_all": {"enabled": true, "omit_norms": true},
        "dynamic_templates": [
          {
            "string_fields": {
              "mapping": {
                "fielddata": {"format": "disabled"},
                "index": "analyzed",
                "omit_norms": true,
                "fields": {"raw": {"index": "not_analyzed", "type": "string"}},
                "type": "string"
              },
              "match": "*",
              "match_mapping_type": "string"
            }
          }
        ],
        "properties": {
          "@timestamp": {"type": "date", "format": "strict_date_optional_time||epoch_millis"},
          "@version": {"type": "string", "index": "not_analyzed"},
          "CORRELATION_ID": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "comp_level": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "dc_name": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "file": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "geoip": {
            "dynamic": "true",
            "properties": {
              "ip": {"type": "ip"},
              "latitude": {"type": "float"},
              "location": {"type": "geo_point"},
              "longitude": {"type": "float"}
            }
          },
          "host": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "hostname": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "log_family": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "log_type": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "msg": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "path": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "platform": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "processname": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "text": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
          "timedelta": {"type": "string", "norms": {"enabled": false}, "fielddata": {"format": "disabled"}, "fields": {"raw": {"type": "string", "index": "not_analyzed"}}}
        }
      },
      "default": {
        "_all": {"enabled": true, "omit_norms": true},
        "dynamic_templates": [
          {
            "string_fields": {
              "mapping": {
                "fielddata": {"format": "disabled"},
                "index": "analyzed",
                "omit_norms": true,
                "fields": {"raw": {"index": "not_analyzed", "type": "string"}},
                "type": "string"
              },
              "match": "*",
              "match_mapping_type": "string"
            }
          }
        ],
        "properties": {
          "@timestamp": {"type": "date", "format": "strict_date_optional_time||epoch_millis"},
          "@version": {"type": "string", "index": "not_analyzed"},
          "geoip": {
            "dynamic": "true",
            "properties": {
              "ip": {"type": "ip"},
              "latitude": {"type": "float"},
              "location": {"type": "geo_point"},
              "longitude": {"type": "float"}
            }
          }
        }
      }
    }
  }
}

//**************************Mapping of index in ES2 End ******************************//

//**************************Mapping of index in ES6 Start ******************************//

{
  "unicalog1-mt-2018.04.20": {
    "mappings": {
      "doc": {
        "properties": {
          "@timestamp": {"type": "date"},
          "@version": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "CORRELATION_ID": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "comp_level": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "dc_name": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "file": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "hostname": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "log_family": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "log_type": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "msg": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "path": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "platform": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "processname": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "text": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
          "timedelta": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}}
        }
      }
    }
  }
}

//**************************Mapping of index in ES6 End ******************************//
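One way to compare the two mappings above is to flatten each `properties` block into a list of field paths and diff them. The following is a rough sketch, not a definitive tool: the helper names are made up, it only looks at `type` and multi-fields (not at norms or other options), and the sample dictionaries are trimmed copies of a few fields from the mappings posted above.

```python
def flatten(properties, prefix=""):
    """Map each dotted field path (including multi-fields) to its type."""
    fields = {}
    for name, spec in properties.items():
        path = prefix + name
        # Object fields carry nested "properties" and no "type" of their own.
        if "type" not in spec and "properties" in spec:
            fields.update(flatten(spec["properties"], path + "."))
            continue
        fields[path] = spec.get("type", "object")
        # Multi-fields such as "raw" or "keyword" are indexed separately.
        for sub_name, sub_spec in spec.get("fields", {}).items():
            fields[f"{path}.{sub_name}"] = sub_spec.get("type", "object")
    return fields


def compare(old_properties, new_properties):
    """Report fields that exist on one side only or changed type."""
    old, new = flatten(old_properties), flatten(new_properties)
    return {
        "only_in_old": sorted(set(old) - set(new)),
        "only_in_new": sorted(set(new) - set(old)),
        "changed": {k: (old[k], new[k])
                    for k in sorted(set(old) & set(new)) if old[k] != new[k]},
    }


# Trimmed samples taken from the two mappings in this post.
es2_props = {
    "@version": {"type": "string", "index": "not_analyzed"},
    "msg": {"type": "string", "norms": {"enabled": False},
            "fielddata": {"format": "disabled"},
            "fields": {"raw": {"type": "string", "index": "not_analyzed"}}},
    "geoip": {"dynamic": "true", "properties": {"ip": {"type": "ip"}}},
}
es6_props = {
    "@version": {"type": "text",
                 "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
    "msg": {"type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
}

print(compare(es2_props, es6_props))
```

On these trimmed samples the report shows that `@version` gained a `keyword` sub-field on the ES6 side and that `geoip.ip` only exists on the ES2 side.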


(system) closed #9

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.