Trying to reduce the number of mapped fields before migrating from Elasticsearch 2.4

Hi,

We are still running Elasticsearch 2.4 and plan to migrate to 7.x in the near future. To prepare for the upgrade we ran the migration plugin and found that our current mapping is a problem: we have indices with a few thousand fields, and some with a bit over 10k.

We analyzed our mapping and saw that almost all of the fields we index use the same set of sub-properties.

Currently our mapping looks like the following:

{
    "documents": {
        "properties": {
            "Field 1": {
                "properties": {
                    "editable": {
                        "type": "boolean"
                    },
                    "name": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "possibleValues": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "type": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "value": {
                        "type": "string",
                        "index": "not_analyzed"
                    }
                }
            },
            "Field 2": {
                "properties": {
                    "editable": {
                        "type": "boolean"
                    },
                    "name": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "possibleValues": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "type": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "value": {
                        "type": "date",
                        "format": "strict_date_optional_time||epoch_millis"
                    }
                }
            }
        }
    }
}

Our idea is to change it like this to prevent a mapping explosion:

{
    "documents": {
        "properties": {
            "Fields": {
                "properties": {
                    "id": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "editable": {
                        "type": "boolean"
                    },
                    "name": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "possibleValues": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "type": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "value": {
                        "type": "string",
                        "index": "not_analyzed"
                    },
                    "date_value": {
                        "type": "date",
                        "format": "strict_date_optional_time||epoch_millis"
                    },
                    "numeric_value": {
                        "type": "double"
                    }
                }
            }
        }
    }
}
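
To make the proposed shape concrete, here is a rough sketch of how one of our documents might be reshaped before reindexing. The function name `to_fields_list`, the `date_fields` parameter and the sample values are made up for illustration. It assumes a multi-valued field arrives as a list and is written as one entry per value with the same id. Only the split between "value" and "date_value" is shown.

```python
from typing import Any, Dict, Iterable, List


def to_fields_list(old_doc: Dict[str, Dict[str, Any]],
                   date_fields: Iterable[str] = ()) -> Dict[str, List[Dict[str, Any]]]:
    """Reshape {"Field 1": {...}, ...} into {"Fields": [{"id": "Field 1", ...}, ...]}."""
    date_ids = set(date_fields)
    entries: List[Dict[str, Any]] = []
    for field_id, props in old_doc.items():
        # Copy everything except the value itself; the value is routed below.
        base = {k: v for k, v in props.items() if k != "value"}
        base["id"] = field_id
        raw = props.get("value")
        # Assumption: a list means a multi-valued field, so emit one entry per value.
        values = raw if isinstance(raw, list) else [raw]
        # Dates go to the typed sub-field, everything else stays in "value".
        target = "date_value" if field_id in date_ids else "value"
        for v in values:
            entry = dict(base)
            if v is not None:
                entry[target] = v
            entries.append(entry)
    return {"Fields": entries}


# Hypothetical sample document in the current (one object per field) shape.
old_doc = {
    "Field 1": {"editable": True, "name": "Color", "type": "text", "value": ["red", "blue"]},
    "Field 2": {"editable": False, "name": "Due", "type": "date", "value": "2019-05-01"},
}
new_doc = to_fields_list(old_doc, date_fields={"Field 2"})
```

With this sketch, "Field 1" ends up as two entries that share the same id, which is exactly the case we are asking about.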

We would then use bool queries that filter on Fields.id. Is this a suitable solution, or would it cause additional problems? Also, some fields can have multiple values. Is it a problem to have the same id twice with such a mapping?
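
For reference, this is roughly the kind of query body we have in mind, written as a small Python helper that builds the request as a dict. The helper name `field_equals_query` and the `nested` switch are our own illustration, not something we run today. The `nested` variant assumes "Fields" would be mapped with "type": "nested", which the mapping above does not do.

```python
from typing import Any, Dict


def field_equals_query(field_id: str, value: Any, nested: bool = False) -> Dict[str, Any]:
    """Build a search body matching documents where field `field_id` has `value`."""
    both: Dict[str, Any] = {
        "bool": {
            "filter": [
                {"term": {"Fields.id": field_id}},
                {"term": {"Fields.value": value}},
            ]
        }
    }
    if nested:
        # Assumption: only valid if "Fields" is mapped as "type": "nested".
        both = {"nested": {"path": "Fields", "query": both}}
    return {"query": both}


body = field_equals_query("Field 1", "red")
```

As far as we understand, arrays of plain objects are flattened at index time, so without the nested variant the two term clauses could be satisfied by different entries of the same document. That is one of the things we would like to have confirmed.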
