Advice on creating a dynamic mapping for a very dynamic object

Hi,

We're building a case management system with ES as its backend for storing case data.

One of the issues/requirements we have is the following:

The "case" type contains an object field called "file". The content of this "file" field is very dynamic, but needs to be searchable. We don't know the contents up front, so we can't make a mapping for it. Currently our mapping for the case type looks like this:

```
{"case": {"properties": {
    "definition": {
        "type": "string",
        "index": "not_analyzed"
    },
    "file": {
        "type": "object",
        "enabled": false
    },
    "id": {
        "type": "string",
        "index": "not_analyzed"
    },
    "parentCaseId": {
        "type": "string",
        "index": "not_analyzed"
    },
    "plan": {
        "type": "nested",
        "properties": {"items": {
            "type": "nested",
            "properties": {
                "caseInstanceId": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "currentState": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "historyState": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "id": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "isRepeating": {"type": "boolean"},
                "isRequired": {"type": "boolean"},
                "lastModified": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "name": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "stageId": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "transition": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "type": {
                    "type": "string",
                    "index": "not_analyzed"
                },
                "user": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }}
    },
    "rootCaseId": {
        "type": "string",
        "index": "not_analyzed"
    }
}}}
```

I'd like some advice on how to create a mapping for the "file" object that makes it searchable but still keeps it dynamic. Each instance of a "case" document can contain completely different content in "file".

Any help on this would be great.

Kind regards,

Danny

Well, setting "enabled" to false means you cannot search on it; see the "enabled" page in the Elasticsearch Guide [8.11].

But otherwise, if you just set it as an object type as you have, then you should be OK. The "Object field type" page in the Elasticsearch Guide [8.11] might help as well.
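
As a rough illustration of that suggestion, here is a minimal sketch that builds a mapping body in which "file" stays a dynamic object and is no longer disabled. It assumes the older "string" / "not_analyzed" syntax used in the mapping above, and it assumes exact-match search on string values is what is wanted. The dynamic template name "file_strings" and the helper build_case_mapping are made up for this example.

```python
import json


def build_case_mapping():
    """Return a mapping body where "file" is searchable and stays dynamic.

    Assumption: old-style (1.x/2.x) mapping syntax, as in the original post.
    """
    return {
        "case": {
            "dynamic_templates": [
                {
                    # Hypothetical template name; any unique name works.
                    "file_strings": {
                        # Apply to fields that live under "file".
                        "path_match": "file.*",
                        "match_mapping_type": "string",
                        # Assumption: exact-match search is wanted, so new
                        # string fields are indexed as not_analyzed.
                        "mapping": {"type": "string", "index": "not_analyzed"},
                    }
                }
            ],
            "properties": {
                # No "enabled": false here, so the contents get indexed.
                "file": {"type": "object", "dynamic": True},
            },
        }
    }


print(json.dumps(build_case_mapping(), indent=2))
```

The other fields of the "case" type are left out for brevity; they would sit next to "file" under "properties" unchanged.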

Hi,

Thanks for the response, but this does not solve the issue. The problem is that this is a case management application (based on CMMN), where the case data can be completely different from case to case. E.g., the case data for a social benefits application is completely different from the data for a case that handles a leave request.

Due to the dynamic mapping feature of ES, we end up with a "mapping mess".

So what I'm thinking of doing now is the following: there's a one-to-one relation between the case data and the case instance (the case type). Maybe we can create a specific mapping for the case data structure of each type of case model, and join the two types into one "object" in the application when needed. The case management application exposes all its data via a REST API, so for front-end applications this is not an issue. As long as the case data type has the ID of the case type in its data, we're OK and application-side joins are easy to do. And since we're mainly getting the data by ID, fetching the data should be fast.
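
To make the application-side join concrete, here is a minimal sketch. The two dictionaries are in-memory stand-ins for the two Elasticsearch types; a real implementation would replace the lookups with get-by-ID requests. The names CASES, CASE_DATA, load_case and the sample fields are hypothetical, and the sketch assumes the case data document is stored under the same ID as its case.

```python
# In-memory stand-ins for the two types (hypothetical sample documents).
CASES = {
    "c-1": {"id": "c-1", "definition": "leave-request", "rootCaseId": "c-1"},
}
CASE_DATA = {
    # Assumption: the case data document reuses the ID of its case.
    "c-1": {"caseId": "c-1", "employee": "jdoe", "days": 3},
}


def load_case(case_id):
    """Fetch the case and its data by ID and join them into one object."""
    case = CASES.get(case_id)
    if case is None:
        return None
    joined = dict(case)  # shallow copy so the stored document is not mutated
    # Attach the case data under "file"; an empty dict if none exists yet.
    joined["file"] = CASE_DATA.get(case_id, {})
    return joined


print(load_case("c-1"))
```

Because both lookups are by ID, the join costs two direct fetches rather than a search.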

I was wondering whether parent-child relations would be of any use in this scenario.
Also, how would we handle updates to the case data model (i.e. changing the mapping if the case needs to store more data)?
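
On the mapping update question, one relevant constraint is that Elasticsearch allows new fields to be added to an existing mapping, but does not allow the type of an existing field to be changed in place; that needs a new index and a reindex. The sketch below shows one way an application could check a model change against that rule before applying it. The helper plan_mapping_update and the sample fields are hypothetical, it only looks at a flat "properties" dict, and it treats any difference in an existing field as a conflict, which is stricter than Elasticsearch itself.

```python
def plan_mapping_update(existing, desired):
    """Split a desired "properties" dict into safe additions and conflicts."""
    additions, conflicts = {}, {}
    for field, spec in desired.items():
        if field not in existing:
            # New fields can be added to an existing mapping.
            additions[field] = spec
        elif existing[field] != spec:
            # Simplification: any change to an existing field is a conflict.
            conflicts[field] = {"existing": existing[field], "desired": spec}
    return additions, conflicts


# Hypothetical case data model before and after a change.
existing = {"days": {"type": "long"}}
desired = {
    "days": {"type": "string", "index": "not_analyzed"},
    "reason": {"type": "string", "index": "not_analyzed"},
}

additions, conflicts = plan_mapping_update(existing, desired)
print("can be added in place:", sorted(additions))
print("needs a new index and reindex:", sorted(conflicts))
```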

Regards,

Danny