We are running into the field limits of the mapping, and I am not sure which way to go, as there are multiple possible solutions.
I will start by describing the use case:
- there are multiple tenants, which each have their own Elasticsearch instance and a single index
- there are multiple object types (currently ranging from 10-500 per tenant)
- each object type has multiple attributes, unique to that object type (currently ranging from 10 to 2000 per object type)
- the number of documents per object type varies widely: some have only 10-100, some have 500,000
In older Elasticsearch versions, this was done by creating a separate mapping per object type and storing each attribute in that mapping. In the current versions, however, an index has only a single mapping.
We implemented this by creating a unique property key, based on object type and attribute name, and storing these keys in the single mapping of the tenant index.
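To make this concrete, here is a minimal sketch of how such keys could be built. It is not our actual code: the `__` separator, the function names and the example object types are made up for illustration.

```python
def property_key(object_type, attribute):
    """Build a field name that is unique within the tenant index.

    The "__" separator is a hypothetical choice for this sketch.
    """
    return f"{object_type}__{attribute}"


def build_flat_mapping(object_types):
    """Turn {object type: {attribute: field type}} into one flat mapping."""
    properties = {}
    for object_type, attributes in object_types.items():
        for attribute, field_type in attributes.items():
            properties[property_key(object_type, attribute)] = {"type": field_type}
    return {"properties": properties}


# Hypothetical example data: two object types with two attributes each.
example = {
    "invoice": {"number": "keyword", "total": "double"},
    "customer": {"name": "text", "number": "keyword"},
}
flat_mapping = build_flat_mapping(example)

# Every (object type, attribute) pair becomes its own top-level field,
# so the field count grows with the total number of attributes.
field_count = len(flat_mapping["properties"])
```

With 100 object types and 2000 attributes in total, this gives about 2000 top-level fields in one mapping.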
This causes issues, as the total number of fields in the mapping will exceed the default limit of 1000 (index.mapping.total_fields.limit).
I read the documentation about this, and it warns against increasing that limit, as doing so can lead to issues and a mapping explosion.
On this board I found some info:
A flattened structure is not usable in our use case, as we need advanced filtering and querying. Splitting the index is also not easy, as we don't know in advance how many documents an object type will have, and creating a new index for only a handful of documents is overkill. Even for the biggest object types this would be overkill. Most of the time there are 1 or 2 large object types and several small ones, but which ones is not known beforehand.
I also read the blog post "Too many fields! 3 ways to prevent mapping explosion in Elasticsearch | Elastic Blog", but that does not help us either, as we actually need that many fields and cannot generate them on the fly.
I found another solution in this ticket: Limit of total fields exceeded - #17 by dadoonet
Currently we have implemented a proof-of-concept using nested fields:
- we create a nested field for each object type
- the nested field contains the properties of the object type
This seems to work: you would expect it to limit us to 1000 object types instead of 1000 attributes, and 1000 object types is not a real-world scenario for us.
However, this hits another limit: by default an index may contain only 50 distinct nested fields (index.mapping.nested_fields.limit).
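For reference, the proof-of-concept mapping has roughly the following shape. This is a minimal sketch, not our actual code: the function names and the example object types are made up for illustration, and the check only mirrors the default of 50 nested fields.

```python
DEFAULT_NESTED_FIELDS_LIMIT = 50  # Elasticsearch default for nested fields per index


def build_nested_mapping(object_types):
    """Create one nested field per object type, holding that type's attributes."""
    properties = {}
    for object_type, attributes in object_types.items():
        properties[object_type] = {
            "type": "nested",
            "properties": {
                name: {"type": field_type} for name, field_type in attributes.items()
            },
        }
    return {"properties": properties}


def count_nested_fields(mapping):
    """Count the top-level fields of type `nested` in the mapping."""
    return sum(
        1 for field in mapping["properties"].values() if field.get("type") == "nested"
    )


def exceeds_nested_limit(mapping, limit=DEFAULT_NESTED_FIELDS_LIMIT):
    """True when the mapping has more nested fields than the given limit."""
    return count_nested_fields(mapping) > limit


# Hypothetical example data: two object types.
example = {
    "invoice": {"number": "keyword", "total": "double"},
    "customer": {"name": "text", "number": "keyword"},
}
nested_mapping = build_nested_mapping(example)
```

A tenant with 100 object types would therefore need 100 nested fields, twice the default.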
Which of these would be the better option:
- store each property on the main mapping, and increase the total fields limit from 1,000 to something like 50,000
- increase the nested fields limit from 50 to something like 1,000
And what would be the performance consequences of each of these 2 solutions?
A typical tenant has about 100 object types, with 2,000 attributes in total and about 500,000 documents spread across these object types.
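To be explicit about what the two options would change, here is a minimal sketch of the settings involved. I am assuming these would be sent as the body of an update index settings request (PUT /<index>/_settings); the variable names and the values are only the examples from the list above.

```python
import json

# Option 1: keep all properties on the main mapping and raise the total fields limit.
raise_total_fields = {"index.mapping.total_fields.limit": 50000}

# Option 2: keep one nested field per object type and raise the nested fields limit.
raise_nested_fields = {"index.mapping.nested_fields.limit": 1000}

# Serialized request bodies for the two options.
option_1_body = json.dumps(raise_total_fields)
option_2_body = json.dumps(raise_nested_fields)
```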