Performance problem when adding a new object to a nested array

Hi Everyone,

I have a performance issue when I add an object to a nested array field. While the array holds roughly 0-500 objects, performance is fine, but as I add more, it degrades quickly. Eventually, a bulk request that updates 100 documents, adding one new object to the nested array field of each, can take 10-40 seconds. I couldn't find a solution to this problem. If someone knows how to solve it, I would be glad of any help.

My mapping:
dynamic_templates: [
  ...
  {
    activity: {
      match_mapping_type: '*',
      match: 'activity',
      mapping: {
        type: 'nested',
        properties: {
          activityId: {
            type: 'keyword',
          },
          filterType: {
            type: 'keyword',
          },
          category: {
            type: 'keyword',
          },
          createdAt: {
            type: 'date',
          },
          text: {
            type: 'text',
            analyzer: 'custom_english_analyzer',
          }
        }
      }
    }
  }
  ...
]

My bulk operation:
{"update": { "_index": "my_index", "_id": 1111111 }}
{"script": {"source": "if(ctx._source.activity != null) { ctx._source.activity.removeIf(activity -> activity.activityId == params.activity.activityId) } else { ctx._source.activity = new ArrayList() } ctx._source.activity.add(params.activity)", "params": {"activity": {"activityId": "5dd2d60dd7ef16348aec8d9e","filterType": "FILTER_TYPE","category": "CATEGORY","action": "ACTION","createdAt": "2019-01-01T00:00:00.000Z","text": "Searchable text"}}}}
...

Each nested object within the document is indexed as a separate document in Elasticsearch, and all of them need to be reindexed whenever a new nested object is added, which means that updates get more and more expensive as the array grows. This is one of the main drawbacks of nested documents. If you are adding a lot of objects, you may want to model your data differently.
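One possible way to model the data differently is to store each activity as its own top-level document instead of as an element of a nested array. The sketch below only builds the bulk request lines and does not talk to a cluster. The index name `my_activity_index`, the `parentId` field, and the function name are assumptions made for illustration; they do not come from the original post.

```python
import json


def build_activity_bulk_lines(index, parent_id, activity):
    """Return the two NDJSON lines that index one activity as its own document.

    Using activityId as the document _id means that re-sending the same
    activity replaces the earlier copy, which plays the role of the
    removeIf(...) + add(...) pair in the original update script.
    """
    action = {"index": {"_index": index, "_id": activity["activityId"]}}
    # parentId is a hypothetical field linking the activity back to its owner.
    source = dict(activity, parentId=parent_id)
    return [json.dumps(action), json.dumps(source)]


if __name__ == "__main__":
    activity = {
        "activityId": "5dd2d60dd7ef16348aec8d9e",
        "filterType": "FILTER_TYPE",
        "category": "CATEGORY",
        "createdAt": "2019-01-01T00:00:00.000Z",
        "text": "Searchable text",
    }
    # Hypothetical index name; the body of a bulk request is these lines
    # joined by newlines, with a trailing newline.
    for line in build_activity_bulk_lines("my_activity_index", 1111111, activity):
        print(line)
```

With this layout, adding an activity writes one small document and leaves the existing ones untouched. The trade-off is that the parent's fields and the activity's fields no longer live in the same document, so whether this fits depends on how you query the data.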

I see the same issue even when the array is not mapped as a nested field. If the array is just a regular field in the main document, Elasticsearch only reindexes the main document, am I right?

Frequent updates to the same document can also cause poor performance, as they may lead to a lot of small refreshes, which can be quite expensive. Every time you update a document whose previous change has not yet been refreshed, a new refresh is triggered.
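If several new activities for the same document tend to arrive close together, one way to reduce repeated updates is to group them on the client so that each document is updated at most once per bulk request. The following is a minimal sketch under that assumption. The function name, the `activities` script parameter, and the modified Painless script are illustrative and have not been tested against a cluster.

```python
import json

# Variant of the original script that accepts a list of activities
# (params.activities is a hypothetical parameter name).
UPSERT_SCRIPT = (
    "if (ctx._source.activity == null) { ctx._source.activity = new ArrayList() } "
    "def ids = new HashSet(); "
    "for (def a : params.activities) { ids.add(a.activityId) } "
    "ctx._source.activity.removeIf(x -> ids.contains(x.activityId)); "
    "ctx._source.activity.addAll(params.activities)"
)


def build_grouped_bulk_lines(index, pending):
    """Build NDJSON bulk lines with a single update per document.

    `pending` is an iterable of (doc_id, activity) pairs. If the same
    activityId appears twice for one document, the last one wins.
    """
    grouped = {}
    for doc_id, activity in pending:
        grouped.setdefault(doc_id, {})[activity["activityId"]] = activity

    lines = []
    for doc_id, by_activity_id in grouped.items():
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps({
            "script": {
                "source": UPSERT_SCRIPT,
                "params": {"activities": list(by_activity_id.values())},
            }
        }))
    return lines


if __name__ == "__main__":
    pending = [
        (1111111, {"activityId": "a1", "text": "first"}),
        (1111111, {"activityId": "a2", "text": "second"}),
        (2222222, {"activityId": "b1", "text": "other document"}),
    ]
    print("\n".join(build_grouped_bulk_lines("my_index", pending)))
```

Here three pending activities turn into two update operations, one per document, so the same document is not touched twice within one request.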
