Performance issue due to dynamic mapping


(Beatriz) #1

Hello everyone.
I have recently started working with Elasticsearch and I am having trouble with mappings and indices, which is making indexing slow.

I am trying to load 1,500,000 documents (about 30 GB) into Amazon Elasticsearch Service. The index I created may not be mapped correctly, and indexing keeps getting slower: it currently takes 5 seconds to index a single document, with only 4,000 documents loaded.

I think the main cause is the indexing Elasticsearch does whenever a new document is uploaded, since a lot of data gets indexed that does not need to be. I have tried setting dynamic mapping to false and specifying only the fields we need, but some fields are likely to change and, in my opinion, should stay dynamic.

Here are my current mappings:

    {
    	"mappings": {
    		"person": {
    			"dynamic": false,
    			"properties": {
    				"name": {
    					"type": "text",
    					"fields": {
    						"keyword": {
    							"type": "keyword",
    							"ignore_above": 256
    						}
    					}
    				},
    				"audience_interests": {
    					"properties": {
    						"Activewear": {
    							"type": "float"
    						},
    						"Art & Design": {
    							"type": "float"
    						},
    						"Beauty & Cosmetics": {
    							"type": "float"
    						},
    						"Beer, Wine & Spirits": {
    							"type": "float"
    						},
    						"Business & Careers": {
    							"type": "float"
    						},
    						"Camera & Photography": {
    							"type": "float"
    						},
    						"Cars & Motorbikes": {
    							"type": "float"
    						},
    						"Clothes, Shoes, Handbags & Accessories": {
    							"type": "float"
    						},
    						"Coffee, Tea & Beverages": {
    							"type": "float"
    						},
    						"Electronics & Computers": {
    							"type": "float"
    						},
    						"Fitness & Yoga": {
    							"type": "float"
    						},
    						"Friends, Family & Relationships": {
    							"type": "float"
    						},
    						"Gaming": {
    							"type": "float"
    						},
    						"Healthcare & Medicine": {
    							"type": "float"
    						},
    						"Healthy Lifestyle": {
    							"type": "float"
    						},
    						"Home Decor, Furniture & Garden": {
    							"type": "float"
    						},
    						"Jewellery & Watches": {
    							"type": "float"
    						},
    						"Luxury Goods": {
    							"type": "float"
    						},
    						"Music": {
    							"type": "float"
    						},
    						"Pets": {
    							"type": "float"
    						},
    						"Restaurants, Food & Grocery": {
    							"type": "float"
    						},
    						"Shopping & Retail": {
    							"type": "float"
    						},
    						"Sports": {
    							"type": "float"
    						},
    						"Television & Film": {
    							"type": "float"
    						},
    						"Tobacco & Smoking": {
    							"type": "float"
    						},
    						"Toys, Children & Baby": {
    							"type": "float"
    						},
    						"Travel, Tourism & Aviation": {
    							"type": "float"
    						},
    						"Wedding": {
    							"type": "float"
    						},
    						"Fashion & Style": {
    							"type": "float"
    						}
    					}
    				},
    				"audienceData": {
    					"properties": {
    						"cities": {
    							"type": "nested",
    							"properties": {
    								"cityPercentage": {
    									"type": "float"
    								},
    								"count": {
    									"type": "long"
    								},
    								"country": {
    									"type": "text",
    									"fields": {
    										"keyword": {
    											"type": "keyword",
    											"ignore_above": 256
    										}
    									}
    								},
    								"id": {
    									"type": "long"
    								},
    								"name": {
    									"type": "text",
    									"fields": {
    										"keyword": {
    											"type": "keyword",
    											"ignore_above": 256
    										}
    									}
    								},
    								"cityName": {
    									"type": "text",
    									"fields": {
    										"keyword": {
    											"type": "keyword",
    											"ignore_above": 256
    										}
    									}
    								},
    								"percentage": {
    									"type": "float"
    								},
    								"state": {
    									"type": "text",
    									"fields": {
    										"keyword": {
    											"type": "keyword",
    											"ignore_above": 256
    										}
    									}
    								}
    							}
    						},
    						"countries": {
    							"type": "nested",
    							"properties": {
    								"count": {
    									"type": "long"
    								},
    								"countryCode": {
    									"type": "text",
    									"fields": {
    										"keyword": {
    											"type": "keyword",
    											"ignore_above": 256
    										}
    									}
    								},
    								"countryName": {
    									"type": "text",
    									"fields": {
    										"keyword": {
    											"type": "keyword",
    											"ignore_above": 256
    										}
    									}
    								},
    								"countryPercentage": {
    									"type": "float"
    								},
    								"percentage": {
    									"type": "float"
    								}
    							}
    						}
    					}
    				}
    			}
    		}
    	}
    }

The fields under audience_interests may change easily, and I need to index them in order to search on them. I have read that the mapping of an existing field cannot be modified once it is created; only new properties can be added.

To work around this, I let Elasticsearch manage all mappings, but I think dynamically mapping every field (even the ones I don't need to index) is the root of the performance problem, as adding new data gets slower and slower. Is there a way to make only this property dynamic and ignore all other unspecified fields?
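
To illustrate what I am asking for, here is a minimal sketch of the mapping I have in mind. It assumes the dynamic setting can be overridden on an inner object, which is how I read the mapping documentation (inner objects inherit the setting from their parent unless they set their own); I have not verified this on my cluster. Only one interest field is shown, the rest would stay as listed above:

    {
    	"mappings": {
    		"person": {
    			"dynamic": false,
    			"properties": {
    				"name": {
    					"type": "text",
    					"fields": {
    						"keyword": {
    							"type": "keyword",
    							"ignore_above": 256
    						}
    					}
    				},
    				"audience_interests": {
    					"dynamic": true,
    					"properties": {
    						"Activewear": {
    							"type": "float"
    						}
    					}
    				}
    			}
    		}
    	}
    }

If that reading is correct, new fields under audience_interests would be mapped automatically, while any other unspecified field would be kept in _source but not indexed.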

Is there any other solution I'm missing?

Thanks in advance.

