Saving data to ES 5.4 blocked when ES is creating mapping and updating mapping

My workflow is:

  1. Reading data from kafka
  2. Saving data into Elastic Search 5.4

I checked the time spent in every method of my application. The following picture shows the normal workflow: most of the time is spent reading data from Kafka.


The following picture shows the abnormal workflow: all the time is spent saving data into ES 5.4, which apparently means ES failed to respond.
image
I checked the ES log and found that ES was creating and updating mappings at the time the application failed to save data to ES.
image

Updating a mapping requires the cluster state to be updated and then propagated, which will affect throughput. While this goes on, you cannot index into that index. As the cluster state grows, this may get slower and slower. You seem to have a very large number of dynamically mapped fields, which could become problematic over time.

Thank you for answering my question. So how can I fix this problem? My current default template is as follows:
{
  "defaulttemplate" : {
    "order" : 0,
    "template" : "*",
    "settings" : {
      "index" : {
        "translog" : {
          "flush_threshold_size" : "1024mb",
          "sync_interval" : "10s",
          "durability" : "async"
        },
        "number_of_replicas" : "0",
        "refresh_interval" : "10s"
      }
    },
    "mappings" : {
      "default" : {
        "_source" : {
          "enabled" : "false"
        },
        "_all" : {
          "enabled" : "false"
        }
      }
    },
    "aliases" : { }
  }
}

One option would be to provide the mappings you expect through an index template ahead of time. Updating mappings one by one as new fields come along is inefficient.
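As a rough sketch of that approach (the field names and types below are placeholders, not your actual schema — substitute the fields you expect), an ES 5.x index template with pre-declared mappings looks something like this; the `_default_` mapping makes the fields apply to every type in matching indices:

```json
PUT _template/defaulttemplate
{
  "order" : 0,
  "template" : "*",
  "mappings" : {
    "_default_" : {
      "properties" : {
        "timestamp" : { "type" : "long" },
        "channel"   : { "type" : "keyword" },
        "value"     : { "type" : "float" }
      }
    }
  }
}
```

With the fields declared up front, indexing a document no longer triggers a cluster-state update for each newly seen field.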

How many fields do you have in your mappings?

Hi Christian,

Let me briefly describe our problem.

We are upgrading from ES 1.x to an ES 5.4 cluster. It is true that we have a lot of dynamic mappings, but we didn't run into this problem on ES 1.x, which was a single node rather than a cluster. However, as far as I know, the speed of creating and updating mappings is mostly affected by the number of shards, and we set the number of shards to 5 in both cases. Will the number of nodes slow things down significantly?

By the way, we have about 20 fields.

The mappings are kept in the cluster state, so I don't think the number of shards has an impact. The updated mappings do, however, need to be propagated to the nodes in the cluster, so an increased number of nodes will make this take longer. From Elasticsearch 2.x onward, deltas of the cluster state are sent, which is more efficient than the method used in 1.x.

Based on the screenshot it looks like you have more than 20 fields. Can you retrieve the mapping for that index and check?

{
  "2018_04_21" : {
    "mappings" : {
      "1_40179" : {
        "_all" : {
          "enabled" : false
        },
        "_source" : {
          "enabled" : false
        },
        "properties" : {
          "channel" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "event_id" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "is_new" : {
            "type" : "long"
          },
          "is_new_player" : {
            "type" : "long"
          },
          "os_type" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "param_type" : {
            "type" : "long"
          },
          "room_name" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "server" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "timestamp" : {
            "type" : "long"
          },
          "user_id" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "value" : {
            "type" : "float"
          }
        }
      }
    }
  }
}

Above is one of our mappings. I know the text type slows things down, so I am considering using the template below to make some improvement.

"dynamic_templates" : [
  {
    "string_to_keyword" : {
      "match_mapping_type" : "string",
      "mapping" : {
        "type" : "keyword"
      }
    }
  }
]
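For context, this snippet is a fragment that would sit inside the mappings section of the index template — roughly like the sketch below (using `_default_` here, an assumption on my part, so it applies to every type in matching indices):

```json
"mappings" : {
  "_default_" : {
    "dynamic_templates" : [
      {
        "string_to_keyword" : {
          "match_mapping_type" : "string",
          "mapping" : { "type" : "keyword" }
        }
      }
    ]
  }
}
```

This maps every dynamically detected string field as a single `keyword` field instead of a `text` field with a `keyword` sub-field, which halves the number of fields created per string.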

Can you give us any suggestions about the new template?
Thanks in advance!

What do the mappings for the 2018_04_26 index (seen in the screenshot) look like? Do you by any chance have a lot of types in your indices?

Yep, we do have a lot of types in one index and all of them share the same mapping structure.

Then I guess that is the problem, as each new type also requires the mappings to be updated. You should move away from using multiple types, as types are being removed: in version 6.x it is already not possible to create new indices with more than one type, and the goal is to remove the concept of types completely over time. You can instead create a new field called type and store the type name there, which will allow you to filter on it.
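A minimal sketch of that approach (the single mapping type `doc` and the sample values are hypothetical): each document carries its former type as a regular field, mapped as `keyword` so it can be filtered with an exact-match term query.

```json
PUT 2018_04_21/doc/1
{
  "type" : "1_40179",
  "timestamp" : 1524268800,
  "value" : 1.5
}

GET 2018_04_21/doc/_search
{
  "query" : {
    "term" : { "type" : "1_40179" }
  }
}
```

Because all documents now share one mapping type, adding a new logical "type" is just a new field value and no longer triggers a mapping update.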

Once you have done this, you can add your fields with the correct mapping to an index template, and that should remove the issues you have been seeing.

Thanks for your patience! We will try it later.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.