Elasticsearch 6.0 bulk writes are slower in a cluster but fast in a single-node setup

We have a 5-node production setup (1 coordinator, 1 master, 3 data nodes). We have set up a Kafka sink to dump messages to an ES index in this cluster. The sink URL points to the coordinator node. Below is the machine configuration of this cluster:

Coordinator (2 cores, 7 GB RAM)
Master (2 cores, 7 GB RAM)
1 Data node (8 cores, 32 GB RAM)
2 Data nodes (2 cores, 7 GB RAM)

Similarly, another sink is set up to dump messages to an ES index on a single-node Elasticsearch (staging setup). This single node has 8 cores and 28 GB RAM.

We noticed a weird behavior: the dump to the production cluster is lagging 8 hours behind that of the staging node.

Below are the settings and mapping for the index. We use a routing key while building the bulk insert data (a sketch of one such bulk request is shown after the mapping).

{
	"settings": {
		"index": {
			"number_of_shards": 260,
			"number_of_replicas": 1,
			"sort.field": ["csf", "ds", "tag", "ts"],
			"sort.order": ["asc", "asc", "asc", "asc"]
		}
	},
	"mappings": {
		"data_type": {
			"properties": {
				"csf": {
					"type": "keyword"
				},
				"ds": {
					"type": "keyword"
				},
				"tag": {
					"type": "keyword"
				},
				"ts": {
					"type": "date",
					"format": "yyMMddHHmmssSSSSSS"
				},
				"value": {
					"type": "double"
				}
			}
		}
	}
}
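For reference, here is a minimal sketch of the kind of bulk request our sink builds (the index name "data_index", the routing value "TYPE_A" and the field values are made-up examples, not our real data):

POST /data_index/data_type/_bulk
{ "index": { "_routing": "TYPE_A" } }
{ "csf": "TYPE_A", "ds": "stream_1", "tag": "pressure", "ts": "180101120000000000", "value": 42.5 }
{ "index": { "_routing": "TYPE_A" } }
{ "csf": "TYPE_A", "ds": "stream_2", "tag": "temperature", "ts": "180101120001000000", "value": 21.3 }

Every document in a bulk request carries the routing value of its type, so all documents of one type end up in the same shard.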

Can somebody help us figure out why the bulk inserts into the cluster are slow while those into the single node are fast?

Elasticsearch assumes all data nodes are equal, so the fact that you have one large node and two small ones means that the smaller ones will work much harder than the larger one and limit performance.

The main point of having a cluster is usually to avoid single points of failure in order to increase stability, availability and resiliency. A production setup with a single dedicated master node is therefore not recommended. Just because you can have specialised node types does not mean that you should. I would recommend that you instead set up a basic 3-node cluster where all nodes are on the same type of hardware, are all master-eligible and hold data. Make sure that you set minimum_master_nodes to 2 to avoid split-brain scenarios.
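As a rough sketch, assuming all three nodes keep the default node.master and node.data roles, the only setting that needs to change is minimum_master_nodes, which in 6.x can be set dynamically via the cluster settings API (or statically in each node's elasticsearch.yml):

PUT _cluster/settings
{
  "persistent": {
    "discovery.zen.minimum_master_nodes": 2
  }
}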

How did you arrive at 260 shards? That is a lot of shards to index into, which may lead to bad indexing performance and possibly also bulk rejections.

If you are not updating your data, you may want to consider using time-based indices instead, as this generally is more efficient. You may also want to read this blog post on shards and sharding practices.
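As an illustration of time-based indices (the template name, index pattern and shard count below are only placeholders), you would typically register an index template and then write to a new index per day or week:

PUT _template/data_template
{
  "index_patterns": ["data-*"],
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}

With this in place, the sink would index into e.g. data-2018.01.01, and old data can be dropped cheaply by deleting whole indices instead of individual documents.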


Thanks @Christian_Dahlqvist for the reply. We shall definitely set up a basic 3-node cluster as you suggested and see if that solves this problem.

About the shards: we have a large amount of data streaming into the system each minute. The data can be classified by type, which we know before creating the index (there are 260 possible types of data). So we decided to create the index with that many shards and provide routing so that the data for each type is stored in a single shard.

If this approach causes problems, we shall set up time-based indices as you suggested.

It does not sound to me like that is an appropriate use of routing, especially if it leads to that many shards being written to concurrently. Routing does not mean that each routing key gets its own shard, just that all data related to a specific routing key resides in a single shard within the index.
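To be concrete, with the default settings Elasticsearch picks the shard for a document roughly as

shard_num = hash(_routing) % number_of_shards

so with 260 shards and 260 routing keys, several keys will typically hash to the same shard while other shards receive none. Routing gives you locality per key, not one dedicated shard per key.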

We have set up a basic 3-node cluster and used the default 5 primary shards and 1 replica as you suggested. This solved the problem and inserts are faster now. Thanks @Christian_Dahlqvist for helping out.
