Slow bulk indexing

Alexandr_Dorosh · February 15, 2016, 10:40am

Hello

I'm having indexing performance problems. I'm using Python for bulk operations;
a bulk of 1000 documents takes about 30 seconds.
The documents are quite small: about 15 fields, most of which are integers or short strings.
An indexing daemon runs on almost every ES node (in 5 to 15 threads), and each daemon connects to its local ES node. Besides indexing, the daemons delete old records using bulk deletes (also 1000 records per bulk).
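
For readers following along, here is a minimal sketch of what one such mixed bulk request body could look like. It is not the poster's actual daemon code. The function name, the index name, and the use of a domain as the routing value are assumptions made for illustration. The "positions" type and the required routing come from the mapping further down. It uses only the standard library and builds the newline-delimited body that the _bulk endpoint expects, with the ES 2.x style `_routing` metadata field.

```python
import json


def bulk_body(index, doc_type, docs, deletes=()):
    """Build a newline-delimited _bulk body (hypothetical helper).

    docs:    iterable of (doc_id, routing, source_dict) to index
    deletes: iterable of (doc_id, routing) to delete
    """
    lines = []
    for doc_id, routing, source in docs:
        # Action line, then the document source on the following line.
        meta = {"index": {"_index": index, "_type": doc_type,
                          "_id": doc_id, "_routing": routing}}
        lines.append(json.dumps(meta))
        lines.append(json.dumps(source))
    for doc_id, routing in deletes:
        # Delete actions have no source line.
        meta = {"delete": {"_index": index, "_type": doc_type,
                           "_id": doc_id, "_routing": routing}}
        lines.append(json.dumps(meta))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""
```

Timing how long the cluster takes to answer one such body, separately from how long the client takes to build it, helps tell server-side latency from client-side overhead.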

Each index has from 300 million to 1.5 billion records, divided into 10 shards (the largest index, with 1.5 billion records, has 20 shards), with 1 replica.

My cluster has 24 nodes, 27 indexes, 432 shards, 9 billion documents, and 6.5TB of data.
Elasticsearch version 2.2.0
Java: open-jdk (from 1.7.0_65 to 1.7.0_91)
Client library: elasticsearch 2.2.0 (latest)

Node details:
Ubuntu 12.04 or 14.04
CPU: Intel Xeon 8 cores
RAM: 32GB (ES_HEAP_SIZE=16g; 20g on several nodes)
SSD disks (2 disks per node, some of them in RAID1)

------------ index settings ---------------
{
  "index": {
    "creation_date": "1450432002298",
    "number_of_replicas": "1",
    "codec": "best_compression",
    "uuid": "riSNQJY-R5K8McxgkbXbCg",
    "ttl": {
      "disable_purge": "true"
    },
    "analysis": {
      "filter": {
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        }
      },
      "analyzer": {
        "english": {
          "type": "custom",
          "filter": [
            "lowercase",
            "english_morphology"
          ],
          "tokenizer": "standard"
        }
      }
    },
    "number_of_shards": "10",
    "refresh_interval": "30s",
    "version": {
      "created": "2010099"
    }
  }
}

---------- mapping ---------------
"positions": {
  "_routing": {
    "required": true
  },
  "_ttl": {
    "enabled": true,
    "default": 7776000000
  },
  "properties": {
    "dynamic": {
      "type": "short"
    },
    "position": {
      "type": "short"
    },
    "region_queries_count_wide": {
      "type": "integer"
    },
    "right_spell": {
      "index": "no",
      "doc_values": true,
      "type": "string"
    },
    "keyword": {
      "analyzer": "english",
      "type": "string"
    },
    "keyword_id": {
      "type": "integer"
    },
    "date": {
      "format": "strict_date_optional_time||epoch_millis",
      "type": "date"
    },
    "geo_names": {
      "index": "not_analyzed",
      "type": "string"
    },
    "cost": {
      "type": "float"
    },
    "url": {
      "index": "not_analyzed",
      "type": "string"
    },
    "region_queries_count": {
      "type": "integer"
    },
    "url_crc": {
      "type": "long"
    },
    "subdomain": {
      "index": "not_analyzed",
      "type": "string"
    },
    "concurrency": {
      "type": "short"
    },
    "domain": {
      "index": "not_analyzed",
      "type": "string"
    },
    "found_results": {
      "type": "long"
    },
    "types": {
      "index": "not_analyzed",
      "type": "string"
    }
  },
  "_all": {
    "enabled": false
  }
}

------------ elasticsearch.yml ------------
cluster.name: name
node.name: "es18"
node.master: false
node.data: true

path.data: /var/lib/elasticsearch,/home/elasticsearch
path.repo: ["/home/backupfs"]

http.port: 9200
http.host: "127.0.0.1"
network.bind_host: 0.0.0.0
network.publish_host: non_loopback:ipv4
transport.tcp.port: 9300
transport.tcp.compress: true
index.max_result_window: 60000
gateway.recover_after_nodes: 15
gateway.expected_nodes: 17
gateway.recover_after_time: 15m
bootstrap.mlockall: true
indices.recovery.max_bytes_per_sec: 150mb
indices.store.throttle.max_bytes_per_sec: 150mb
index.translog.flush_threshold_size: 500mb
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts:
  - es-gw1
  - es-gw2
  - es1
  - es2
  - es3

script.inline: on
script.indexed: on

threadpool.search.type: fixed
threadpool.search.size: 20
threadpool.search.queue_size: 100

threadpool.bulk.type: fixed
threadpool.bulk.size: 20
threadpool.bulk.queue_size: 300

threadpool.index.type: fixed
threadpool.index.size: 20
threadpool.index.queue_size: 100

indices.memory.index_buffer_size: 10%
indices.memory.min_shard_index_buffer_size: 12mb
indices.memory.min_index_buffer_size: 96mb

index.refresh_interval: 30s
index.translog.flush_threshold_ops: 50000

index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s

index.search.slowlog.threshold.fetch.warn: 2s
index.search.slowlog.threshold.fetch.info: 1s

What should I do to increase indexing speed?

Alexandr_Dorosh · February 15, 2016, 10:40am

------- iostat output from several nodes -------
[es8] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es8] out: 13.62 0.00 1.28 1.26 0.00 83.84

[es22] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es22] out: 6.28 0.00 0.82 18.77 0.00 74.13

[es16] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es16] out: 22.35 0.00 1.79 4.05 0.00 71.81

[es12] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es12] out: 25.47 0.00 1.78 4.40 0.00 68.35

[es15] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es15] out: 24.32 0.00 1.93 4.91 0.00 68.84

[es11] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es11] out: 23.94 0.00 1.65 3.04 0.00 71.37

[es2] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es2] out: 9.62 0.00 0.89 1.13 0.00 88.35

[es6] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es6] out: 20.14 0.00 1.28 1.55 0.00 77.03

[es1] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es1] out: 10.55 0.00 0.97 1.36 0.00 87.12

[es18] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es18] out: 16.11 0.00 0.98 5.82 0.00 77.10

[es19] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es19] out: 18.11 0.00 1.02 9.89 0.00 70.97

[es5] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es5] out: 7.82 0.00 0.72 1.47 0.00 89.99

[es3] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es3] out: 12.44 0.00 0.94 0.89 0.00 85.73

[es7] out: avg-cpu: %user %nice %system %iowait %steal %idle
[es7] out: 21.69 0.00 3.53 1.41 0.00 73.36

---------- load average -----------
[es14] out: 12:23:17 up 248 days, 18:54, 2 users, load average: 3.22, 2.95, 2.97
[es7] out: 12:22:37 up 314 days, 1:55, 3 users, load average: 3.07, 3.27, 3.24
[es12] out: 12:23:17 up 257 days, 2:25, 2 users, load average: 6.16, 5.66, 5.57
[es21] out: 11:23:17 up 110 days, 2:16, 2 users, load average: 2.11, 1.46, 1.09
[es20] out: 11:23:17 up 123 days, 59 min, 2 users, load average: 0.64, 0.47, 0.51
[es3] out: 12:23:17 up 462 days, 2:38, 3 users, load average: 1.21, 1.10, 1.11
[es16] out: 11:23:17 up 248 days, 1:59, 2 users, load average: 3.03, 3.86, 3.99
[es2] out: 12:23:17 up 530 days, 18:47, 2 users, load average: 0.83, 0.82, 0.95
[es5] out: 12:23:15 up 391 days, 1:47, 3 users, load average: 0.26, 0.39, 0.46
[es1] out: 12:23:17 up 487 days, 21:55, 2 users, load average: 1.59, 1.42, 1.31
[es8] out: 12:23:17 up 314 days, 2:04, 3 users, load average: 1.98, 1.66, 1.56
[es10] out: 12:23:17 up 391 days, 1:50, 2 users, load average: 0.18, 0.35, 0.39
[es11] out: 12:23:17 up 257 days, 2:19, 2 users, load average: 5.04, 4.67, 4.45
[es6] out: 12:23:17 up 337 days, 1:17, 2 users, load average: 0.65, 1.18, 1.32
[es13] out: 12:23:17 up 248 days, 19:11, 2 users, load average: 1.63, 1.67, 1.69
[es17] out: 12:23:17 up 248 days, 1:20, 2 users, load average: 6.46, 6.37, 6.51

sombut · February 18, 2016, 4:54am

Same issue here. Comparing index latency between Elasticsearch 2.2 and 1.7.4, 2.2 is more than 10x slower.

This parameter helps a lot in 2.2, but you risk losing data too.

index.translog.durability: async
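
Some background on why this setting matters: in Elasticsearch 2.x the default durability is `request`, which fsyncs the translog after every index, delete, update, and bulk request. With `async` the translog is fsynced on an interval instead, so writes acknowledged since the last sync can be lost if a node crashes. The sketch below shows one way the setting might be applied to an existing index through the update-settings endpoint, using only the Python standard library. The function name and the index name `my_index` are assumptions for illustration. The address matches the http.host and http.port values posted above.

```python
import json
import urllib.request


def translog_durability_request(base_url, index, durability="async"):
    """Build (but do not send) a PUT <index>/_settings request.

    Hypothetical helper; 'request' and 'async' are the two documented values.
    """
    if durability not in ("request", "async"):
        raise ValueError("durability must be 'request' or 'async'")
    body = json.dumps({"index.translog.durability": durability})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # 'my_index' is a placeholder; the thread does not name the indexes.
    req = translog_durability_request("http://127.0.0.1:9200", "my_index")
    print(req.get_method(), req.full_url)
    # To actually apply it: urllib.request.urlopen(req)
```

Because the change is per index, it can be tried on a single index first and reverted by sending `request` again.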

Alexandr_Dorosh · February 18, 2016, 9:15am

Thank you for your reply.

Now I'm trying to increase indices.memory.index_buffer_size.
If it doesn't help, I'll try this option.
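
To get a feel for what raising indices.memory.index_buffer_size would buy, here is a rough back-of-the-envelope sketch. It assumes the 2.x behaviour, as I understand it, where the node-wide indexing buffer is split evenly among the shards that are actively indexing and clamped per shard. It takes the 12mb per-shard minimum and the 96mb total minimum from the posted elasticsearch.yml. The 512mb per-shard maximum is assumed to be the default. The figure of 18 active shards per node (432 shards spread evenly over 24 nodes, all indexing at once) is also an assumption, not something stated in the thread.

```python
def index_buffer_mb(heap_gb, buffer_percent, active_shards,
                    min_total_mb=96, min_shard_mb=12, max_shard_mb=512):
    """Estimate the total and per-shard indexing buffer in MB.

    Hypothetical helper: models an even split among active shards,
    clamped to [min_shard_mb, max_shard_mb].
    """
    if active_shards < 1:
        raise ValueError("active_shards must be at least 1")
    # The node-wide buffer is a percentage of the JVM heap, with a floor.
    total_mb = max(min_total_mb, heap_gb * 1024 * buffer_percent / 100.0)
    # Even share per active shard, then apply the per-shard bounds.
    per_shard_mb = total_mb / active_shards
    per_shard_mb = max(min_shard_mb, min(max_shard_mb, per_shard_mb))
    return total_mb, per_shard_mb


if __name__ == "__main__":
    for percent in (10, 20, 30):
        total, per_shard = index_buffer_mb(16, percent, active_shards=18)
        print(f"{percent}%: total {total:.1f} MB, per shard {per_shard:.1f} MB")
```

Under these assumptions the current 10% of a 16g heap is already well above the per-shard minimum, so it is worth measuring before and after the change rather than expecting a large gain.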
