Bulk Processor taking too long

I'm using the bulk API to index my docs and it is taking longer than expected. My bulk processor client is set up as follows:

BulkProcessor.Builder bulkProcessorBuilder = BulkProcessor.builder(client::bulkAsync, listener);
bulkProcessorBuilder.setFlushInterval(TimeValue.timeValueSeconds(10))
        .setBulkActions(1000)
        .setBulkSize(new ByteSizeValue(5, ByteSizeUnit.MB))
        .setConcurrentRequests(20);

So for a 5 MB bulk request (which is about 45 docs) it is taking about 1s. My index settings are default, and I disable refresh while indexing is in progress. Any thoughts would be appreciated.
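For reference, the listener above is just a plain BulkProcessor.Listener; a minimal sketch of one that logs each bulk's size and the server-reported took time (my own illustration, the logging is assumed) would be:

import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;

// Illustrative listener: logs each bulk's size before it is sent, and the
// server-reported "took" time and failure flag once it completes.
BulkProcessor.Listener listener = new BulkProcessor.Listener() {
    @Override
    public void beforeBulk(long executionId, BulkRequest request) {
        System.out.printf("bulk %d: %d actions, ~%d bytes%n",
                executionId, request.numberOfActions(), request.estimatedSizeInBytes());
    }

    @Override
    public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
        System.out.printf("bulk %d: took %s, failures=%s%n",
                executionId, response.getTook(), response.hasFailures());
    }

    @Override
    public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
        System.err.println("bulk " + executionId + " failed: " + failure);
    }
};

Comparing the took value (time spent inside Elasticsearch) with the wall-clock time per request makes it easier to tell whether the second is spent in the cluster or in the client.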

Why do you think it's too long?

It depends a lot on the documents, the mapping, the hardware...

What if you send more than 45 documents?

How many parallel indexing threads/processes do you have?

If you mean the setConcurrentRequests() setting on the bulk processor, I tried 5, 10, and 15 with no difference.

Increasing the bulk size to allow more documents per bulk request doesn't make much of a difference either. When you say it depends on the mapping, what kind of things in a mapping would affect performance? Would having too many keyword fields be a concern?

What is the specification of your Elasticsearch cluster? How many nodes? What type of hardware and storage?

What does CPU usage and disk I/O look like in the cluster while indexing? Are you using dynamic mappings? If so, are new fields frequently mapped?

I'm running ES in a Docker container on my localhost.
My cluster/node settings:

{
  "_nodes": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "cluster_name": "docker-cluster",
  "nodes": {
    "OY2gXh4qQZOFIMiDOcu9Bw": {
      "name": "OY2gXh4",
      "transport_address": "172.17.0.2:9300",
      "host": "172.17.0.2",
      "ip": "172.17.0.2",
      "version": "6.2.4",
      "build_hash": "ccec39f",
      "total_indexing_buffer": 103887667,
      "roles": [
        "master",
        "data",
        "ingest"
      ],
      "settings": {
        "cluster": {
          "name": "docker-cluster"
        },
        "node": {
          "name": "OY2gXh4"
        },
        "path": {
          "logs": "/usr/share/elasticsearch/logs",
          "home": "/usr/share/elasticsearch"
        },
        "discovery": {
          "type": "single-node",
          "zen": {
            "minimum_master_nodes": "1"
          }
        },
        "client": {
          "type": "node"
        },
        "http": {
          "type": {
            "default": "netty4"
          }
        },
        "transport": {
          "type": {
            "default": "netty4"
          }
        },
        "network": {
          "host": "0.0.0.0"
        }
      },
      "os": {
        "refresh_interval_in_millis": 1000,
        "name": "Linux",
        "arch": "amd64",
        "version": "4.9.12-moby",
        "available_processors": 4,
        "allocated_processors": 4
      },
      "process": {
        "refresh_interval_in_millis": 1000,
        "id": 1,
        "mlockall": false
      },
      "jvm": {
        "pid": 1,
        "version": "1.8.0_161",
        "vm_name": "OpenJDK 64-Bit Server VM",
        "vm_version": "25.161-b14",
        "vm_vendor": "Oracle Corporation",
        "start_time_in_millis": 1524848868469,
        "mem": {
          "heap_init_in_bytes": 1073741824,
          "heap_max_in_bytes": 1038876672,
          "non_heap_init_in_bytes": 2555904,
          "non_heap_max_in_bytes": 0,
          "direct_max_in_bytes": 1038876672
        },
        "gc_collectors": [
          "ParNew",
          "ConcurrentMarkSweep"
        ],
        "memory_pools": [
          "Code Cache",
          "Metaspace",
          "Compressed Class Space",
          "Par Eden Space",
          "Par Survivor Space",
          "CMS Old Gen"
        ],
        "using_compressed_ordinary_object_pointers": "true",
        "input_arguments": [
          "-Xms1g",
          "-Xmx1g",
          "-XX:+UseConcMarkSweepGC",
          "-XX:CMSInitiatingOccupancyFraction=75",
          "-XX:+UseCMSInitiatingOccupancyOnly",
          "-XX:+AlwaysPreTouch",
          "-Xss1m",
          "-Djava.awt.headless=true",
          "-Dfile.encoding=UTF-8",
          "-Djna.nosys=true",
          "-XX:-OmitStackTraceInFastThrow",
          "-Dio.netty.noUnsafe=true",
          "-Dio.netty.noKeySetOptimization=true",
          "-Dio.netty.recycler.maxCapacityPerThread=0",
          "-Dlog4j.shutdownHookEnabled=false",
          "-Dlog4j2.disable.jmx=true",
          "-Djava.io.tmpdir=/tmp/elasticsearch.aDuysB5f",
          "-XX:+HeapDumpOnOutOfMemoryError",
          "-XX:+PrintGCDetails",
          "-XX:+PrintGCDateStamps",
          "-XX:+PrintTenuringDistribution",
          "-XX:+PrintGCApplicationStoppedTime",
          "-Xloggc:logs/gc.log",
          "-XX:+UseGCLogFileRotation",
          "-XX:NumberOfGCLogFiles=32",
          "-XX:GCLogFileSize=64m",
          "-Des.cgroups.hierarchy.override=/",
          "-Des.path.home=/usr/share/elasticsearch",
          "-Des.path.conf=/usr/share/elasticsearch/config"
        ]
      },
      "thread_pool": {
        "force_merge": {
          "type": "fixed",
          "min": 1,
          "max": 1,
          "queue_size": -1
        },
        "fetch_shard_started": {
          "type": "scaling",
          "min": 1,
          "max": 8,
          "keep_alive": "5m",
          "queue_size": -1
        },
        "listener": {
          "type": "fixed",
          "min": 2,
          "max": 2,
          "queue_size": -1
        },
        "index": {
          "type": "fixed",
          "min": 4,
          "max": 4,
          "queue_size": 200
        },
        "refresh": {
          "type": "scaling",
          "min": 1,
          "max": 2,
          "keep_alive": "5m",
          "queue_size": -1
        },
        "generic": {
          "type": "scaling",
          "min": 4,
          "max": 128,
          "keep_alive": "30s",
          "queue_size": -1
        },
        "warmer": {
          "type": "scaling",
          "min": 1,
          "max": 2,
          "keep_alive": "5m",
          "queue_size": -1
        },
        "search": {
          "type": "fixed_auto_queue_size",
          "min": 7,
          "max": 7,
          "queue_size": 1000
        },
        "flush": {
          "type": "scaling",
          "min": 1,
          "max": 2,
          "keep_alive": "5m",
          "queue_size": -1
        },
        "fetch_shard_store": {
          "type": "scaling",
          "min": 1,
          "max": 8,
          "keep_alive": "5m",
          "queue_size": -1
        },
        "management": {
          "type": "scaling",
          "min": 1,
          "max": 5,
          "keep_alive": "5m",
          "queue_size": -1
        },
        "get": {
          "type": "fixed",
          "min": 4,
          "max": 4,
          "queue_size": 1000
        },
        "bulk": {
          "type": "fixed",
          "min": 4,
          "max": 4,
          "queue_size": 200
        },
        "snapshot": {
          "type": "scaling",
          "min": 1,
          "max": 2,
          "keep_alive": "5m",
          "queue_size": -1
        }
      },
      "transport": {
        "bound_address": [
          "[::]:9300"
        ],
        "publish_address": "172.17.0.2:9300",
        "profiles": {}
      },
      "http": {
        "bound_address": [
          "[::]:9200"
        ],
        "publish_address": "172.17.0.2:9200",
        "max_content_length_in_bytes": 104857600
      }
    }
  }
}

And regarding the CPU and disk I/O, should I be looking at GET /_nodes/stats/process to get those metrics?

Look at top and iostat -x. Given that you have a single node with a 1 GB heap, are you seeing any messages about frequent or long GC in the Elasticsearch logs?
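If you also want those numbers over the API, a rough sketch using the low-level REST client (assuming it is on your classpath next to the high-level client, that the container publishes HTTP on localhost:9200, and with a made-up class name) would be:

import java.io.IOException;
import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class NodeStatsDump {
    public static void main(String[] args) throws IOException {
        // Host/port are assumed; point this at wherever Docker publishes port 9200.
        try (RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
            // Fetch only the os, process and jvm sections of the node stats and print the raw JSON.
            Response response = restClient.performRequest("GET", "/_nodes/stats/os,process,jvm");
            System.out.println(EntityUtils.toString(response.getEntity()));
        }
    }
}

The os, process and jvm sections of that response contain the CPU percentages and heap usage that Kibana shows, so you can compare them directly with what top reports.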

OK, I kicked off another upload of around 89,000 docs and it took about 20 minutes. I monitored CPU usage directly in the terminal using the top command, and as the upload progressed the CPU usage was around 75-90%, whereas when I ran GET /_nodes/stats/process in Kibana, the CPU percent showed around 50% with a few spikes up to 70% (not sure why there is a difference between the terminal and Kibana).

The JVM heap used percent was around 50-70%.
