We have a microservice application that reads data from Kafka and imports it into Elasticsearch using BulkImport operations.
- The microservice application runs in Docker via docker-compose and is scaled out to run in parallel, with multiple threads.
- Elasticsearch (2.4.1) also runs in Docker via docker-compose, with the following configuration: 1 master node with 4GB Java heap, 1 client node with 4GB Java heap, 5 data nodes with 8GB Java heap, 24 shards, 1 index (~7.89GB).
- The VM has 256GB of RAM, 24 CPUs (24 cores), and 500GB of disk space (ext4).
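To make clear what a BulkImport request looks like on the wire, here is a minimal sketch of a bulk body for Elasticsearch 2.x: newline-delimited JSON with one action line and one source line per document. It is an illustration only, not the application's actual code; `build_bulk_body`, the index name, and the type name are hypothetical.

```python
import json


def build_bulk_body(docs, index, doc_type):
    """Build a newline-delimited body for POST /_bulk.

    docs is an iterable of (doc_id, source_dict) pairs.
    Every line, including the last one, must end with a newline.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch where to index the next line.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    return "".join(line + "\n" for line in lines)


# Hypothetical sample documents, index name, and type name.
sample = [("1", {"field": "a"}), ("2", {"field": "b"})]
body = build_bulk_body(sample, "my_index", "my_type")
print(body, end="")
```

Each bulk request carries a batch of such pairs, so the number of documents per batch and the number of concurrent requests are the two knobs on the client side.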
We noticed that the application waits 20s between some BulkImport operations before continuing with the others. As a result, importing a full index of 7.49GB (6.2 million hits) takes 4h40m, which is not what we expected.
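To put those numbers in perspective, the average throughput can be worked out as follows. This is a small sketch using only the figures above, and it assumes 1GB means 10^9 bytes.

```python
# Figures taken from the question; GB is assumed to mean 10**9 bytes.
TOTAL_DOCS = 6_200_000
TOTAL_BYTES = 7.49e9
ELAPSED_SECONDS = 4 * 3600 + 40 * 60  # 4h40m = 16800 seconds


def throughput(total_docs, total_bytes, elapsed_seconds):
    """Return (documents per second, megabytes per second)."""
    docs_per_second = total_docs / elapsed_seconds
    megabytes_per_second = total_bytes / 1e6 / elapsed_seconds
    return docs_per_second, megabytes_per_second


docs_per_s, mb_per_s = throughput(TOTAL_DOCS, TOTAL_BYTES, ELAPSED_SECONDS)
print(f"{docs_per_s:.0f} docs/s, {mb_per_s:.2f} MB/s")
# prints "369 docs/s, 0.45 MB/s"
```

That is roughly 369 documents per second, or under half a megabyte per second, across the whole cluster.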
We have already tried:
- Disabling refresh and replicas for the initial load
- Raising the ulimits
- Tuning the scaling configuration of the thread pools
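For the first item, a minimal sketch of the settings change is shown below. It only builds the requests against the index settings endpoint and does not send them. The host, port, and index name are assumptions, and the values restored after the load (`1s` and 1 replica) are the Elasticsearch defaults, which may differ from the real target values.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of the client node
INDEX = "my_index"                # hypothetical index name


def settings_request(index, settings):
    """Build (but do not send) a PUT /<index>/_settings request."""
    body = json.dumps({"index": settings}).encode("utf-8")
    return urllib.request.Request(
        f"{ES_URL}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Before the initial load: no periodic refresh, no replicas.
before_load = settings_request(
    INDEX, {"refresh_interval": "-1", "number_of_replicas": 0}
)

# After the load: restore refresh and replicas (default values assumed).
after_load = settings_request(
    INDEX, {"refresh_interval": "1s", "number_of_replicas": 1}
)

# To apply a request, pass it to urllib.request.urlopen(...).
print(before_load.get_method(), before_load.full_url)
print(before_load.data.decode("utf-8"))
```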
Can anyone suggest how to increase the indexing speed?