Elasticsearch configuration for high performance

Hi,

We have developed a .NET Core Web API application that uses the NEST library to perform CRUD (create, read, update, delete) operations in Elasticsearch.

We have set up Elasticsearch with the Ingest plug-in on a Kubernetes cluster (in the cloud) with a heap size of 2 GB.

Goal: index 100,000 documents (13 MB to 15 MB per document) into Elasticsearch in 10 to 15 minutes.

Could you please suggest an Elasticsearch configuration that would give us high enough performance to meet the above requirement?

Thanks in advance.

Indexing can be CPU intensive and even more so if you are using ingest node. Given the size of your documents and the speed at which you want to index this, the cluster sounds quite small. How many CPU cores do you have? What type of storage?

What level of throughput are you seeing with the current setup? What is limiting performance?

  • Optimize your mappings: fewer analyzers = (way) faster.
  • Use as many shards as CPU cores available (system uses 1 core / shard).
  • Prefer fewer, faster CPU cores (and thus fewer shards, see above) over more, slower cores (more efficient storage and searches).
  • Use local SSD storage.
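
As a rough illustration of the first two points, here is a sketch of a create-index request body with one primary shard per core and as little analysis as possible. The helper name `build_index_body`, the field names (`title`, `body`, `raw_payload`) and the core count passed in are assumptions made up for the example, and the `_doc` mapping type assumes a 6.x cluster.

```python
import json


def build_index_body(total_cores):
    """Build a create-index request body (hypothetical helper, not from this thread)."""
    return {
        "settings": {
            # One primary shard per available CPU core, as suggested above.
            "number_of_shards": total_cores,
        },
        "mappings": {
            "_doc": {
                # Reject unmapped fields rather than letting dynamic mapping
                # add analyzed text fields nobody searches on.
                "dynamic": "strict",
                "properties": {
                    # Exact-match field: stored as a single term, no analyzer runs.
                    "title": {"type": "keyword"},
                    # The only analyzed field in this sketch.
                    "body": {"type": "text"},
                    # Kept in _source only: not indexed, no doc values.
                    "raw_payload": {
                        "type": "keyword",
                        "index": False,
                        "doc_values": False,
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    # 8 is only a placeholder core count for the example.
    print(json.dumps(build_index_body(8), indent=2))
```

The resulting JSON would be sent as the body of the index creation request; which fields really need to be searchable depends on your queries.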

Thanks for your reply,

How many CPU cores do you have?
We deployed on an IBM Kubernetes cluster with 3 worker nodes; each node has 8 cores and 32 GB RAM.

What type of storage?
We use IBM storage: volume.beta.kubernetes.io/storage-class: ibmc-block-silver

What level of throughput are you seeing with the current setup?
We are currently indexing about 1,000 documents (13 MB to 15 MB each) per hour.

What is limiting performance?
Adding each document takes a long time; that is what we see when observing the running process.

Please suggest a high-performance Elasticsearch configuration for Kubernetes.

Thanks for your reply.

My configuration is as below.

Please let us know how to configure Elasticsearch on Kubernetes for high performance.

I would recommend looking at the following resources:

https://www.elastic.co/guide/en/elasticsearch/reference/6.4/tune-for-indexing-speed.html#tune-for-indexing-speed

https://www.elastic.co/guide/en/elasticsearch/reference/6.4/tune-for-disk-usage.html

Then run tests and try to identify which system resource is limiting performance, e.g. CPU and/or disk I/O. I generally index much smaller documents, so I am not sure how best to tune for your particular use case.
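
One of the suggestions on the first page is to disable refreshes and replicas during a large initial load and restore them afterwards. A minimal sketch of the two settings bodies is below; the helper names and the restore values (`1s`, one replica) are assumptions to adapt to your setup, and each body would be sent to the index's `_settings` endpoint.

```python
import json


def bulk_load_settings():
    """Settings to apply before the initial load (hypothetical helper)."""
    return {
        "index": {
            # "-1" disables periodic refreshes while loading.
            "refresh_interval": "-1",
            # No replicas during the load, so each document is indexed once.
            "number_of_replicas": 0,
        }
    }


def restore_settings(refresh_interval="1s", replicas=1):
    """Settings to apply once the load has finished (values are assumptions)."""
    return {
        "index": {
            "refresh_interval": refresh_interval,
            "number_of_replicas": replicas,
        }
    }


if __name__ == "__main__":
    print(json.dumps(bulk_load_settings(), indent=2))
    print(json.dumps(restore_settings(), indent=2))
```

Running without replicas means a node failure during the load can lose data, so this only makes sense when the source documents can be re-indexed.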

If I am calculating correctly, that is about 1.33 TB of raw data. If that is the case you will most likely need a much larger cluster to be able to ingest that in 15 minutes...
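
For reference, the arithmetic behind that estimate, compared with the throughput reported earlier in the thread, can be sketched like this. The 14 MB figure is an assumed midpoint of the 13 MB to 15 MB range, and the variable names are only for illustration.

```python
DOCS = 100_000                 # documents to index (from the stated goal)
DOC_MB = 14                    # assumed midpoint of the 13-15 MB range
WINDOW_MIN = 15                # upper end of the 10-15 minute target
CURRENT_DOCS_PER_HOUR = 1_000  # throughput reported for the current setup

# Total raw volume, in MB and in binary terabytes (about 1.3).
total_mb = DOCS * DOC_MB
total_tib = total_mb / 1024 / 1024

# Sustained rate needed to finish inside the window, versus today's rate.
required_mb_per_s = total_mb / (WINDOW_MIN * 60)
current_mb_per_s = CURRENT_DOCS_PER_HOUR * DOC_MB / 3600
speedup = required_mb_per_s / current_mb_per_s

print(f"total volume:  {total_tib:.2f} TiB")
print(f"required rate: {required_mb_per_s:.0f} MB/s")
print(f"current rate:  {current_mb_per_s:.1f} MB/s")
print(f"speed-up needed: about {speedup:.0f}x")
```

Under these assumptions the target works out to roughly 1.5 GB/s sustained, around 400 times the current rate, which is why tuning alone is unlikely to close the gap.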

Thanks for the reply.
