Shards and scalability

We have a requirement for frequent updates, but the total data size per index will not be more than 5 GB. How many shards should we allocate for fast updates? Updates will come in at around 50k per second, and each document is less than 1 KB in size. Also, how many nodes do we need for good performance? Thanks.

You really need to test this to be sure.
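For what it's worth, a 5 GB index usually fits comfortably in a single primary shard, but only a load test will tell you what your update workload can sustain. A minimal sketch of creating a test index with an explicit shard count using the official Python client (the endpoint, index name, and settings below are placeholders, not a recommendation):

```python
from elasticsearch import Elasticsearch

# Hypothetical endpoint and index name, for illustration only.
es = Elasticsearch(["http://localhost:9200"])

es.indices.create(
    index="updates-demo",
    body={
        "settings": {
            "number_of_shards": 1,       # ~5 GB of data fits in one primary shard
            "number_of_replicas": 1,     # one replica for redundancy
            # A longer refresh interval is often used for heavy indexing/update
            # workloads; whether it helps here is something to test, not assume.
            "refresh_interval": "30s",
        }
    },
)
```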

True, but right now we have around 24 cores across 3 nodes, yet we are only able to achieve 3k updates per second and the CPU is constantly above 90%. What am I missing here?

What version are you on?
What JVM, OS?
What hardware?
What is your heap size?
What is your document mapping?
What are you actually updating?
Are you monitoring everything? What does that show?
How are you measuring this 3K number?
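Most of those answers can be read straight off the cluster APIs. A rough sketch with the Python client (the endpoint is a placeholder):

```python
from elasticsearch import Elasticsearch

# Placeholder endpoint; point this at one of your nodes.
es = Elasticsearch(["http://localhost:9200"])

info = es.info()                                    # Elasticsearch version
stats = es.nodes.stats(metric="jvm,thread_pool")    # per-node JVM and thread pool stats

print("ES version:", info["version"]["number"])
for node_id, node in stats["nodes"].items():
    heap_pct = node["jvm"]["mem"]["heap_used_percent"]
    bulk_rejected = node["thread_pool"]["bulk"]["rejected"]   # "bulk" pool name in 5.x
    print(node["name"], "heap used %:", heap_pct, "bulk rejections:", bulk_rejected)
```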

Version: 5.5
AWS Elasticsearch
Hardware: m4.2xlarge
What is your heap size? 18 GB
What is your document mapping? We are using a custom mapping.
What are you actually updating? The entire document is updated; some of the fields keep the same value.
Are you monitoring everything? What does that show? Yes, the CPU shows more than 90% and JVM pressure is about 60%.
How are you measuring this 3K number? The 3k documents are indexed in a bulk operation; if we increase that number, the queue fills up and latency shoots up.

Thanks.
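Not the poster's actual code, but for anyone following along, the kind of bulk update loop being described looks roughly like this with the 5.x Python client (endpoint, index, document shape, and chunk size are all assumptions). Capping the per-request chunk size and raising it gradually while watching bulk queue depth and rejections is usually safer than pushing ever-larger batches:

```python
from elasticsearch import Elasticsearch, helpers

# Hypothetical endpoint, index, and documents; for illustration only.
es = Elasticsearch(["http://localhost:9200"])

def actions(docs):
    for doc_id, doc in docs:
        yield {
            "_op_type": "index",      # full-document overwrite
            "_index": "updates-demo",
            "_type": "doc",           # 5.x still requires a mapping type
            "_id": doc_id,
            "_source": doc,
        }

docs = ((str(i), {"field": "value", "counter": i}) for i in range(10000))

# chunk_size controls how many docs go into each bulk request.
ok, errors = helpers.bulk(es, actions(docs), chunk_size=1000, raise_on_error=False)
print("indexed:", ok, "errors:", len(errors))
```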

You may need better hardware, then.

Unfortunately, without Monitoring (from X-Pack) you are likely to find it hard to effectively troubleshoot what your nodes and cluster are doing.

How frequently, on average, are you updating each document? What bulk request size are you using? How are you performing the updates, e.g. overwriting the full document or using scripted updates? What type of storage do you have? What do disk I/O and iowait look like?
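To illustrate the update question: a full overwrite, a partial update, and a scripted update look roughly like this with the 5.x Python client (the index, type, id, and field names are made up for illustration):

```python
from elasticsearch import Elasticsearch

# Placeholder endpoint and document; for illustration only.
es = Elasticsearch(["http://localhost:9200"])

# Style 1: overwrite the whole document (what the original poster describes).
es.index(index="updates-demo", doc_type="doc", id="42",
         body={"field_a": "new value", "field_b": "unchanged value"})

# Style 2: partial update -- only the changed fields are sent; Elasticsearch
# merges them into the stored document on the server side.
es.update(index="updates-demo", doc_type="doc", id="42",
          body={"doc": {"field_a": "new value"}})

# Style 3: scripted update, e.g. incrementing a counter in place.
es.update(index="updates-demo", doc_type="doc", id="42",
          body={"script": {"inline": "ctx._source.counter += 1", "lang": "painless"}})
```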

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.