How can I calculate the cluster node sizes (RAM, CPU, and disk)?

Hi,
I am trying to figure out how I should calculate my node sizes: RAM, CPU, and disk. If I have around 10GB of data, on what basis should I calculate the node sizes?

Can anyone guide me, please?

It depends a lot on the use case and expected load. It would help if you described your use case and the load the cluster is expected to be under, e.g. expected insertions, updates and queries per second.

Hi, below is my use case.

  1. Use Case - Search

  2. Data per Day to be ingested (in GB) - 1GB

  3. Retention: Permanent

  4. Replication Count: 2

  5. Average Event Size: 1MB

  6. Average Events per second: 100 events per second

  7. Queries Per second: 100 Queries per second

  8. Total Data to be retained: ALL

How can I calculate node sizes on the basis of the above?

Those numbers do not add up. An average of 100 documents/second results in 8,640,000 documents per day. With an average size of 1MB that is a total ingested data volume of 8.2TB, which is considerably larger than the 1GB per day you specified.
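As a quick sanity check, here is that arithmetic spelled out (the event rate and size come from the use case above; the MB-to-TB conversion assumes binary units):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

events_per_second = 100  # from the use case
avg_event_size_mb = 1    # from the use case

docs_per_day = events_per_second * SECONDS_PER_DAY
ingest_mb_per_day = docs_per_day * avg_event_size_mb
ingest_tb_per_day = ingest_mb_per_day / 1024 / 1024  # MB -> GB -> TB

print(f"{docs_per_day:,} documents/day")           # 8,640,000 documents/day
print(f"{ingest_tb_per_day:.1f} TB/day ingested")  # 8.2 TB/day ingested
```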

Thank you for the update. Here is the updated use case

  1. Use Case - Search
  2. Data per Day to be ingested (in GB) - 2.5GB
  3. Retention: Permanent
  4. Replication Count: 2
  5. Average Event Size: 1MB
  6. Average Events per second: 100 events per hour
  7. Queries Per second: 100 Queries per second
  8. Total Data to be retained: ALL

Now 100 documents/hour would result in 2,400 per day, and with an average size of 1MB the total ingested data volume would be 2.4GB (a sketch of this arithmetic follows below). I hope this is correct.
How can I calculate node sizes with this?
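A minimal sketch of that calculation, extended with the replication count and retention from the use case. One assumption to flag: "Replication Count: 2" is read here as two replica copies per primary (three copies of the data in total); if it means two copies in total, change `replicas` to 1. "Permanent" retention is sized per year of growth.

```python
HOURS_PER_DAY = 24

events_per_hour = 100   # from the use case
avg_event_size_mb = 1   # from the use case
replicas = 2            # assumption: 2 replicas -> 3 total copies
retention_days = 365    # assumption: size "permanent" retention per year

docs_per_day = events_per_hour * HOURS_PER_DAY                 # 2,400
primary_gb_per_day = docs_per_day * avg_event_size_mb / 1024   # ~2.34 GB
                                                               # (~2.4 GB in decimal units)
total_copies = 1 + replicas
stored_gb_per_day = primary_gb_per_day * total_copies
stored_gb_per_year = stored_gb_per_day * retention_days

print(f"{primary_gb_per_day:.2f} GB/day primary data")      # 2.34 GB/day
print(f"{stored_gb_per_day:.2f} GB/day incl. replicas")     # 7.03 GB/day
print(f"{stored_gb_per_year:,.0f} GB/year incl. replicas")  # ~2,566 GB/year
```

Note this only sizes the raw document volume; the on-disk index size will differ depending on mappings, analyzers, and compression, which is part of why the replies below point toward benchmarking.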

Some useful reading here that may help. Good luck!
But that only covers a logging use case; yours is more of a search use case, and sizing will depend on your search strategy, which influences how the data is indexed (analyzers, mappings, ...).

Hi,
that blog describes how to calculate sizing on the basis of total data/total storage for Elastic Cloud. However, I am looking for how to calculate node RAM and CPU cores for my on-premises cluster.

It seems like you have very large documents in your use case? Why are they so large?

The blog post that was linked to describes sizing for use cases with lots of indexing, limited queries and quite small documents. This is a very common workload and therefore reasonably well known, which is why there are lots of common guidelines around sizing available for it.

Yours is however a query-heavy use case with very large documents, where the type of queries and how much data you return will have a significant impact. Unfortunately I do not think there is any way to determine the system requirements analytically, so I would recommend you benchmark in order to find a suitable cluster size and configuration.
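For illustration, a minimal sketch of what such a benchmark probe might look like with the official elasticsearch Python client. Everything here is a placeholder to adapt: the index name, document shape, document count, and query are assumptions, and a real sizing exercise would use production-like documents and queries (a dedicated tool such as Rally is the usual choice for serious benchmarking).

```python
# Rough indexing/query throughput probe (pip install elasticsearch).
# Uses the 7.x-style body= argument; adjust for your client version.
import time

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")  # assumption: local test node
INDEX = "sizing-test"                        # hypothetical index name

# Index a batch of large documents, roughly matching the ~1MB event size.
docs = ({"_index": INDEX, "_source": {"body": "x" * 1_000_000}} for _ in range(500))
start = time.perf_counter()
helpers.bulk(es, docs)
es.indices.refresh(index=INDEX)
index_secs = time.perf_counter() - start
print(f"indexing: {500 / index_secs:.1f} docs/s")

# Fire a burst of queries and measure sustained queries per second.
n_queries = 200
start = time.perf_counter()
for _ in range(n_queries):
    es.search(index=INDEX, body={"query": {"match_all": {}}, "size": 1})
query_secs = time.perf_counter() - start
print(f"queries: {n_queries / query_secs:.1f} qps")
```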

Actually, I currently don't have data to ingest. I just assumed the documents, to estimate what the cluster size would need to be if I had this total number of documents.

OK, let me try it with a benchmark.

I have another question, for which I have already created another topic. However, I haven't got much response yet, so I am asking here in case you can help:
https://discuss.elastic.co/t/elasticsearch-feature-for-rating/284410

The question is: suppose I have a restaurant application where users give ratings to restaurants. Let's suppose there is a restaurant abc whose rating changes based on user ratings.

How can Elasticsearch calculate the average rating and update the restaurant's document automatically? Is there any feature or property in Elasticsearch for this?
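For context, one common approach is to compute the average at query time with an avg aggregation rather than updating a stored rating field on every vote. A minimal sketch, assuming a ratings index with a numeric rating field and a keyword restaurant_id field (all hypothetical names):

```python
# Compute a restaurant's average rating at query time with an avg aggregation.
# Index/field names ("ratings", "restaurant_id", "rating") are hypothetical.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="ratings",
    body={
        "size": 0,  # only the aggregation is needed, not the rating docs
        "query": {"term": {"restaurant_id": "abc"}},
        "aggs": {"avg_rating": {"avg": {"field": "rating"}}},
    },
)
print(resp["aggregations"]["avg_rating"]["value"])
```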
