System Requirements for Elasticsearch stack


(Gopinath Nallappan) #1

Hello all,

I'm new to the ELK stack. I will be logging Windows events and syslogs from
firewalls, routers, etc. into Elasticsearch.

I expect around 2 GB of data per day to be logged to my Elasticsearch
server. I will be creating indices on a daily or weekly basis.

The logs will be kept online for at least a year and moved offline after
that.

I have been looking around and also searched this forum, but I was not able
to find a definitive guide that explains how to design the architecture:
RAM, # of CPU cores, # of Elasticsearch nodes, and shards per node.

The system will be mainly used for logging purposes only. So there won't be
that many concurrent users.

Appreciate any pointers on best practices in setting up the Elasticsearch
deployment.

Thanks,
Gopinath

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/23818203-6fe3-49ae-996d-443c2250ea34%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Jörg Prante) #2

There is no "one size fits all", no strict measure for RAM, CPU cores,
shard/node. This all depends on your testing results and your requirements.
Do not trust other test results more than your own.

You can index 2 GB with Elasticsearch in a few minutes on commodity
hardware. Do not expect problems here.
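
To put the 2 GB/day figure in perspective, here is a rough sketch of the average ingest rate it implies. It assumes decimal gigabytes and a load spread evenly over the day, which real log traffic rarely is, so treat the result as a lower bound to test against rather than a sizing rule.

```python
# Assumption: 2 GB/day is the raw log volume from the original question,
# counted in decimal gigabytes and spread evenly over 24 hours.
DAILY_BYTES = 2 * 10**9
SECONDS_PER_DAY = 24 * 60 * 60


def average_ingest_rate(daily_bytes: int) -> float:
    """Return the average ingest rate in bytes per second."""
    return daily_bytes / SECONDS_PER_DAY


rate = average_ingest_rate(DAILY_BYTES)
print(f"{rate / 1000:.1f} kB/s on average")
```

Bursts (for example a firewall under attack) can be far above this average, so peak rate is the number worth measuring in your own tests.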

For sizing, you should also take into consideration the total volume you
have to keep (for disk space setup) and how much query workload you need to
serve (for replicas). The number of users is a hint, but it depends on the
type of query too (filters, aggregations, etc.).
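
The total-volume estimate can be sketched as simple arithmetic. This is only a back-of-the-envelope calculation using the figures from the question (2 GB/day, one year online, daily or weekly indices); the function names are made up for illustration, and it ignores index overhead and compression, so the real on-disk size has to be measured on your own data.

```python
def retained_volume_gb(daily_gb: float, retention_days: int, replicas: int) -> float:
    """Raw data volume kept online, counting each replica as one extra full copy."""
    return daily_gb * retention_days * (1 + replicas)


def index_count(retention_days: int, days_per_index: int) -> int:
    """Number of time-based indices online, rounding a partial period up."""
    return -(-retention_days // days_per_index)


# Figures from the question: 2 GB/day, kept online for 365 days.
for replicas in (0, 1):
    total = retained_volume_gb(2, 365, replicas)
    print(f"replicas={replicas}: about {total:.0f} GB of raw data online")

print("daily indices:", index_count(365, 1))
print("weekly indices:", index_count(365, 7))
```

Each replica adds query capacity and redundancy at the cost of one more full copy on disk, which is why the query workload feeds into the disk budget as well.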

For fault tolerance, you should take into consideration the availability
the system needs. If availability is not a concern, one node might be
sufficient, but a production system should use at least three nodes for
better fault tolerance.
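
One way to see why three is a common minimum (this is an interpretation, not something spelled out in this thread) is majority voting among master-eligible nodes, which Elasticsearch releases of this era configure through the `discovery.zen.minimum_master_nodes` setting. The sketch below assumes every node is master-eligible and only does the quorum arithmetic.

```python
def majority(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return nodes - majority(nodes)


for n in (1, 2, 3):
    print(f"{n} node(s): majority={majority(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

With one or two nodes, losing a single node leaves no majority; three is the smallest cluster that survives one node failure under this rule.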

Jörg

On Wed, Aug 6, 2014 at 6:02 AM, Gopinath Nallappan <
gopinathnallappan@gmail.com> wrote:



(Jaguar) #3

I have found quite a few similar emails about capacity planning. Although
it makes sense that there are a lot of variables/factors, it would be great
for new users to have some sort of baseline. It could be simple: just a
single type of index and a load that is not too heavy. Maybe there are
already blogs/articles covering this topic, but it would be worth a pointer
in the official documentation.

My 2c



(vjbangis) #4

Thanks Gopinath!

Hi Jörg,

*For fault tolerance, you should take into consideration the availability
of the system. If you don't care, one node might be sufficient, but
production should at least use three nodes, for better fault tolerance.*

Should each of the three nodes be something like a c3.large, an m3.large,
or an m3.xlarge?

TIA!

On Wednesday, August 6, 2014 5:19:42 PM UTC+8, Jörg Prante wrote:



(Jörg Prante) #5

You can use any machine you want: bare metal, VM, whatever. ES is not tied
to Amazon EC2 instance types at all. The right size will depend on your data.

Jörg

On Fri, Aug 8, 2014 at 3:48 AM, vjbangis jessviray0708@gmail.com wrote:


