ELK cluster architecture

Does this seem like an OK architecture?

Each system has 96 GB RAM and 24 CPU cores.

3 master nodes + Kibana (Kibana is there just in case I need it)
25 GB for Elasticsearch and 8 GB for Kibana

2 coordinating nodes + Kibana
25 GB for Elasticsearch and 12 GB for Kibana

2 masters + Logstash (I assigned these just in case two of the masters have problems at the same time)
20 GB for Elasticsearch, 27 GB for Logstash (each server running 10 pipelines)

4 data nodes
25 GB for Elasticsearch and lots of single SSD disks

Is there anything I need to think about or redesign?
I can't change the data nodes, as they have all the disks.

> Each system has 96 GB RAM and 24 CPU cores.

That is a lot of RAM, which is good, and it means you'll generally want to max out the Elasticsearch heap at about 30 GB, because you can (staying below the ~32 GB compressed-oops threshold).

I also don't see a lot of systems with this much RAM and this many cores, or with more master nodes than data nodes. The glory of excess physical hardware.

> 3 master nodes + Kibana (Kibana is there just in case I need it)
> 25 GB for Elasticsearch and 8 GB for Kibana

You have 96 GB of RAM, so assuming you run nothing else, just give 30 GB to ES, the same on every node for simplicity. Basic Kibana can run in much less than 8 GB, but you have the memory, so it should be fine.
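If it helps, a minimal sketch of what that heap setting looks like, assuming a package install where overrides live under jvm.options.d (the exact path depends on how you installed Elasticsearch):

```
# /etc/elasticsearch/jvm.options.d/heap.options  (path assumed for a package install)
# Set min and max heap to the same value, kept below the ~32 GB compressed-oops threshold
-Xms30g
-Xmx30g
```

Keeping -Xms and -Xmx identical avoids heap resizing, and everything above the heap is left to the OS for the filesystem cache.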

> 2 coordinating nodes + Kibana
> 25 GB for Elasticsearch and 12 GB for Kibana

Are these your main Kibana nodes? Just watch their CPU if you do a lot of heavy Kibana work and heavy ingest/query work at the same time, but with only 4 data nodes and lots of heap, it's unlikely you can overload these.
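For reference, a coordinating-only node is simply one with every role stripped away; a rough elasticsearch.yml sketch, assuming ES 7.9+ where node.roles is available (the node and cluster names here are made up):

```
# elasticsearch.yml on a coordinating-only node (sketch)
cluster.name: my-elk-cluster   # assumed cluster name
node.name: coord-1             # assumed node name
node.roles: []                 # empty list = coordinating-only, no master/data/ingest work
```

Then in kibana.yml on the same host you would typically point elasticsearch.hosts at http://localhost:9200 so Kibana always talks to its local coordinating node.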

> 2 masters + Logstash (I assigned these just in case two of the masters have problems at the same time)
> 20 GB for Elasticsearch, 27 GB for Logstash (each server running 10 pipelines)

Why have masters on the Logstash machines? This gives you 5 master-eligible nodes, which is not bad, but if one of these actually becomes master and Logstash is very busy, you can get CPU contention. That's unlikely with 24 cores, though. But why run ES here at all?

For simplicity, I'd only run Logstash here. That makes it easier to patch, update, reboot, manage performance, etc.
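If you do go Logstash-only on those boxes, the 10 pipelines per server are just entries in pipelines.yml; a sketch with made-up pipeline IDs and config paths:

```
# /etc/logstash/pipelines.yml  (path assumed for a package install)
- pipeline.id: firewall-logs                        # hypothetical pipeline
  path.config: "/etc/logstash/conf.d/firewall.conf"
  pipeline.workers: 2                               # tune against your 24 cores
- pipeline.id: app-logs                             # hypothetical pipeline
  path.config: "/etc/logstash/conf.d/app.conf"
  pipeline.workers: 2
# ...the remaining pipelines follow the same pattern
```

The 27 GB Logstash heap would then go in Logstash's own jvm.options rather than sharing the box with an ES heap.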

> 4 data nodes
> 25 GB for Elasticsearch and lots of single SSD disks

Certainly bump this up to 30 GB for ES and leave the rest for the filesystem cache.
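As a sketch of how a data node with several single SSDs is often wired up (the mount points are made up, and note that multiple path.data entries are deprecated in recent versions, so newer setups tend to present one volume via RAID 0 or LVM instead):

```
# elasticsearch.yml on a data node (sketch)
node.name: data-1          # assumed node name
node.roles: [ data ]       # dedicated data node
path.data:                 # one entry per SSD mount point (deprecated in newer releases)
  - /mnt/ssd1/elasticsearch
  - /mnt/ssd2/elasticsearch
  - /mnt/ssd3/elasticsearch
```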

Thank you Steve for clearing up my doubts about it.
The only problem I have is with Kibana. One or two users complain that it is slow.

The way I have defined this:
elkkib is a CNAME to the two addresses elkkib1 and elkkib2

i.e. when a user types
http://elkkib:5601 (they will go to one of these servers)

I think something is wrong with that design.

For example:
elkkib1 -> 10.10.1.1 (has forward/reverse DNS registration)
elkkib2 -> 10.10.1.2

elkkib -> 10.10.1.1 and 10.10.1.2 (forward lookup only)

> The only problem I have is with Kibana. One or two users complain that it is slow.

Okay, but is the slowness in Kibana itself or in the queries themselves? That may be hard to know; I'm not a big Kibana person when it comes to profiling, but I think there are built-in tools to check this.
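One way to separate the two, assuming the slowness is on the query side: grab the query Kibana sends (its Inspect panel shows the request) and rerun it from Dev Tools with profiling turned on. The index name below is made up:

```
GET logs-app-2024/_search
{
  "profile": true,
  "query": {
    "match": { "message": "error" }
  }
}
```

If the profiled query comes back fast, the problem is more likely on the Kibana/browser side than in Elasticsearch.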

> The way I have defined this:
> elkkib is a CNAME to the two addresses elkkib1 and elkkib2
> i.e. when a user types
> http://elkkib:5601 (they will go to one of these servers)

You can do that, though some type of load balancer is probably better. Once the DNS cache expires, a user may land on the other Kibana and lose their session or login; I have no idea how Kibana manages that, but I'd think a sticky load balancer would help.
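A minimal sketch of what a sticky setup could look like with nginx in front of the two Kibanas, reusing the hostnames from your example; ip_hash pins each client IP to one backend, which is a crude but simple form of stickiness:

```
# nginx reverse proxy in front of Kibana (sketch only)
upstream kibana {
    ip_hash;                      # keep each client IP on the same backend
    server elkkib1:5601;
    server elkkib2:5601;
}

server {
    listen 5601;
    server_name elkkib;
    location / {
        proxy_pass http://kibana;
        proxy_set_header Host $host;
    }
}
```

Users keep typing http://elkkib:5601, but the proxy decides which Kibana they land on and keeps them there.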

> I think something is wrong with that design.

Why do you think it's wrong? I guess I'm confused about what you are asking. You have so much hardware and power here that it's hard to see how it's slow, unless the data is huge, the queries are complex, etc.

Steve,
now this fixed the problem. All good now. Thank you.
