Best configuration for analysing multiple servers

Daz · June 21, 2015, 3:10pm

I'm hoping that someone can provide some advice or experience. So far I have a proof of concept working, analysing a single web server.
In production we have multiple servers (web, application, DBs) as well as multiple OSes (Linux, CentOS, Windows) that I want to analyse. At the moment only my team looks at the data, as a debug tool, but at a later stage I might also want to grant or restrict access to certain Kibana dashboards for other departments. What would be the best way of setting up an ELK stack for this? So far my ideas are:

One index for the lot, and try to get Logstash to map to the same terms where possible.
An index for each server (not quite sure how to do this).
Some other method I've not thought of.

Occasionally, I'd like to compare certain stats between servers (to trace the actions of a single IP through the system, or to compare loads between web frontends, app servers and DBs), but I'm not sure whether this is possible if I use separate indexes.

I'm after the pros/cons of each, and thoughts on whether I'm on the right track or completely missing the point of how to set this up at scale.

Thanks for your input.

eperry · June 21, 2015, 4:35pm

You have quite a few objectives, so to keep things simple I'll start by explaining the architecture I have, which handles 90+ servers and 300GB of logs a day. Of course there are many possible designs, and this one was my first attempt, so there may be good and bad ideas in here.

Sorting out and making indexes

First thing to think about is your indexes. The bigger they are the more brute force you need when searching.

So I tend to use the "type" field to sort out my data but feel free to get creative for your needs.

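input {
  file {
    # Elasticsearch index names must be lowercase, so keep the type lowercase
    type => "apache-access-log"
    path => ["/var/log/http/access_log"]
  }
}
output {
  elasticsearch {
    host => "localhost"
    # One index per type per day, e.g. apache-access-log-2015.06.21
    index => "%{type}-%{+YYYY.MM.dd}"
  }
  # stdout { codec => "json" }
}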

Architecture

Some ideas on design

Logstash Forwarder -> Queue -> parsing indexer -> Elasticsearch
Logstash Parser and Indexer -> Elasticsearch

Or what I have

Logstash and Parsing -> queue -> (Sub-parser ) & Indexer -> Elasticsearch

Explanation of my architecture

I do the bulk of the processing on every server: merging multi-line entries, parsing the data, and tagging the lines with important information such as "web_acces_logs", "Prod", etc. (90 servers parsing is better than a couple, IMHO.)

The queue I run is Redis, though I think I will be moving to Apache Kafka. Pick whichever you like most.
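As a rough sketch, a per-server shipper along these lines might look like the following. It assumes the same Logstash-era syntax as the config earlier in the thread. The second log file, the "app-log" type, the Redis host and the "logstash" key are made-up placeholders, not details from the setup described here.

```conf
# Shipper sketch: runs on each server, parses locally, then pushes to Redis.
input {
  file {
    type => "apache-access-log"
    path => ["/var/log/http/access_log"]
  }
  file {
    type => "app-log"                   # hypothetical type
    path => ["/var/log/app/app.log"]    # hypothetical path
    codec => multiline {
      # A line that does not start with a timestamp is merged
      # into the previous event (e.g. stack trace lines).
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}
filter {
  if [type] == "apache-access-log" {
    # Parse the access log line into fields on the shipper itself
    grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  }
  # Tag every event with the environment it came from
  mutate { add_tag => ["Prod"] }
}
output {
  redis {
    host => "redis.example.local"       # hypothetical queue host
    data_type => "list"
    key => "logstash"                   # hypothetical list key
  }
}
```

Doing the grok and multiline work here spreads the parsing cost across all the servers, so the machines reading from the queue mostly only have to route events.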

The indexer is where I sort out the data: determine which index to place it in, throw away data, send information to Nagios, and do any generic parsing I might want.
Note: This indexer has to be able to process every log. You may need several of them, which is why there is a queue.
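A minimal indexer sketch under the same assumptions could look like this. The Redis host and key, the "debug" tag rule and the "unparsed" index name are hypothetical, and the Nagios output is left out for brevity. Note that newer Logstash releases rename the elasticsearch output's `host` option to `hosts`.

```conf
# Indexer sketch: pulls events off the queue and decides where they go.
input {
  redis {
    host => "redis.example.local"       # hypothetical queue host
    data_type => "list"
    key => "logstash"                   # must match the shipper's key
  }
}
filter {
  # Throw away events that are not worth storing (hypothetical rule)
  if "debug" in [tags] {
    drop { }
  }
}
output {
  if "_grokparsefailure" in [tags] {
    # Keep lines that failed parsing apart from the clean data
    elasticsearch {
      host => "localhost"
      index => "unparsed-%{+YYYY.MM.dd}"
    }
  } else {
    # Everything else lands in a daily index named after its type
    elasticsearch {
      host => "localhost"
      index => "%{type}-%{+YYYY.MM.dd}"
    }
  }
}
```

Because several indexers can read from the same Redis list, adding capacity is a matter of starting another instance with the same config.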

Elasticsearch:

Here I do a couple of things:

Indexes are daily.
I create mappings with aliases. An alias of "loadavg" is easier than "loadavg-2015.06.21,loadavg-2015.06.22" or "loadavg*", which could bring in other indexes I might not want.
I run this from my crontab to clean up older data; after all, we only have so much disk space. You might find other ways, like snapshots or exporting the data:
https://github.com/elastic/curator

Daz · June 21, 2015, 10:11pm

What do you use as the indexer?

So you have multiple indexes in ES? Do you ever need to do queries across multiple indexes in Kibana?

eperry · June 21, 2015, 11:09pm

Logstash is what I use as the indexer. I have 2 to 4 running for redundancy. I also tend to give them extra heap space and extra worker threads (the "-w" CLI option) to process the data, though that might have been overkill.

I don't normally query across indexes, but that is what aliases would help you with. Or you can always query several indexes directly, like Host:9200/index1,index2/_search

Daz · June 22, 2015, 1:08pm

Thanks for your ideas. You've given me a path to explore, so I don't feel totally lost.

When you say aliases, do you mean ES aliases: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html

eperry · June 22, 2015, 1:12pm

Glad to give you a few breadcrumbs. Of course, the above is my solution based on discovery; your mileage may vary.

Yes, ES aliases are what I was talking about for cross-index searching, also you can
