Hi all,
I'm new to ELK. I have a system that collects syslog from approximately 150 servers, which all send to one machine: it uses 16 cores and 32 GB of memory, with networking running on a bond with 2 NICs. It's running CentOS 7, fully updated.
The problem is that it is very, very slow to respond, Kibana specifically. It takes a very long time to initialize the Kibana index, and after it does, it takes a very long time to show the data or search in it.
After reading many forums I'm totally clueless. What am I doing wrong? At first it was running OK, but after a week or so it became painfully slow.
I'll post my configuration in the next post because I can't post more than 5k lines.
input {
  udp {
    port => 514
    type => "syslog"
    codec => "json"
  }
  udp {
    port => 3515
    codec => "json"
    type => "WindowsLog"
  }
}
filter {
  if [type] == "syslog" {
    grok {
      # More specific pattern first; grok stops at the first match.
      match => { "message" => [
        "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{COMBINEDAPACHELOG}",
        "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}"
      ] }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
  }
  if [type] == "WindowsLog" {
    json {
      source => "message"
    }
    if [SourceModuleName] == "EventLog" {
      mutate {
        replace => [ "message", "%{Message}" ]
      }
    }
    mutate {
      remove_field => [ "Message" ]
    }
  }
  date {
    match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ]
  }
  mutate {
    remove_field => [ "[geoip][ip]", "[geoip][latitude]" ]
  }
}
output {
  elasticsearch {
    hosts => ["x.x.x.x"]
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
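While debugging a pipeline like this, it can help to temporarily add a stdout output alongside the elasticsearch one, so you can see how each event comes out of the filters before it is indexed. A minimal sketch using standard Logstash plugins (nothing specific to this setup is assumed):

output {
  # Print every event to the console in a readable form while testing.
  stdout { codec => rubydebug }
}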
This is my elasticsearch.yml:
# ======================== Elasticsearch Configuration =========================
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
# Please see the documentation for further information on configuration options:
# http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration.html
# ---------------------------------- Cluster -----------------------------------
# Use a descriptive name for your cluster:
# cluster.name: my-application
# ------------------------------------ Node ------------------------------------
# Use a descriptive name for the node:
# node.name: node-1
# Add custom attributes to the node:
# node.rack: r1
# ----------------------------------- Paths ------------------------------------
# Path to directory where to store the data (separate multiple locations by comma):
# path.data: /path/to/data
# Path to log files:
# path.logs: /path/to/logs
# ----------------------------------- Memory -----------------------------------
# Lock the memory on startup:
bootstrap.mlockall: true
# Make sure that the ES_HEAP_SIZE environment variable is set to about half the memory
# available on the system and that the owner of the process is allowed to use this limit.
# Elasticsearch performs poorly when the system is swapping the memory.
# ---------------------------------- Network -----------------------------------
# Set the bind address to a specific IP (IPv4 or IPv6):
network.host: x.x.x.x
# Set a custom port for HTTP:
# http.port: 9200
# For more information, see the documentation at:
# http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html
# --------------------------------- Discovery ----------------------------------
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# discovery.zen.ping.unicast.hosts: ["host1", "host2"]
# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):
# discovery.zen.minimum_master_nodes: 3
# For more information, see the documentation at:
# http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html
# ---------------------------------- Gateway -----------------------------------
# Block initial recovery after a full cluster restart until N nodes are started:
# gateway.recover_after_nodes: 3
# For more information, see the documentation at:
# http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-gateway.html
# ---------------------------------- Various -----------------------------------
# Disable starting multiple nodes on a single system:
# node.max_local_storage_nodes: 1
# Require explicit names when deleting indices:
# action.destructive_requires_name: true
index.number_of_shards: 20
index.number_of_replicas: 0
bootstrap.mlockall: true
Also, note that top shows the server is not under heavy load. I've disabled swapping, as mentioned in lots of forums.
I don't know what to do anymore. Please assist; if any more information is needed, please tell me and I'll provide it.
P.S. I've replaced the addresses with x.x.x.x for obvious reasons.
Many, many thanks in advance!
How much data are you indexing per day? How long do you intend to keep data in the cluster? Which version of Elasticsearch and Logstash are you using? What type of storage does the node have?
I want to get to 10 GB a day; right now it's writing about 3 GB a day. I'm using centralized storage (Dell EqualLogic), I'm on the latest version of everything (ES 2.3.3, LS 2.3.2, Kibana 4.5.4), and I want to hold data for 90 days (right now I already have 20 days). Also, I'm not on a cluster; I run it on one powerful machine (16 cores, 32 GB memory), and while running "top" I don't see the machine under heavy load or high system utilization. I also don't reach even 10% of the storage's IOPS capability, and there is no CPU wait either. So as you can see, I am kind of clueless...
If those are the volumes, why have you set the default shard count to 20? Having lots of small shards carries overhead and can be very inefficient, for indexing as well as querying. Given the volumes you have mentioned, I would recommend setting this to 1 in order to achieve a shard size between a few GB and a few tens of GB.
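One way to apply that is with an index template, so that every new daily index picks up the lower shard count. A sketch, assuming the node is reachable on the masked address x.x.x.x:9200 and the indices follow the logstash-* pattern from the config above (the template name logstash_shards is just an example):

curl -XPUT 'http://x.x.x.x:9200/_template/logstash_shards' -d '
{
  "template": "logstash-*",
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}'

Note that a template like this only applies to indices created after it is added.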
I've changed it to 1, and it still doesn't do anything. Moreover, now almost every search runs into the timeout.
This change will only affect future indices, so will not help immediately. How much data do you have in the cluster? How many shards?
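A quick way to check that yourself (a sketch; x.x.x.x is again the masked node address) is the cat API, which lists every index with its document count and size, and every shard with its state:

curl 'http://x.x.x.x:9200/_cat/indices?v'
curl 'http://x.x.x.x:9200/_cat/shards?v'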
Christian, thanks for your help. I had a eureka moment and I have found the problem; here is what I've done (I am working on CentOS 7, so this applies to CentOS 7/RHEL 7; on Debian-based systems it may differ):

- Edit /etc/security/limits.conf and add:
  elasticsearch - nofile 65535
  elasticsearch - memlock unlimited

- Edit /etc/sysconfig/elasticsearch and add:
  MAX_LOCKED_MEMORY=unlimited
  MAX_OPEN_FILES=65536
  ES_HEAP_SIZE=16G

- Edit /etc/elasticsearch/elasticsearch.yml and add:
  index.number_of_shards: 1
  bootstrap.mlockall: true

- Comment out the swap partition in /etc/fstab and run "swapoff -a".
And suddenly Kibana and Elasticsearch started to fly...
Basically, I've let the machine use all of its resources; I have only one node, but a powerful one. That occurred to me since you said that I have lots of small syslog messages, so writing them to memory fast and then queuing them to disk is the solution. Christian, many thanks!
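For anyone wanting to verify that the same settings took effect, a quick check along these lines (a sketch; x.x.x.x is the masked node address) shows whether memory locking is active and what heap the node actually got:

curl 'http://x.x.x.x:9200/_nodes/process?pretty'
curl 'http://x.x.x.x:9200/_nodes/jvm?pretty'

The process section should report "mlockall" : true, and the jvm section's heap_max_in_bytes should match the 16G set in ES_HEAP_SIZE.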