Testing Elastic Stack and winlogbeat / query exceeds 1000 shards

kernelpanic · March 9, 2017, 1:46am

Hello all, I'll begin by listing the components and versions I've installed first; all of the below are installed on one FreeBSD 11 p8 box:

Elasticsearch 5.0.2
Logstash 5.0.2
Kibana 5.0.2

Winlogbeat 5.2.2 is installed on a Windows 7 laptop.

I'm looking at using the Elastic Stack for managing logs at my place of work and I have followed the documentation for sending Windows event logs to Logstash and Elastic search. I have manually loaded the template to ES as per the instructions and have configured the Logstash conf file to accept beats. I then loaded the sample Kibana dashboards but when I entered the winlogbeat-* index pattern, Kibana complained with the following error:

Discover: Trying to query 3570 shards, which is over the limit of 1000. This limit exists because querying many shards at the same time can make the job of the coordinating node very CPU and/or memory intensive.

How did I get so many shards? Did I need to manually design the index/shard settings? Perhaps naively I thought loading the template would take care of all that. My conf file settings are below:

winlogbeat.yml

winlogbeat.event_logs:
  - name: Application
    ignore_older: 72h
  - name: Security
  - name: System

name: LAPTOP01

output.logstash:

  hosts: ["192.168.1.100:5044"]

logstash.conf

input {
    
        beats {
    port => 5044
  }

        file {
		type => "syslog"
		# path => [ "/var/log/*.log", "/var/log/messages", "/var/log/syslog" ]
		path => "/var/log/messages"
		start_position => "beginning"
  }
}

filter {
# An filter may change the regular expression used to match a record or a field,
# alter the value of parsed fields, add or remove fields, etc.
#
	if [type] == "syslog" {
		grok {
			match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} (%{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}|%{GREEDYDATA:syslog_message})" }
			add_field => [ "received_at", "%{@timestamp}" ]
			add_field => [ "received_from", "%{@source_host}" ]
		}
    
		if !("_grokparsefailure" in [tags]) {
			mutate {
				replace => [ "@source_host", "%{syslog_hostname}" ]
				replace => [ "@message", "%{syslog_message}" ]
			}
		}
		mutate {
			remove_field => [ "syslog_hostname", "syslog_message" ]
		}
		date {
			match => [ "syslog_timestamp","MMM  d HH:mm:ss", "MMM dd HH:mm:ss", "ISO8601" ] 
		}
		syslog_pri { }
	}
}

output {
  elasticsearch {
    hosts => "192.168.1.100:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

elasticsearch.yml

cluster.name: esearch-cluster
node.name: node-1
path.data: /zdata/elasticsearch-db
path.logs: /zdata/elasticsearch-log
path.scripts: /usr/local/libexec/elasticsearch
network.host: 192.168.1.100
http.port: 9200

Thanks for any help.

jkuang · March 9, 2017, 3:41am

What's your cat shard output?

GET _cat/shards/winlog*

kernelpanic · March 9, 2017, 9:25am

Hello, thanks for getting back to me; the shard output command shows the following line 7140 times:

winlogbeat-2015.01.31 4 p UNASSIGNED

jkuang · March 9, 2017, 5:31pm

How about when you do ?

GET _cat/indices/winlog*

I want to confirm you are indeed generating only daily indices and not hourly indices. From your config it looks like daily.

kernelpanic · March 9, 2017, 5:48pm

Hello, I'm at work at the moment and my test set up is at home so I'll update this when I'm back home.

Just to clarify, I ran into this problem just a few minutes after winlogbeat started sending data to the cluster.

jkuang · March 9, 2017, 5:51pm

No problem, while you are at it when you get home please run one more command.

GET winlog*/_settings

kernelpanic · March 9, 2017, 5:52pm

Will do, thanks for your efforts.

jkuang · March 9, 2017, 6:22pm

No problem ^^

kernelpanic · March 9, 2017, 7:38pm

Hello again, GET _cat/indices/winlog returns 715 entries similar to the following:

yellow open winlogbeat-2016.10.28 K-jUP-rxRq6CvARnU7hFgw 5 1 180 0 425.6kb 425.6kb

GET winlog/_settings* retuens over 14,000 lines but they're all very similar to the following:

{
  "winlogbeat-2016.10.28": {
    "settings": {
      "index": {
        "mapping": {
          "total_fields": {
            "limit": "10000"
          }
        },
        "refresh_interval": "5s",
        "number_of_shards": "5",
        "provided_name": "winlogbeat-2016.10.28",
        "creation_date": "1489020550246",
        "number_of_replicas": "1",
        "uuid": "K-jUP-rxRq6CvARnU7hFgw",
        "version": {
          "created": "5000299"
        }
      }
    }
  },

Thanks.

jkuang · March 9, 2017, 10:35pm

Are you using an alias when querying?

kernelpanic · March 10, 2017, 12:10am

I'm very much a newbie so I wouldn't know how to do that

As I understand it winlogbeat has created 715 indices with 5 shards each - that would explain the 3575 figure that Kibana is complaining about - was it supposed to create the data that way?

jkuang · March 10, 2017, 5:11am

You are correct. With 715 indices I'm not surprised there are 3575 shards. My thinking pattern is, how are you querying the data that requires it to span over 1000 shards and then we need to work on cutting down the number of shards. Why is ignore not working or is it working but not enough.

Please provide your query.

kernelpanic · March 10, 2017, 5:37am

I didn't write a particular query, I simply followed the instructions here: https://www.elastic.co/guide/en/beats/winlogbeat/current/winlogbeat-sample-dashboards.html

...to see what data winlogbeat was loading into Elasticsearch - I simply added winlogbeat-* in the Discover page in Kibana which then resulted in the error.

jkuang · March 10, 2017, 5:40am

That is problem 1) , winlogbeat-* means all winlogbeat indices. Because you have 715 indices = 3575 shards. Please specify a particular winlogbeat.

For example:

winlogbeat-2016.10.28

kernelpanic · March 10, 2017, 5:55am

Thanks, running GET _cat/indices/winlogbeat-2014.10.26 in the developer console does return a value, but what do I run in the Discover console and why do the instructions tell me to run winlogbeat-* ?

jkuang · March 10, 2017, 5:58am

Running winlogbeat-* in normal circumstances is fine , < 1000 shards.

Run again with winlogbeat-2014* , winlogbeat-2015-*, and so on until you get some value.

kernelpanic · March 10, 2017, 6:11am

I suppose what I don't understand is why do I have so many indices/shards? This is a test setup, I've only loaded a few hundred events from one Windows laptop.

jkuang · March 10, 2017, 6:16am

Yeah, finding out why you have so many shards is the second part of our problem. I have an idea of why. You mention you only loaded a few hundred events, however, I see in your _cat/indices consisting of winlogbet-2014*. From 2016 to 2014 would be 2 years. It looks like it loaded 2 years of events.

kernelpanic · March 10, 2017, 6:23am

Yeah its possible its going back a few years, I think it will ignore events older than 3 days just for the Windows Application log:

winlogbeat.event_logs:
  - name: Application
    ignore_older: 72h
  - name: Security
  - name: System

Still confused as to why its created so many shards from one Windows machine - what would've happened if I'd added hundreds! I wonder if its something to do with winlogbeats being a later version that the Elastic stack components I've installed?

kernelpanic · March 10, 2017, 8:14am

Its possible I've got the output option in winlogbeat.yaml wrong - in my file I have:

output.logstash:
   hosts: ["192.168.1.100:5044"]

But looking at https://www.elastic.co/guide/en/beats/winlogbeat/current/logstash-output.html it has the following example:

output.logstash:
  hosts: ["localhost:5044"]
  index: winlogbeat

Topic		Replies	Views
3/6 shards failing when trying to visualize on Kibana Metrics	15	5418	August 26, 2019
7.0 Upgrade Assistant open shards Elasticsearch	5	1813	April 22, 2019
Limit of total fields ... has been exceeded Beats winlogbeat	3	915	March 16, 2020
New install - shard failure with 1 client Kibana	4	614	August 29, 2017
My logstash output is creating hundreds of shards per index Elasticsearch	9	1775	July 5, 2017

Testing Elastic Stack and winlogbeat / query exceeds 1000 shards

Related topics