Metricbeat Perfmon is creating too many single events

lueneburger · January 10, 2018, 10:10am

Hi Everyone,

I have been using 6.0.0-Alpha2 until now and want to switch to 6.0.1, but after upgrading to 6.0.1 I see that every counter/result now creates its own event, instead of all counters/results being put into one event as before.

So let's say I use 10 counters in Metricbeat:

6.0.0-Alpha2: Metricbeat sends 1 event with all 10 results to ES
6.0.1: Metricbeat sends 10 events to ES

That's quite a big increase in space usage if I upgrade all servers to 6.0.1.
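
As a rough illustration of the difference, here is a minimal Python sketch that counts documents per day for both behaviors, assuming the 10 counters from the example above and the 10s period from the config below. The function name is hypothetical, and the document count is only a proxy for disk usage, since every per-counter document also repeats the beat, metricset and tags fields.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def docs_per_day(period_s: int, counters: int, one_event_per_counter: bool) -> int:
    """Documents one server sends to Elasticsearch per day (hypothetical helper)."""
    fetches = SECONDS_PER_DAY // period_s
    # Either every counter becomes its own document, or one document per fetch.
    return fetches * (counters if one_event_per_counter else 1)


# Behavior described for 6.0.0-Alpha2: one event holding all results.
grouped = docs_per_day(period_s=10, counters=10, one_event_per_counter=False)
# Behavior described for 6.0.1: one event per counter.
split = docs_per_day(period_s=10, counters=10, one_event_per_counter=True)

print(grouped, split)
```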

So... is that the normal behavior, and was something special included in 6.0.0-Alpha2?
Because I really want to send only 1 event with all counters/results in it.

Below are my config and the ES documents.

- module: windows
  metricsets: ["perfmon"]
  tags: "myserver00_service.mydomain.com"
  enabled: true
  period: 10s
  perfmon.counters:
    - instance_label: "service.mydomain.com cache api entries"
      instance_name: "service"
      measurement_label: "cache.api.entries"
      query: '\ASP.NET Apps (service)\Cache API Entries'
    - instance_label: "service.mydomain.com cache api hit ratio"
      instance_name: "service"
      measurement_label: "cache.api.hit.ratio"
      query: '\ASP.NET Apps (service)\Cache API Hit Ratio'
    - instance_label: "service.mydomain.com cache api hits"
      instance_name: "service"
      measurement_label: "cache.api.hits"
      query: '\ASP.NET Apps (service)\Cache API Hits'
    - instance_label: "service.mydomain.com cache api misses"
      instance_name: "service"
      measurement_label: "cache.api.misses"
      query: '\ASP.NET Apps (service)\Cache API Misses'
    - instance_label: "service.mydomain.com cache api trims"
      instance_name: "service"
      measurement_label: "cache.api.trims"
      query: '\ASP.NET Apps (service)\Cache API Trims'
    - instance_label: "service.mydomain.com cache api turnover rate"
      instance_name: "service"
      measurement_label: "cache.api.turnover.rate"
      query: '\ASP.NET Apps (service)\Cache API Turnover Rate'
    - instance_label: "service.mydomain.com cache total entries"
      instance_name: "service"
      measurement_label: "cache.total.entries"
      query: '\ASP.NET Apps (service)\Cache Total Entries'
    - instance_label: "service.mydomain.com cache total hit ratio"
      instance_name: "service"
      measurement_label: "cache.total.hit.ratio"
      query: '\ASP.NET Apps (service)\Cache Total Hit Ratio'
    - instance_label: "service.mydomain.com cache total hits"
      instance_name: "service"
      measurement_label: "cache.total.hits"
      query: '\ASP.NET Apps (service)\Cache Total Hits'
    - instance_label: "service.mydomain.com cache total misses"
      instance_name: "service"
      measurement_label: "cache.total.misses"
      query: '\ASP.NET Apps (service)\Cache Total Misses'
    - instance_label: "service.mydomain.com cache total trims"
      instance_name: "service"
      measurement_label: "cache.total.trims"
      query: '\ASP.NET Apps (service)\Cache Total Trims'
    - instance_label: "service.mydomain.com cache total turnover rate"
      instance_name: "service"
      measurement_label: "cache.total.turnover.rate"
      query: '\ASP.NET Apps (service)\Cache Total Turnover Rate'
    - instance_label: "service.mydomain.com.arrival.rate"
      instance_name: "service.mydomain.com"
      measurement_label: "cache.api.entries"
      query: '\HTTP Service Request Queues(service.mydomain.com)\ArrivalRate'
    - instance_label: "service.mydomain.com.cache.hit.rate"
      instance_name: "service.mydomain.com"
      measurement_label: "cache.hit.rate"
      alias: "cache.hit.rate"
      query: '\HTTP Service Request Queues(service.mydomain.com)\CacheHitRate'
    - instance_label: "service.mydomain.com.current.queue.size"
      instance_name: "service.mydomain.com"
      measurement_label: "current.queue.size"
      query: '\HTTP Service Request Queues(service.mydomain.com)\CurrentQueueSize'
    - instance_label: "service.mydomain.com.max.queue.item.age"
      instance_name: "service.mydomain.com"
      measurement_label: "max.queue.item.age"
      query: '\HTTP Service Request Queues(service.mydomain.com)\MaxQueueItemAge'
    - instance_label: "service.mydomain.com.rejected.requests"
      instance_name: "service.mydomain.com"
      measurement_label: "rejected.requests"
      query: '\HTTP Service Request Queues(service.mydomain.com)\RejectedRequests'
    - instance_label: "service.mydomain.com.rejected.rate"
      instance_name: "service.mydomain.com"
      measurement_label: "rejected.rate"
      query: '\HTTP Service Request Queues(service.mydomain.com)\RejectionRate'
    - instance_label: "service.mydomain.com.number.of.active.connectionpoolgroups"
      instance_name: "service.mydomain.com"
      measurement_label: "number.of.active.connectionpoolgroups"
      query: '\.NET Data Provider for SqlServer(_LM_W3SVC_5*)\NumberOfActiveConnectionPoolGroups'
    - instance_label: "service.mydomain.com.number.of.active.connectionpools"
      instance_name: "service.mydomain.com"
      measurement_label: "number.of.active.connectionpools"
      query: '\.NET Data Provider for SqlServer(_LM_W3SVC_5*)\NumberOfActiveConnectionPools'
    - instance_label: "service.mydomain.com.number.of.pooled.connections"
      instance_name: "service.mydomain.com"
      measurement_label: "number.of.pooled.connections"
      query: '\.NET Data Provider for SqlServer(_LM_W3SVC_5*)\NumberOfPooledConnections'
    - instance_label: "service.mydomain.com.number.of.reclaimed.connections"
      instance_name: "service.mydomain.com"
      measurement_label: "number.of.reclaimed.connections"
      query: '\.NET Data Provider for SqlServer(_LM_W3SVC_5*)\NumberOfReclaimedConnections'

output section

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["https://myurltoElasticCloud:9243"]
  index: "my-index-%{+yyyy.MM.dd}"

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  username: "elastic"
  password: "changeme"

setup.template.name: "my-index-*"
setup.template.pattern: "my-index-*"

Note: I need to post the ES document results in a second post.

Thanks in advance for any help.

Cheers,
Dirk

lueneburger · January 10, 2018, 10:11am

The Elasticsearch documents that I received before (6.0.0-Alpha2) and with 6.0.1:

6.0.0-Alpha2

{
  "_index": "my-index-2018.01.02",
  "_type": "doc",
  "_id": "xx",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-01-02T22:59:56.184Z",
    "beat": {
      "hostname": "myserver00",
      "name": "myserver00",
      "version": "6.0.0-alpha2"
    },
    "metricset": {
      "module": "windows",
      "name": "perfmon",
      "rtt": 2000
    },
    "tags": [
      "myserver00_service.mydomain.com"
    ],
    "windows": {
      "perfmon": {
        "arrival": {
          "rate": x
        },
        "cache": {
          "api": {
            "entries": x,
            "hit": {
              "ratio": x
            },
            "hits": x,
            "misses": x,
            "trims": x,
            "turnover": {
              "rate": x
            }
          },
          "hit": {
            "rate": x
          },
          "total": {
            "entries": x,
            "hit": {
              "ratio": x
            },
            "hits": x,
            "misses": x,
            "trims": x,
            "turnover": {
              "rate": x
            }
          }
        },
        "current": {
          "queue": {
            "size": x
          }
        },
        "max": {
          "queue": {
            "item": {}
          }
        },
        "number": {
          "of": {
            "active": {
              "connectionpoolgroups": x,
              "connectionpools": x
            },
            "pooled": {
              "connections": x
            }
          },
          "reclaimed": {
            "connections": x
          }
        },
        "rejected": {
          "rate": x,
          "requests": x
        }
      }
    }
  }
}

6.0.1

{
  "_index": "my-index-2018.01.10",
  "_type": "doc",
  "_id": "xx",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-01-10T09:54:31.384Z",
    "windows": {
      "perfmon": {
        "service": {
          "myservice.mydomain.com": {
            "com": {
              "rejected": {
                "rate": "service.mydomain.com"
              }
            }
          }
        },
        "rejected": {
          "rate": x
        }
      }
    },
    "metricset": {
      "module": "windows",
      "name": "perfmon",
      "rtt": 1002
    },
    "tags": [
      "myserver00_service.mydomain.com"
    ],
    "beat": {
      "name": "myserver00",
      "hostname": "myserver00",
      "version": "6.0.1"
    }
  }
}
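
In the 6.0.0-Alpha2 document above, each dotted measurement_label from the config (for example cache.api.entries) shows up as nested objects under windows.perfmon. The following minimal Python sketch shows that kind of dotted-key expansion. It is only an illustration of the resulting shape, not Metricbeat's actual code, and the function name and sample values are made up.

```python
def expand_dotted(flat: dict) -> dict:
    """Turn {"a.b.c": 1} into {"a": {"b": {"c": 1}}} (hypothetical helper)."""
    nested: dict = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Reuse an existing object for this path segment or create one.
            # Assumes no label is both a leaf and a prefix of another label.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Sample values are placeholders, not real counter readings.
perfmon = expand_dotted({
    "cache.api.entries": 1,
    "cache.api.hits": 2,
    "rejected.rate": 3,
})
print(perfmon)
```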

mazoutte · January 11, 2018, 10:42am

Hello,

I'm also interested in a solution that keeps the Metricbeat agent.
I have the same issue with the latest version, 6.1.1.

Too many documents are created, which is terrible for disk space. And obviously, 1 document with all counters is better than all these separate documents (1 per counter).

And when we compare the timestamps of these documents, they are always the same...

A workaround is possible if you use Filebeat + the built-in Windows perfmon writing to CSV + a Logstash parser.
You need to restart the Data Collector Set on a schedule (with a dedicated scheduled task) so that it overwrites the CSV file.

The main problem with this solution is deploying the perfmon "package" and maintaining the configuration of the collector set.
If you have a few servers, this can be a good workaround.
If you have more than a hundred servers, this is difficult to maintain if you plan to change the config regularly.

lueneburger · January 11, 2018, 11:48am

Hi @mazoutte,

Thanks for the workaround, but then I would rather stay with 6.0.0-alpha2 for now.

But it would be great to get some more information about this strange behavior of creating a separate document for every counter.

andrewkroh · January 11, 2018, 9:52pm

I think this is related to the wildcard query changes. See the discussion in https://github.com/elastic/beats/pull/4502#issuecomment-308565638.

Maybe there is some middle ground where we can combine data for non-wildcard queries.
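
One way to picture that middle ground is the minimal Python sketch below. It assumes each counter result arrives as a (timestamp, query, fields) tuple, which is a hypothetical shape chosen for illustration and not Metricbeat's internal representation. Results from non-wildcard queries that share a timestamp are merged into one event, while wildcard results stay separate.

```python
import copy


def deep_merge(dst: dict, src: dict) -> dict:
    """Recursively merge src into dst, keeping sibling keys from both."""
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dst.get(key), dict):
            deep_merge(dst[key], value)
        else:
            dst[key] = copy.deepcopy(value)
    return dst


def group_events(results):
    """Combine non-wildcard results per timestamp; keep wildcard ones apart."""
    grouped = {}
    separate = []
    for timestamp, query, fields in results:
        if "*" in query:
            # Wildcard query: one event per result, as in 6.0.1.
            separate.append({"@timestamp": timestamp,
                             "windows": {"perfmon": copy.deepcopy(fields)}})
        else:
            event = grouped.setdefault(
                timestamp, {"@timestamp": timestamp, "windows": {"perfmon": {}}})
            deep_merge(event["windows"]["perfmon"], fields)
    return list(grouped.values()) + separate


# Placeholder values; queries taken from the config earlier in this thread.
results = [
    ("t1", r"\ASP.NET Apps (service)\Cache API Hits",
     {"cache": {"api": {"hits": 1}}}),
    ("t1", r"\ASP.NET Apps (service)\Cache API Misses",
     {"cache": {"api": {"misses": 2}}}),
    ("t1", r"\.NET Data Provider for SqlServer(_LM_W3SVC_5*)\NumberOfPooledConnections",
     {"number": {"of": {"pooled": {"connections": 3}}}}),
]
events = group_events(results)
print(len(events))
```

With these three sample results, the two ASP.NET counters end up in one event and the wildcard SqlServer counter stays in its own event.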

lueneburger · January 12, 2018, 8:50am

I'm confused that no one else is complaining about this (OK, mazoutte did).

Because that's a big change in the output compared to 6.0.0-alpha2.

If there is no other solution, then I will stay on 6.0.0-alpha2. Not the best way, but better than the unneeded growth in index space.

system · February 9, 2018, 8:50am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

andrewkroh · March 16, 2018, 9:57pm

I just wrote a proposal for grouping counters related to an object into a single event. Comments are welcome on this issue.
