State_service metricset fails on headless services

I was going to open a GitHub issue, but the template told me to open a Discuss thread first. I didn't find anything similar in Discuss or on GitHub for this topic. It is easy to reproduce by running the kubernetes module with ECK deployed (ECK uses headless services by default). I'm not sure what behavior makes sense here, though.


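The kubernetes module's state_service metricset fails to index headless Kubernetes services, which explicitly set clusterIP to None. The field kubernetes.service.cluster_ip is mapped as type ip, so Elasticsearch rejects the document when the value is the string 'None'.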

This is the relevant part of the error:

{"type":"mapper_parsing_exception","reason":"failed to parse field [kubernetes.service.cluster_ip] of type [ip] in document with id 'yf_HN3EBEveQZJtR4sNf'. Preview of field's value: 'None'","caused_by":{"type":"illegal_argument_exception","reason":"'None' is not an IP string literal.

And the full error:

{"level":"warn","timestamp":"2020-04-01T22:06:18.853Z","caller":"elasticsearch/client.go:517","message":"Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Time{wall:0xbf9961f66a396c38, ext:118425703920, loc:(*time.Location)(0x7eb3060)}, Meta:null, Fields:{\"agent\":{\"ephemeral_id\":\"655841f1-0966-41c1-822a-c6ea9af7fdc8\",\"hostname\":\"kube-elastic-metricbeat-7dd4c74c74-jz25z\",\"id\":\"7b3c4cbc-103d-4684-8ec6-4255fa2cd7c5\",\"type\":\"metricbeat\",\"version\":\"7.6.1\"},\"cloud\":{\"availability_zone\":\"europe-west1-d\",\"instance\":{\"id\":\"1037830539447785865\",\"name\":\"gke-sabo-dev-cluster-default-pool-617f5774-gl30\"},\"machine\":{\"type\":\"n1-standard-8\"},\"project\":{\"id\":\"elastic-cloud-dev\"},\"provider\":\"gcp\"},\"ecs\":{\"version\":\"1.4.0\"},\"event\":{\"dataset\":\"kubernetes.service\",\"duration\":29916059,\"module\":\"kubernetes\"},\"host\":{\"name\":\"kube-elastic-metricbeat-7dd4c74c74-jz25z\"},\"kubernetes\":{\"labels\":{\"common_k8s_elastic_co_type\":\"elasticsearch\",\"elasticsearch_k8s_elastic_co_cluster_name\":\"kube-elastic-monitor\",\"elasticsearch_k8s_elastic_co_statefulset_name\":\"kube-elastic-monitor-es-default\"},\"namespace\":\"default\",\"service\":{\"cluster_ip\":\"None\",\"created\":\"2020-04-01T15:22:51.000Z\",\"name\":\"kube-elastic-monitor-es-default\",\"type\":\"ClusterIP\"}},\"metricset\":{\"name\":\"state_service\",\"period\":10000},\"service\":{\"address\":\"kube-elastic-kube-state-metrics.default:8080\",\"type\":\"kubernetes\"}}, Private:interface {}(nil), TimeSeries:true}, Flags:0x0, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {\"type\":\"mapper_parsing_exception\",\"reason\":\"failed to parse field [kubernetes.service.cluster_ip] of type [ip] in document with id 'yf_HN3EBEveQZJtR4sNf'. Preview of field's value: 'None'\",\"caused_by\":{\"type\":\"illegal_argument_exception\",\"reason\":\"'None' is not an IP string literal.\"}}"}

And the YAML of a service that causes the error:

$ kubectl get svc kube-elastic-monitor-es-default -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2020-04-01T15:22:51Z"
  labels:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: kube-elastic-monitor
    elasticsearch.k8s.elastic.co/statefulset-name: kube-elastic-monitor-es-default
  name: kube-elastic-monitor-es-default
  namespace: default
  ownerReferences:
  - apiVersion: elasticsearch.k8s.elastic.co/v1
    blockOwnerDeletion: true
    controller: true
    kind: Elasticsearch
    name: kube-elastic-monitor
    uid: 47ba3d61-9c4d-479b-9689-4bb184aa7541
  resourceVersion: "2612356"
  selfLink: /api/v1/namespaces/default/services/kube-elastic-monitor-es-default
  uid: ff1c86e3-f00a-423b-9bbb-dfc8df613ee1
spec:
  clusterIP: None
  selector:
    common.k8s.elastic.co/type: elasticsearch
    elasticsearch.k8s.elastic.co/cluster-name: kube-elastic-monitor
    elasticsearch.k8s.elastic.co/statefulset-name: kube-elastic-monitor-es-default
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
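
To illustrate one possible behavior (I don't know what Metricbeat should actually do, so treat this as a sketch and not the module's implementation), the event could simply omit cluster_ip whenever the value is not an IP literal. The function names `is_ip_literal` and `drop_invalid_cluster_ip` are made up for this example, and the event dict is a trimmed copy of the one in the log above:

```python
import copy
import ipaddress


def is_ip_literal(value):
    """Return True only if value is a string that parses as an IPv4 or IPv6 address."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def drop_invalid_cluster_ip(event):
    """Return a copy of the event without kubernetes.service.cluster_ip
    when that value is not an IP literal (e.g. 'None' for headless services).

    Hypothetical helper for illustration; not part of Metricbeat.
    """
    cleaned = copy.deepcopy(event)
    service = cleaned.get("kubernetes", {}).get("service", {})
    if "cluster_ip" in service and not is_ip_literal(service["cluster_ip"]):
        del service["cluster_ip"]
    return cleaned


if __name__ == "__main__":
    # Trimmed version of the failing event from the log above.
    headless_event = {
        "kubernetes": {
            "namespace": "default",
            "service": {
                "cluster_ip": "None",
                "name": "kube-elastic-monitor-es-default",
                "type": "ClusterIP",
            },
        }
    }
    print(drop_invalid_cluster_ip(headless_event))
```

With something like this, a headless service would still be indexed with its name and type, just without a cluster_ip value. Whether dropping the field is preferable to some other representation is exactly the open question.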

Hey @Anya_Sabo!

Thanks for reporting this! I think this is a corner case we should handle! Could you open a bug issue for this please?

Done, thanks: