Filebeat monitoring metrics are "dropped" when a GEOIP pipeline is used

Hi, Stephen!
Thank you for looking into this!

I'm totally open to writing the pipeline correctly. It is much better to use a correct and supported approach going forward, for many reasons.

At the moment, I cannot ingest events via Filebeat into the ES8 cluster due to other issues (misconfigured index templates, which I'm trying to resolve with our IT support, as I don't have admin rights to this cluster).

To work around this, I'm frantically trying to get my local setup working :joy: and I am almost there, except that my local Filebeat does not want to talk to the local ES8. I posted a separate question about that: "Filebeat on local laptop does not talk to the Elasticsearch (also on local laptop) - dial tcp [::1]:9200: connect: cannot assign requested address".

But in the meantime, I can get sample docs from another, working ES7 cluster. Here is an input event that Filebeat usually gets from PubSub:

{
   "event_uuid":"m_id_1020_2",
   "logstash_id":"m_id_1020_2",
   "cid":"12345",
   "event_timestamp_millis":"1666285498000",
   "activity_date":"2022-10-20",
   "remote_ip":"165.155.130.139",
   "user_agent":"Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
   "referer":"https://www.my.site2.com/",
   "ref_param":"https://www.nyt.com",
   "request_status":"500",
   "request_method":"POST",
   "request_size":"52",
   "response_size":"124",
   "latency":"1.3"
}

and a corresponding fully indexed event in ES after it runs through the GEOIP pipeline:

GET ibc-parsed-logs/_search
{
  "query": {
    "term": {
      "message.logstash_id": {
        "value": "m_id_1020_2"
      }
    }
  }
}

Results:
{
  "took" : 563,
  "timed_out" : false,
  "_shards" : {
    "total" : 480,
    "successful" : 480,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "ibc-parsed-logs-2022.10.20-000547",
        "_type" : "_doc",
        "_id" : "m_id_1020_2",
        "_score" : 0.2876821,
        "_source" : {
          "input" : {
            "type" : "gcp-pubsub"
          },
          "agent" : {
            "hostname" : "0b19a09b8d1d",
            "name" : "0b19a09b8d1d",
            "id" : "284cf4e5-05fb-4fe2-a93e-c83f25247f30",
            "type" : "filebeat",
            "ephemeral_id" : "31723f5a-17bf-43be-a33e-68ec3fad1f79",
            "version" : "7.15.0"
          },
          "@timestamp" : "2022-10-20T17:35:43.159Z",
          "ecs" : {
            "version" : "1.11.0"
          },
          "host" : {
            "hostname" : "0b19a09b8d1d",
            "os" : {
              "kernel" : "5.10.47-linuxkit",
              "codename" : "Core",
              "name" : "CentOS Linux",
              "type" : "linux",
              "family" : "redhat",
              "version" : "7 (Core)",
              "platform" : "centos"
            },
            "containerized" : true,
            "ip" : [
              "172.17.0.2"
            ],
            "name" : "0b19a09b8d1d",
            "id" : "38c2fd0d69ba05ae64d8a4d4fc156791",
            "mac" : [
              "02:42:ac:11:00:02"
            ],
            "architecture" : "x86_64"
          },
          "event" : {
            "created" : "2022-10-20T17:35:44.178Z",
            "id" : "59279bf715-5952995738261221"
          },
          "message" : {
            "request_status" : "500",
            "referer" : "https://www.my.site2.com/",
            "ref_param" : "https://www.nyt.com",
            "remote_ip_geo" : {
              "continent_name" : "North America",
              "region_iso_code" : "US-NY",
              "city_name" : "The Bronx",
              "country_iso_code" : "US",
              "country_name" : "United States",
              "region_name" : "New York",
              "location" : {
                "lon" : -73.8616,
                "lat" : 40.847
              }
            },
            "latency" : "1.3",
            "logstash_id" : "m_id_1020_2",
            "activity_date" : "2022-10-20",
            "request_method" : "POST",
            "response_size" : "124",
            "remote_ip" : "165.155.130.139",
            "event_timestamp_millis" : "1666285498000",
            "request_size" : "52",
            "user_agent" : "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "cid" : "12345"
          }
        }
      }
    ]
  }
}


I have also created one more version of filebeat.yml with the GEOIP pipeline disabled:

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  enabled: true
  index: "ibc-parsed-logs"
  #parameters.pipeline: "geoip-info"
  hosts: ${ES_HOSTS}
  protocol: "https"
  # Authentication credentials - either API key or username/password.
  api_key: ${ES_API_KEY}

and sent the same event (with a different ID). Here is the full indexed event in ES7:

{
  "took" : 216,
  "timed_out" : false,
  "_shards" : {
    "total" : 480,
    "successful" : 480,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6931471,
    "hits" : [
      {
        "_index" : "ibc-parsed-logs-2022.10.20-000547",
        "_type" : "_doc",
        "_id" : "m_id_1020_3",
        "_score" : 0.6931471,
        "_source" : {
          "@timestamp" : "2022-10-20T17:44:11.891Z",
          "agent" : {
            "type" : "filebeat",
            "version" : "7.15.0",
            "hostname" : "8f211a5386b7",
            "ephemeral_id" : "bc5d2130-605a-42ad-960b-a030c4efec8b",
            "id" : "711a2bee-f3ed-407d-82dc-0d5ec5a0b145",
            "name" : "8f211a5386b7"
          },
          "ecs" : {
            "version" : "1.11.0"
          },
          "event" : {
            "id" : "59279bf715-5953168694008103",
            "created" : "2022-10-20T17:44:12.924Z"
          },
          "message" : {
            "activity_date" : "2022-10-20",
            "request_method" : "POST",
            "request_size" : "52",
            "latency" : "1.3",
            "remote_ip" : "165.155.130.139",
            "user_agent" : "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "ref_param" : "https://www.nyt.com",
            "response_size" : "124",
            "logstash_id" : "m_id_1020_3",
            "event_timestamp_millis" : "1666285498000",
            "referer" : "https://www.my.site2.com/",
            "request_status" : "500",
            "cid" : "12345"
          },
          "input" : {
            "type" : "gcp-pubsub"
          },
          "host" : {
            "name" : "8f211a5386b7",
            "hostname" : "8f211a5386b7",
            "architecture" : "x86_64",
            "os" : {
              "kernel" : "5.10.47-linuxkit",
              "codename" : "Core",
              "type" : "linux",
              "platform" : "centos",
              "version" : "7 (Core)",
              "family" : "redhat",
              "name" : "CentOS Linux"
            },
            "id" : "38c2fd0d69ba05ae64d8a4d4fc156791",
            "containerized" : true,
            "ip" : [
              "172.17.0.2"
            ],
            "mac" : [
              "02:42:ac:11:00:02"
            ]
          }
        }
      }
    ]
  }
}

Not sure if this is helpful. If not, I will repeat the experiment in my local ES8 + Filebeat 8 setup once I get Filebeat talking to ES8!
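
In the meantime, the pipeline itself can probably be checked without Filebeat through the ingest simulate API. A minimal sketch, assuming the pipeline is named `geoip-info` (as in the commented-out setting above) and that it reads `message.remote_ip`; the document below is just the relevant part of the sample event:

```
POST _ingest/pipeline/geoip-info/_simulate
{
  "docs": [
    {
      "_source": {
        "message": {
          "remote_ip": "165.155.130.139"
        }
      }
    }
  ]
}
```

If the pipeline behaves as it does on the ES7 cluster, the simulated document should come back with a `message.remote_ip_geo` object like the one in the first result.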

Thank you!!!