Filebeat monitoring metrics are "dropped" when a GEOIP pipeline is used

Hi, this question follows from solving one mystery in this post: Filebeat monitoring metrics not visible in ElasticSearch - #30 by stephenb (huge thank you to @stephenb for his help!). There, Filebeat monitoring metrics were not being indexed into Elasticsearch if a GEOIP processing pipeline was enabled in the filebeat.yml config.

So the goal of this post is to solve the follow-up mystery: why does enabling a GEOIP pipeline cause all monitoring events from Filebeat to be dropped/not indexed into ES?

Here is the geoip pipeline I have created in ES:

PUT _ingest/pipeline/geoip-info
{
  "description": "Add geoip info",
  "processors": [
    {
      "geoip": {
        "field": "message.remote_ip",
        "target_field": "message.remote_ip_geo",
        "ignore_missing": true
      }
    }
  ]
}
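
To make the pipeline's field handling concrete, here is a minimal Python sketch of how a dotted path such as `message.remote_ip` is looked up in a nested document, and what `ignore_missing: true` implies: when the field is absent, the processor quietly does nothing instead of failing. The helpers `resolve_field` and `geoip_would_run` are hypothetical names that only model this behaviour; they are not part of Elasticsearch or Filebeat. The second sample assumes a case where `message` was never decoded into an object.

```python
from typing import Any, Optional


def resolve_field(doc: dict, path: str) -> Optional[Any]:
    """Walk a dotted path through nested dicts; return None if any step is missing."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def geoip_would_run(doc: dict, field: str = "message.remote_ip") -> bool:
    """Model of ignore_missing: true. The processor is skipped when the field is absent."""
    return resolve_field(doc, field) is not None


# message already decoded into an object (what decode_json_fields is meant to produce)
decoded = {"message": {"remote_ip": "165.155.130.139"}}
# assumed failure case: message is still a raw JSON string
undecoded = {"message": "{\"remote_ip\": \"165.155.130.139\"}"}

print(geoip_would_run(decoded))
print(geoip_would_run(undecoded))
```

With `ignore_missing: true`, a document in the second shape would pass through the pipeline unchanged and without an error, so a missing geo field does not by itself prove the pipeline was skipped.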

This is the Filebeat config I'm using to process events from GCP PubSub, enrich them with geo info using the geoip pipeline, and push them into ES (this part of event processing works just fine!), while also sending Filebeat monitoring metrics to the same ES cluster:

###################### Filebeat Configuration Example #########################

queue.mem:
  events: 4096
  flush.min_events: 2048
  flush.timeout: 1s

# ============================== Filebeat inputs ===============================

filebeat.inputs:
- type: gcp-pubsub
  enabled: true
  project_id: ${PROJECT_ID}
  topic: ${PUBSUB_INPUT_TOPIC}
  subscription.name: ${SUBSCRIPTION_NAME}
  fields_under_root: true

# ======================= Elasticsearch template setting =======================
setup.template.name: "ibc-parsed-logs"
setup.template.pattern: "ibc-parsed-logs-*"
setup.template.json.enabled: true
setup.template.json.path: "ibc_es_template.json"
setup.template.json.name: "ibc-parsed-logs-template"
setup.template.enabled: true
setup.ilm.enabled: false

# =============================== Elastic Cloud ================================
# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
# ENG IBC ES
#cloud.id: '${CLOUD_ID}'

# ================================== Outputs ===================================
output.console:
  enabled: false
  pretty: true

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  enabled: true
  index: "ibc-parsed-logs"
  #parameters.pipeline: "geoip-info"
  hosts: ${ES_HOSTS}
  protocol: "https"
  api_key: ${ES_API_KEY}

# ============================= X-Pack Monitoring ==============================
monitoring.enabled: true
monitoring.cluster_uuid: "9PxnN-9Pxxx"

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - decode_json_fields:
      fields: ["message"]
      add_error_key: true
      document_id: "event_uuid"

# ================================== Logging ===================================
logging.metrics.enabled: true
logging.enabled: true
logging.level: debug
logging.to_files: true
logging.files:
  path: /usr/share/filebeat/f_logs
  name: filebeat
  keepfiles: 10
  permissions: 0640

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publisher", "service".
logging.selectors: ["*"]

Now, IF the geoip pipeline line is commented out in the filebeat.yml:

 #parameters.pipeline: "geoip-info"

then in the Filebeat logs I can see that the monitoring events are indeed sent to ES:

{"log.level":"debug","@timestamp":"2022-10-19T18:18:22.649Z","log.logger":"monitoring","log.origin":{"file.name":"elasticsearch/client.go","file.line":99},"message":"XPack monitoring is enabled","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-19T18:18:22.649Z","log.logger":"monitoring","log.origin":{"file.name":"elasticsearch/elasticsearch.go","file.line":234},"message":"Successfully connected to X-Pack Monitoring endpoint.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-19T18:18:22.650Z","log.logger":"monitoring","log.origin":{"file.name":"elasticsearch/elasticsearch.go","file.line":240},"message":"Finish monitoring endpoint init loop.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-19T18:18:22.650Z","log.logger":"monitoring","log.origin":{"file.name":"elasticsearch/elasticsearch.go","file.line":248},"message":"Start monitoring state metrics snapshot loop with period 1m0s.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-19T18:18:22.650Z","log.logger":"monitoring","log.origin":{"file.name":"elasticsearch/elasticsearch.go","file.line":248},"message":"Start monitoring stats metrics snapshot loop with period 10s.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-19T18:18:32.433Z","log.logger":"input","log.origin":{"file.name":"input/input.go","file.line":137},"message":"Run input","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-19T18:18:32.625Z","log.logger":"monitoring","log.origin":{"file.name":"processing/processors.go","file.line":210},"message":"Publish event: {\n  \"@timestamp\": \"2022-10-19T18:18:32.618Z\",\n  \"@metadata\": {\n    \"beat\": \"filebeat\",\n    \"type\": \"_doc\",\n    \"version\": \"8.4.3\",\n    \"interval_ms\": 10000,\n    \"params\": {\n      \"interval\": \"10s\"\n    },\n    \"cluster_uuid\": \"9PxnN-9Pxxx\",\n    \"type\": \"beats_stats\"\n  },\n  \"beat\": {\n    \"version\": \"8.4.3\",\n    \"name\": \"54a5f02bc902\",\n    \"host\": \"54a5f02bc902\",\n    \"uuid\": \"c8eccb7e-f287-4e23-8f58-4af91ccb1a8d\",\n    \"type\": \"filebeat\"\n  },\n  \"metrics\": {\n    \"beat\": {\n      \"cpu\": {\n        \"total\": {\n          \"value\": 430,\n          \"ticks\": 430,\n          \"time\": {\n            \"ms\": 430\n          }\n        },\n        \"user\": {\n          \"time\": {\n            \"ms\": 250\n          },\n          \"ticks\": 250\n        },\n        \"system\": {\n          \"ticks\": 180,\n          \"time\": {\n            \"ms\": 180\n          }\n        }\n      },\n      \"runtime\": {\n        \"goroutines\": 76\n      },\n      \"info\": {\n        \"uptime\": {\n          \"ms\": 10343\n        },\n        \"ephemeral_id\": \"74839794-ee85-4d0a-8e4a-580ccb19df7a\",\n        \"name\": \"filebeat\",\n        \"version\": \"8.4.3\"\n      },\n      \"cgroup\": {\n        \"cpuacct\": {\n          \"total\": {\n            \"ns\": 495299162\n          },\n          \"id\": \"/\"\n        },\n        \"memory\": {\n          \"mem\": {\n            \"limit\": {\n              \"bytes\": 9223372036854771712\n            },\n            \"usage\": {\n              \"bytes\": 51306496\n            }\n          },\n          \"id\": \"/\"\n        },\n        \"cpu\": {\n          \"id\": \"/\",\n          \"cfs\": {\n            \"period\": {\n              \"us\": 100000\n            },\n         
   \"quota\": {\n              \"us\": 0\n            }\n          },\n          \"stats\": {\n            \"periods\": 0,\n            \"throttled\": {\n              \"periods\": 0,\n              \"ns\": 0\n            }\n          }\n        }\n      },\n      \"handles\": {\n        \"limit\": {\n          \"hard\": 1048576,\n          \"soft\": 1048576\n        },\n        \"open\": 20\n      },\n      \"memstats\": {\n        \"gc_next\": 18153696,\n        \"rss\": 137138176,\n        \"memory_total\": 58737312,\n        \"memory_alloc\": 13683544,\n        \"memory_sys\": 34423816\n      }\n    },\n    \"system\": {\n      \"cpu\": {\n        \"cores\": 8\n      },\n      \"load\": {\n        \"1\": 0.02,\n        \"5\": 0.03,\n        \"15\": 0,\n        \"norm\": {\n          \"5\": 0.0038,\n          \"15\": 0,\n          \"1\": 0.0025\n        }\n      }\n    },\n    \"registrar\": {\n      \"states\": {\n        \"current\": 0,\n        \"update\": 0,\n        \"cleanup\": 0\n      },\n      \"writes\": {\n        \"success\": 0,\n        \"total\": 0,\n        \"fail\": 0\n      }\n    },\n    \"filebeat\": {\n      \"events\": {\n        \"active\": 0,\n        \"added\": 0,\n        \"done\": 0\n      },\n      \"harvester\": {\n        \"running\": 0,\n        \"open_files\": 0,\n        \"skipped\": 0,\n        \"started\": 0,\n        \"closed\": 0\n      },\n      \"input\": {\n        \"netflow\": {\n          \"packets\": {\n            \"received\": 0,\n            \"dropped\": 0\n          },\n          \"flows\": 0\n        },\n        \"log\": {\n          \"files\": {\n            \"renamed\": 0,\n            \"truncated\": 0\n          }\n        }\n      }\n    },\n    \"libbeat\": {\n      \"config\": {\n        \"scans\": 0,\n        \"reloads\": 0,\n        \"module\": {\n          \"starts\": 0,\n          \"stops\": 0,\n          \"running\": 0\n        }\n      },\n      \"output\": {\n        \"events\": {\n          
\"batches\": 0,\n          \"total\": 0,\n          \"acked\": 0,\n          \"failed\": 0,\n          \"dropped\": 0,\n          \"duplicates\": 0,\n          \"active\": 0,\n          \"toomany\": 0\n        },\n        \"write\": {\n          \"errors\": 0,\n          \"bytes\": 0\n        },\n        \"read\": {\n          \"bytes\": 0,\n          \"errors\": 0\n        },\n        \"type\": \"elasticsearch\"\n      },\n      \"pipeline\": {\n        \"queue\": {\n          \"acked\": 0,\n          \"max_events\": 4096\n        },\n        \"clients\": 1,\n        \"events\": {\n          \"filtered\": 0,\n          \"published\": 0,\n          \"failed\": 0,\n          \"dropped\": 0,\n          \"retry\": 0,\n          \"active\": 0,\n          \"total\": 0\n        }\n      }\n    }\n  }\n}","service.name":"filebeat","ecs.version":"1.6.0"}

and I can see the corresponding .monitoring-beats-xxx index in ES with the monitoring events:

GET /_cat/indices/*monitoring*
results:
green open .monitoring-es-7-2022.10.18                   ZSLCzJRBRiS2r4qihIw71w 1 1    385 34 690.3kb 347.2kb
green open .ds-.monitoring-kibana-8-mb-2022.10.18-000001 Xvq_P_9NRiKy3hYxPwBwmQ 1 1  43220  0  22.7mb  11.3mb
green open .monitoring-beats-7-2022.10.19                3HuDjP9NTFeBRjI5ElGkVQ 1 1     84  0 798.9kb 386.7kb
green open .ds-.monitoring-es-8-mb-2022.10.18-000001     2H4hLmyGS8q2-EQfbuynQQ 1 1 318236  0 382.2mb 190.4mb
green open .monitoring-kibana-7-2022.10.18               4kofsQNNTzylE-YWoEMYXg 1 1     76  0 371.8kb 165.7kb


GET .monitoring-beats-7-2022.10.19/_search
result:
  "hits": {
    "total": {
      "value": 88,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": ".monitoring-beats-7-2022.10.19",
        "_id": "Fkd48YMBgEVm2LhbFsxT",
        "_score": 1,
        "_source": {
          "timestamp": "2022-10-19T18:18:42.617Z",
          "interval_ms": 10000,
          "cluster_uuid": "9PxnN-9Pxxx",
          "type": "beats_stats",
          "beats_stats": {
            "beat": {
              "uuid": "c8eccb7e-f287-4e23-8f58-4af91ccb1a8d",
              "type": "filebeat",
              "version": "8.4.3",
              "name": "54a5f02bc902",
              "host": "54a5f02bc902"
            },
            "metrics": {
              "beat": {
                "cgroup": {
                  "cpu": {
                    "id": "/",
                    "cfs": {
                      "period": {
                        "us": 100000
 

BUT, if I enable the geoip pipeline (uncomment the line), the Filebeat logs show the same monitoring events being sent to ES, yet no events actually reach ES and no .monitoring-beats-xxx index is created...

Any idea why? :slight_smile:
Thank you!!!

@ppine7 We will get this solved, but I need a couple of tests and some data. I am confident we will fix this.

Let's not focus on the monitoring data in this thread. I have a theory about why that breaks, but I want to focus on getting geoip-info working; I think that if we get it working correctly, the monitoring will work as well.

In short, I have run hundreds of pipelines over the last several years and I have never defined them like this; that is not valid as h

#parameters.pipeline: "geoip-info"

In fact, I did not even know that was a thing, but I have now read about it a bit.

We need to do a couple of experiments; we will take them one at a time...

First, run Filebeat without the geoip-info pipeline and post here the _source of a couple of sample documents from Elasticsearch that would have run through the pipeline. I want to see what the source documents look like.

Then we will take the next steps.

Hi, Stephen!
Thank you for looking into this!

I'm totally open to try to write the pipeline correctly - much better to use a correct and supported approach going forward too - for many reasons...

At the moment, I cannot ingest events via Filebeat into the ES8 cluster due to other issues (misconfigured index templates - that I'm trying to resolve with our IT support, as I don't have admin rights to this cluster ....)

To overcome this, I'm frantically trying to get my local setup working :joy: - and I am almost there, except that my local Filebeat does not want to talk to the local ES8. I posted a separate question on that: Filebeat on local laptop does not talk to the Elasticsearch (also on local laptop) - dial tcp [::1]:9200: connect: cannot assign requested address

But, in the meantime, I can get samples of docs from another, running and working, ES7 cluster: I can give you an input event that the Filebeat usually gets from PubSub:

{
   "event_uuid":"m_id_1020_2",
   "logstash_id":"m_id_1020_2",
   "cid":"12345",
   "event_timestamp_millis":"1666285498000",
   "activity_date":"2022-10-20",
   "remote_ip":"165.155.130.139",
   "user_agent":"Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
   "referer":"https://www.my.site2.com/",
   "ref_param":"https://www.nyt.com",
   "request_status":"500",
   "request_method":"POST",
   "request_size":"52",
   "response_size":"124",
   "latency":"1.3"
}

and a corresponding fully indexed event in ES after it runs through the GEOIP pipeline:

GET ibc-parsed-logs/_search
{
  "query": {
    "term": {
      "message.logstash_id": {
        "value": "m_id_1020_2"
      }
    }
  }
}

Results:
{
  "took" : 563,
  "timed_out" : false,
  "_shards" : {
    "total" : 480,
    "successful" : 480,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "ibc-parsed-logs-2022.10.20-000547",
        "_type" : "_doc",
        "_id" : "m_id_1020_2",
        "_score" : 0.2876821,
        "_source" : {
          "input" : {
            "type" : "gcp-pubsub"
          },
          "agent" : {
            "hostname" : "0b19a09b8d1d",
            "name" : "0b19a09b8d1d",
            "id" : "284cf4e5-05fb-4fe2-a93e-c83f25247f30",
            "type" : "filebeat",
            "ephemeral_id" : "31723f5a-17bf-43be-a33e-68ec3fad1f79",
            "version" : "7.15.0"
          },
          "@timestamp" : "2022-10-20T17:35:43.159Z",
          "ecs" : {
            "version" : "1.11.0"
          },
          "host" : {
            "hostname" : "0b19a09b8d1d",
            "os" : {
              "kernel" : "5.10.47-linuxkit",
              "codename" : "Core",
              "name" : "CentOS Linux",
              "type" : "linux",
              "family" : "redhat",
              "version" : "7 (Core)",
              "platform" : "centos"
            },
            "containerized" : true,
            "ip" : [
              "172.17.0.2"
            ],
            "name" : "0b19a09b8d1d",
            "id" : "38c2fd0d69ba05ae64d8a4d4fc156791",
            "mac" : [
              "02:42:ac:11:00:02"
            ],
            "architecture" : "x86_64"
          },
          "event" : {
            "created" : "2022-10-20T17:35:44.178Z",
            "id" : "59279bf715-5952995738261221"
          },
          "message" : {
            "request_status" : "500",
            "referer" : "https://www.my.site2.com/",
            "ref_param" : "https://www.nyt.com",
            "remote_ip_geo" : {
              "continent_name" : "North America",
              "region_iso_code" : "US-NY",
              "city_name" : "The Bronx",
              "country_iso_code" : "US",
              "country_name" : "United States",
              "region_name" : "New York",
              "location" : {
                "lon" : -73.8616,
                "lat" : 40.847
              }
            },
            "latency" : "1.3",
            "logstash_id" : "m_id_1020_2",
            "activity_date" : "2022-10-20",
            "request_method" : "POST",
            "response_size" : "124",
            "remote_ip" : "165.155.130.139",
            "event_timestamp_millis" : "1666285498000",
            "request_size" : "52",
            "user_agent" : "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "cid" : "12345"
          }
        }
      }
    ]
  }
}


I have also created one more version of filebeat.yml with GEO pipeline disabled:

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  enabled: true
  index: "ibc-parsed-logs"
  #parameters.pipeline: "geoip-info"
  hosts: ${ES_HOSTS}
  protocol: "https"
  # Authentication credentials - either API key or username/password.
  api_key: ${ES_API_KEY}

and sent the same event (with a different ID). Here is the full indexed event in ES7:

{
  "took" : 216,
  "timed_out" : false,
  "_shards" : {
    "total" : 480,
    "successful" : 480,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.6931471,
    "hits" : [
      {
        "_index" : "ibc-parsed-logs-2022.10.20-000547",
        "_type" : "_doc",
        "_id" : "m_id_1020_3",
        "_score" : 0.6931471,
        "_source" : {
          "@timestamp" : "2022-10-20T17:44:11.891Z",
          "agent" : {
            "type" : "filebeat",
            "version" : "7.15.0",
            "hostname" : "8f211a5386b7",
            "ephemeral_id" : "bc5d2130-605a-42ad-960b-a030c4efec8b",
            "id" : "711a2bee-f3ed-407d-82dc-0d5ec5a0b145",
            "name" : "8f211a5386b7"
          },
          "ecs" : {
            "version" : "1.11.0"
          },
          "event" : {
            "id" : "59279bf715-5953168694008103",
            "created" : "2022-10-20T17:44:12.924Z"
          },
          "message" : {
            "activity_date" : "2022-10-20",
            "request_method" : "POST",
            "request_size" : "52",
            "latency" : "1.3",
            "remote_ip" : "165.155.130.139",
            "user_agent" : "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "ref_param" : "https://www.nyt.com",
            "response_size" : "124",
            "logstash_id" : "m_id_1020_3",
            "event_timestamp_millis" : "1666285498000",
            "referer" : "https://www.my.site2.com/",
            "request_status" : "500",
            "cid" : "12345"
          },
          "input" : {
            "type" : "gcp-pubsub"
          },
          "host" : {
            "name" : "8f211a5386b7",
            "hostname" : "8f211a5386b7",
            "architecture" : "x86_64",
            "os" : {
              "kernel" : "5.10.47-linuxkit",
              "codename" : "Core",
              "type" : "linux",
              "platform" : "centos",
              "version" : "7 (Core)",
              "family" : "redhat",
              "name" : "CentOS Linux"
            },
            "id" : "38c2fd0d69ba05ae64d8a4d4fc156791",
            "containerized" : true,
            "ip" : [
              "172.17.0.2"
            ],
            "mac" : [
              "02:42:ac:11:00:02"
            ]
          }
        }
      }
    ]
  }
}

Not sure if this is helpful - otherwise, I will repeat the experiment in my local ES8 + Filebeat8 setup once I get Filebeat talking to ES8!

Thank you!!!

Hi, @stephenb, now that I have my local setup fully working, I want to try to get the GEOIP pipeline set up correctly.

First, a big development: the actual problem with metrics not being pushed into our ES cluster when the GEOIP pipeline was enabled is SOLVED!
It turned out to be an error in the Terraform script that creates our ES cluster with all of its setup: it was creating the geoip pipeline with the wrong name!
It was creating it with the name "JSON" instead of "GEOIP"!
Once we finally caught this bug, even the current filebeat.yml with the "wrong" way of defining the pipeline started working! :slight_smile: Both Filebeat events and Filebeat metrics started flowing into ES beautifully :slight_smile:
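
A naming mismatch like this can be caught before Filebeat is started. The sketch below assumes you have already fetched the `GET _ingest/pipeline` response, which has one top-level key per pipeline id (the same shape as a `GET _ingest/pipeline/geoip-info` response), and compares it with the names referenced in filebeat.yml. The function name `missing_pipelines` and the placeholder id `wrong-name` are hypothetical, used only for illustration.

```python
from typing import Dict, List


def missing_pipelines(configured: List[str], existing: Dict[str, dict]) -> List[str]:
    """Return configured pipeline names that are not keys of the GET _ingest/pipeline response."""
    return [name for name in configured if name not in existing]


# Shape of a GET _ingest/pipeline response: one top-level key per pipeline id.
# "wrong-name" is a placeholder for whatever id the provisioning script really used.
response = {"wrong-name": {"description": "Add geoip info", "processors": []}}

print(missing_pipelines(["geoip-info"], response))
```

An empty list means every pipeline named in the config exists in the cluster; anything else is worth fixing before looking at Filebeat itself.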

Now, I still want to do it the "right" way as I hate dirty hacks that will end up being a maintenance nightmare later on ....
So here is the current filebeat.yml config:

###################### Filebeat Configuration Example #########################
queue.mem:
  events: 4096
  flush.min_events: 2048
  flush.timeout: 1s

# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: gcp-pubsub
  enabled: true
  project_id: ${PROJECT_ID}
  topic: ${PUBSUB_INPUT_TOPIC}
  subscription.name: ${SUBSCRIPTION_NAME}
  fields_under_root: true

# ======================= Elasticsearch template setting =======================
setup.template.name: "ibc-parsed-logs"
setup.template.pattern: "ibc-parsed-logs-*"
setup.template.json.enabled: true
setup.template.json.path: "ibc_es_template.json"
setup.template.json.name: "ibc-parsed-logs-template"
setup.template.enabled: true
setup.ilm.enabled: false

# ================================== Outputs ===================================
output.console:
  enabled: false
  pretty: true

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  enabled: true
  index: "ibc-parsed-logs"
  parameters.pipeline: "geoip-info"
  #hosts: ${ES_HOSTS}
  hosts: "http://localhost:9200"
  #protocol: "https"
  #api_key: ${ES_API_KEY}

# ============================= X-Pack Monitoring ==============================
monitoring.enabled: true
monitoring.cluster_uuid: ${MON_CLUSTER_UUID}

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - decode_json_fields:
      fields: ["message"]
      add_error_key: true
      document_id: "event_uuid"

How should I update it to define the pipeline in a supported/standard way?

Thank you!

Should be...

output.elasticsearch:
  enabled: true
  index: "ibc-parsed-logs"
  pipeline: "geoip-info"

This assumes the pipeline is named correctly.

Thank you, @stephenb.
Unfortunately, this config does not work...
Details:

With this pipeline:

GET _ingest/pipeline/geoip-info

{
  "geoip-info": {
    "description": "Add geoip info",
    "processors": [
      {
        "geoip": {
          "field": "message.remote_ip",
          "target_field": "message.remote_ip_geo",
          "ignore_missing": true
        }
      }
    ]
  }
}

and this type of input event:

{
   "event_uuid":"m_id_1024_3",
   "logstash_id":"m_id_1024_3",
   "cid":"12345",
   "event_timestamp_millis":"1666639334000",
   "activity_date":"2022-10-24",
   "remote_ip":"165.155.130.139",
   "user_agent":"Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
   "referer":"https://www.my.site1.com/",
   "ref_param":"https://www.nyt.com",
   "request_status":"500",
   "request_method":"POST",
   "request_size":"52",
   "response_size":"124",
   "latency":"1.3"
}

when I use the old config for the pipeline:

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  enabled: true
  index: "ibc-parsed-logs"
  parameters.pipeline: "geoip-info"
  hosts: "http://localhost:9200"

I get the following event in ES, with full geo IP info in the "remote_ip_geo" field:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.6931471,
    "hits": [
      {
        "_index": "ibc-parsed-logs-2022.10.24-000001",
        "_id": "m_id_1024_2",
        "_score": 0.6931471,
        "_source": {
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a",
            "type": "filebeat",
            "ephemeral_id": "dc517ee9-68b4-4e52-8720-88eccf8ff967",
            "version": "8.4.3"
          },
          "@timestamp": "2022-10-24T19:30:00.533Z",
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "hostname": "mac-lt2-mpopova.fios-router.home",
            "os": {
              "build": "21E258",
              "kernel": "21.4.0",
              "name": "macOS",
              "type": "macos",
              "family": "darwin",
              "version": "12.3.1",
              "platform": "darwin"
            },
            "ip": [
              "xxx", ...
            ],
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx443",
            "mac": [
              "xxx:5d"
...
            ],
            "architecture": "x86_64"
          },
          "event": {
            "created": "2022-10-24T19:30:01.726Z",
            "id": "59279bf715-5532412804884987"
          },
          "message": {
            "request_status": "500",
            "ref_param": "https://www.nyt.com",
            "referer": "https://www.my.site2.com/",
            "remote_ip_geo": {
              "continent_name": "North America",
              "region_iso_code": "US-NY",
              "city_name": "The Bronx",
              "country_iso_code": "US",
              "country_name": "United States",
              "region_name": "New York",
              "location": {
                "lon": -73.8616,
                "lat": 40.847
              }
            },
            "latency": "1.3",
            "activity_date": "2022-10-24",
            "logstash_id": "m_id_1024_2",
            "request_method": "POST",
            "response_size": "124",
            "remote_ip": "165.155.130.139",
            "event_timestamp_millis": "1666639334000",
            "request_size": "52",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "cid": "12345"
          }
        }
      }
    ]
  }
}

but when I change filebeat.yml to:

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  enabled: true
  index: "ibc-parsed-logs"
  pipeline: "geoip-info"
  hosts: "http://localhost:9200"

no geo field is being added to the event in ES:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.6931471,
    "hits": [
      {
        "_index": "ibc-parsed-logs-2022.10.24-000001",
        "_id": "m_id_1024_3",
        "_score": 0.6931471,
        "_source": {
          "@timestamp": "2022-10-24T20:21:26.956Z",
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "mac-lt2-mpopova.fios-router.home",
            "type": "filebeat",
            "version": "8.4.3",
            "ephemeral_id": "8f1c47c8-4a02-4e36-a4c9-fe8479ed7dae",
            "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a"
          },
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "os": {
              "type": "macos",
              "platform": "darwin",
              "version": "12.3.1",
              "family": "darwin",
              "name": "macOS",
              "kernel": "21.4.0",
              "build": "21E258"
            },
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx443",
            "ip": [
              ...
            ],
            "mac": [
              "82:cf:fe:c0:c4:00",
              "xxx",
              ...
            ],
            "hostname": "mac-lt2-mpopova.fios-router.home",
            "architecture": "x86_64"
          },
          "event": {
            "created": "2022-10-24T20:21:28.059Z",
            "id": "59279bf715-6046081929135344"
          },
          "message": {
            "cid": "12345",
            "remote_ip": "165.155.130.139",
            "request_status": "500",
            "event_timestamp_millis": "1666639334000",
            "activity_date": "2022-10-24",
            "request_method": "POST",
            "response_size": "124",
            "latency": "1.3",
            "logstash_id": "m_id_1024_3",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "referer": "https://www.my.site1.com/",
            "ref_param": "https://www.nyt.com",
            "request_size": "52"
          }
        }
      }
    ]
  }
}
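
When comparing the two configurations across more than one event, it can help to check a whole search response at once. This is a small sketch, assuming the response has the standard `hits.hits[]._source` layout shown above; `hits_without_geo` is a hypothetical helper name, and the sample responses are trimmed-down versions of the two results in this post.

```python
from typing import List, Tuple


def hits_without_geo(search_response: dict,
                     geo_path: Tuple[str, ...] = ("message", "remote_ip_geo")) -> List[str]:
    """Return the _id of every hit whose _source lacks the geo enrichment field."""
    missing = []
    for hit in search_response.get("hits", {}).get("hits", []):
        node = hit.get("_source", {})
        for key in geo_path:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if node is None:
            missing.append(hit.get("_id"))
    return missing


# Trimmed versions of the two search results above
enriched = {"hits": {"hits": [{"_id": "m_id_1024_2", "_source": {
    "message": {"remote_ip": "165.155.130.139",
                "remote_ip_geo": {"country_iso_code": "US"}}}}]}}
plain = {"hits": {"hits": [{"_id": "m_id_1024_3", "_source": {
    "message": {"remote_ip": "165.155.130.139"}}}]}}

print(hits_without_geo(enriched))
print(hits_without_geo(plain))
```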

thanks!

There is something else afoot; let's see if we can find it...

First, try to _simulate the ingest pipeline:

POST _ingest/pipeline/geoip-info/_simulate
{
  "docs":
  [
    {
      "_index": "ibc-parsed-logs-2022.10.24-000001",
      "_id": "m_id_1024_2",
      "_score": 0.6931471,
      "_source": {
        "input": {
          "type": "gcp-pubsub"
        },
        "agent": {
          "name": "mac-lt2-mpopova.fios-router.home",
          "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a",
          "type": "filebeat",
          "ephemeral_id": "dc517ee9-68b4-4e52-8720-88eccf8ff967",
          "version": "8.4.3"
        },
        "@timestamp": "2022-10-24T19:30:00.533Z",
        "ecs": {
          "version": "8.0.0"
        },
        "host": {
          "hostname": "mac-lt2-mpopova.fios-router.home",
          "os": {
            "build": "21E258",
            "kernel": "21.4.0",
            "name": "macOS",
            "type": "macos",
            "family": "darwin",
            "version": "12.3.1",
            "platform": "darwin"
          },
          "ip": [
            "172.17.0.2"
          ],
          "name": "mac-lt2-mpopova.fios-router.home",
          "id": "xxx443",
          "mac": [
            "00:00:ac:11:00:02"
          ],
          "architecture": "x86_64"
        },
        "event": {
          "created": "2022-10-24T19:30:01.726Z",
          "id": "59279bf715-5532412804884987"
        },
        "message": {
          "request_status": "500",
          "ref_param": "https://www.nyt.com",
          "referer": "https://www.my.site2.com/",
          "latency": "1.3",
          "activity_date": "2022-10-24",
          "logstash_id": "m_id_1024_2",
          "request_method": "POST",
          "response_size": "124",
          "remote_ip": "165.155.130.139",
          "event_timestamp_millis": "1666639334000",
          "request_size": "52",
          "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
          "cid": "12345"
        }
      }
    }
    ]
}

Then try to actually POST a document:

POST ibc-parsed-logs-2022.10.24-000001/_doc?pipeline=geoip-info
{
  "input": {
    "type": "gcp-pubsub"
  },
  "agent": {
    "name": "mac-lt2-mpopova.fios-router.home",
    "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a",
    "type": "filebeat",
    "ephemeral_id": "dc517ee9-68b4-4e52-8720-88eccf8ff967",
    "version": "8.4.3"
  },
  "@timestamp": "2022-10-24T19:30:00.533Z",
  "ecs": {
    "version": "8.0.0"
  },
  "host": {
    "hostname": "mac-lt2-mpopova.fios-router.home",
    "os": {
      "build": "21E258",
      "kernel": "21.4.0",
      "name": "macOS",
      "type": "macos",
      "family": "darwin",
      "version": "12.3.1",
      "platform": "darwin"
    },
    "ip": [
      "172.17.0.2"
    ],
    "name": "mac-lt2-mpopova.fios-router.home",
    "id": "xxx443",
    "mac": [
      "00:00:ac:11:00:02"
    ],
    "architecture": "x86_64"
  },
  "event": {
    "created": "2022-10-24T19:30:01.726Z",
    "id": "59279bf715-5532412804884987"
  },
  "message": {
    "request_status": "500",
    "ref_param": "https://www.nyt.com",
    "referer": "https://www.my.site2.com/",
    "latency": "1.3",
    "activity_date": "2022-10-24",
    "logstash_id": "m_id_1024_2",
    "request_method": "POST",
    "response_size": "124",
    "remote_ip": "165.155.130.139",
    "event_timestamp_millis": "1666639334000",
    "request_size": "52",
    "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
    "cid": "12345"
  }
}

What are the results?

Also try setting the pipeline on the input itself, per the docs:

filebeat.inputs:
- type: gcp-pubsub
  enabled: true
  project_id: ${PROJECT_ID}
  topic: ${PUBSUB_INPUT_TOPIC}
  subscription.name: ${SUBSCRIPTION_NAME}
  fields_under_root: true
  pipeline: geoip-info

Also, did you create your own mappings / template for this index?
How is your message field / object defined?

What other inputs or modules are there, if any? Not sure if I asked that already. Looks like none.

OK, here are the results of running Experiments 1 and 2 (simulate, then POST), with commands and results:

POST _ingest/pipeline/geoip-info/_simulate
{
  "docs":
  [
    {
        "_index": "ibc-parsed-logs-2022.10.24-000001",
        "_id": "m_id_1024_3",
        "_score": 0.6931471,
        "_source": {
          "@timestamp": "2022-10-24T20:21:26.956Z",
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "mac-lt2-mpopova.fios-router.home",
            "type": "filebeat",
            "version": "8.4.3",
            "ephemeral_id": "xxxdae",
            "id": "xxxa"
          },
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "os": {
              "type": "macos",
              "platform": "darwin",
              "version": "12.3.1",
              "family": "darwin",
              "name": "macOS",
              "kernel": "21.4.0",
              "build": "21E258"
            },
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx443",
            "ip": [
              "xxx"
            ],
            "mac": [
              "xxx"
            ],
            "hostname": "mac-lt2-mpopova.fios-router.home",
            "architecture": "x86_64"
          },
          "event": {
            "created": "2022-10-24T20:21:28.059Z",
            "id": "59279bf715-6046081929135344"
          },
          "message": {
            "cid": "12345",
            "remote_ip": "165.155.130.139",
            "request_status": "500",
            "event_timestamp_millis": "1666639334000",
            "activity_date": "2022-10-24",
            "request_method": "POST",
            "response_size": "124",
            "latency": "1.3",
            "logstash_id": "m_id_1024_3",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "referer": "https://www.my.site1.com/",
            "ref_param": "https://www.nyt.com",
            "request_size": "52"
          }
        }
      }
    ]
}

Response:
{
  "docs": [
    {
      "doc": {
        "_index": "ibc-parsed-logs-2022.10.24-000001",
        "_id": "m_id_1024_3",
        "_version": "-3",
        "_source": {
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx555a",
            "type": "filebeat",
            "ephemeral_id": "xxxdae",
            "version": "8.4.3"
          },
          "@timestamp": "2022-10-24T20:21:26.956Z",
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "hostname": "mac-lt2-mpopova.fios-router.home",
            "os": {
              "build": "21E258",
              "kernel": "21.4.0",
              "name": "macOS",
              "type": "macos",
              "family": "darwin",
              "version": "12.3.1",
              "platform": "darwin"
            },
            "ip": [
              "xxx"
            ],
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx443",
            "mac": [
              "xxx"
            ],
            "architecture": "x86_64"
          },
          "event": {
            "created": "2022-10-24T20:21:28.059Z",
            "id": "59279bf715-6046081929135344"
          },
          "message": {
            "request_status": "500",
            "referer": "https://www.my.site1.com/",
            "ref_param": "https://www.nyt.com",
            "remote_ip_geo": {
              "continent_name": "North America",
              "region_iso_code": "US-NY",
              "city_name": "The Bronx",
              "country_iso_code": "US",
              "country_name": "United States",
              "region_name": "New York",
              "location": {
                "lon": -73.8616,
                "lat": 40.847
              }
            },
            "latency": "1.3",
            "activity_date": "2022-10-24",
            "logstash_id": "m_id_1024_3",
            "request_method": "POST",
            "response_size": "124",
            "remote_ip": "165.155.130.139",
            "event_timestamp_millis": "1666639334000",
            "request_size": "52",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "cid": "12345"
          }
        },
        "_ingest": {
          "timestamp": "2022-10-24T21:58:07.270453Z"
        }
      }
    }
  ]
}


POST ibc-parsed-logs-2022.10.24-000001/_doc?pipeline=geoip-info
{
          "@timestamp": "2022-10-24T20:21:26.956Z",
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "mac-lt2-mpopova.fios-router.home",
            "type": "filebeat",
            "version": "8.4.3",
            "ephemeral_id": "xxxdae",
            "id": "xxx5a"
          },
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "os": {
              "type": "macos",
              "platform": "darwin",
              "version": "12.3.1",
              "family": "darwin",
              "name": "macOS",
              "kernel": "21.4.0",
              "build": "21E258"
            },
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx443",
            "ip": [
              "xxx"
            ],
            "mac": [
              "xxx"
            ],
            "hostname": "mac-lt2-mpopova.fios-router.home",
            "architecture": "x86_64"
          },
          "event": {
            "created": "2022-10-24T20:21:28.059Z",
            "id": "59279bf715-6046081929135344"
          },
          "message": {
            "cid": "12345",
            "remote_ip": "165.155.130.139",
            "request_status": "500",
            "event_timestamp_millis": "1666639334000",
            "activity_date": "2022-10-24",
            "request_method": "POST",
            "response_size": "124",
            "latency": "1.3",
            "logstash_id": "m_id_1024_4",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "referer": "https://www.my.site1.com/",
            "ref_param": "https://www.nyt.com",
            "request_size": "52"
          }
        }

Result:
{
  "_index": "ibc-parsed-logs-2022.10.24-000001",
  "_id": "IJAIDIQBnd4XCebeDrEv",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 3,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 3,
  "_primary_term": 1
}


GET ibc-parsed-logs/_search
{
  "query": {
    "term": {
      "message.logstash_id": {
        "value": "m_id_1024_4"
      }
    }
  }
}

Result:
{
  "took": 561,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.2039728,
    "hits": [
      {
        "_index": "ibc-parsed-logs-2022.10.24-000001",
        "_id": "IJAIDIQBnd4XCebeDrEv",
        "_score": 1.2039728,
        "_source": {
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx555a",
            "type": "filebeat",
            "ephemeral_id": "xxxdae",
            "version": "8.4.3"
          },
          "@timestamp": "2022-10-24T20:21:26.956Z",
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "hostname": "mac-lt2-mpopova.fios-router.home",
            "os": {
              "build": "21E258",
              "kernel": "21.4.0",
              "name": "macOS",
              "type": "macos",
              "family": "darwin",
              "version": "12.3.1",
              "platform": "darwin"
            },
            "ip": [
              "xxx"
            ],
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx443",
            "mac": [
              "xxx"
            ],
            "architecture": "x86_64"
          },
          "event": {
            "created": "2022-10-24T20:21:28.059Z",
            "id": "59279bf715-6046081929135344"
          },
          "message": {
            "request_status": "500",
            "referer": "https://www.my.site1.com/",
            "ref_param": "https://www.nyt.com",
            "remote_ip_geo": {
              "continent_name": "North America",
              "region_iso_code": "US-NY",
              "city_name": "The Bronx",
              "country_iso_code": "US",
              "country_name": "United States",
              "region_name": "New York",
              "location": {
                "lon": -73.8616,
                "lat": 40.847
              }
            },
            "latency": "1.3",
            "activity_date": "2022-10-24",
            "logstash_id": "m_id_1024_4",
            "request_method": "POST",
            "response_size": "124",
            "remote_ip": "165.155.130.139",
            "event_timestamp_millis": "1666639334000",
            "request_size": "52",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "cid": "12345"
          }
        }
      }
    ]
  }
}

After changing filebeat.yml to:

# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: gcp-pubsub
  enabled: true
  project_id: ${PROJECT_ID}
  topic: ${PUBSUB_INPUT_TOPIC}
  subscription.name: ${SUBSCRIPTION_NAME}
  fields_under_root: true
  pipeline: "geoip-info"

# ======================= Elasticsearch template setting =======================
setup.template.name: "ibc-parsed-logs"
setup.template.pattern: "ibc-parsed-logs-*"
setup.template.json.enabled: true
setup.template.json.path: "ibc_es_template.json"
setup.template.json.name: "ibc-parsed-logs-template"
setup.template.enabled: true
setup.ilm.enabled: false

# ================================== Outputs ===================================
output.console:
  enabled: false
  pretty: true

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  enabled: true
  index: "ibc-parsed-logs"
  #pipeline: "geoip-info"
  #parameters.pipeline: "geoip-info"
  #hosts: ${ES_HOSTS}
  hosts: "http://localhost:9200"

and pushing an event (id_5) through PubSub, the result in ES is:

{
  "took": 932,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "ibc-parsed-logs-2022.10.24-000001",
        "_id": "m_id_1024_5",
        "_score": 0.2876821,
        "_source": {
          "@timestamp": "2022-10-24T22:12:16.874Z",
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "ip": [
              "xxx"
            ],
            "mac": [
              "xxx"
            ],
            "hostname": "mac-lt2-mpopova.local",
            "architecture": "x86_64",
            "os": {
              "version": "12.3.1",
              "family": "darwin",
              "name": "macOS",
              "kernel": "21.4.0",
              "build": "21E258",
              "type": "macos",
              "platform": "darwin"
            },
            "id": "xxx443",
            "name": "mac-lt2-mpopova.local"
          },
          "agent": {
            "ephemeral_id": "65bc0e45-8d28-4777-a91b-5b18e08b4575",
            "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a",
            "name": "mac-lt2-mpopova.local",
            "type": "filebeat",
            "version": "8.4.3"
          },
          "event": {
            "id": "59279bf715-6046887273151727",
            "created": "2022-10-24T22:12:18.003Z"
          },
          "message": {
            "remote_ip": "165.155.130.139",
            "referer": "https://www.my.site1.com/",
            "request_status": "500",
            "latency": "1.3",
            "event_timestamp_millis": "1666639334000",
            "request_size": "52",
            "logstash_id": "m_id_1024_5",
            "ref_param": "https://www.nyt.com",
            "activity_date": "2022-10-24",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "request_method": "POST",
            "response_size": "124",
            "cid": "12345"
          },
          "input": {
            "type": "gcp-pubsub"
          }
        }
      }
    ]
  }
}

so no GEO info either ...
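
To make the difference between the two documents explicit, here is a minimal Python sketch that lists the field paths present in one `_source` but absent from the other. The helper names (`key_paths`, `missing_paths`) are made up for illustration, and the two sample documents are abbreviated copies of the direct-POST hit and the Filebeat hit shown above.

```python
def key_paths(doc, prefix=""):
    """Collect dotted paths for every key in a nested dict."""
    paths = set()
    for key, value in doc.items():
        path = f"{prefix}{key}"
        paths.add(path)
        if isinstance(value, dict):
            paths |= key_paths(value, path + ".")
    return paths


def missing_paths(reference, candidate):
    """Paths that exist in `reference` but not in `candidate`, sorted."""
    return sorted(key_paths(reference) - key_paths(candidate))


# Abbreviated _source of the document indexed by the direct POST.
direct_post = {
    "message": {
        "remote_ip": "165.155.130.139",
        "remote_ip_geo": {"city_name": "The Bronx", "country_iso_code": "US"},
    }
}

# Abbreviated _source of the document that arrived through Filebeat.
via_filebeat = {"message": {"remote_ip": "165.155.130.139"}}

for path in missing_paths(direct_post, via_filebeat):
    print(path)
```

With these samples, the only paths reported are `message.remote_ip_geo` and its children, which matches what the two search results show.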

To answer your other questions:

  1. No modules and no other inputs; that was the full filebeat.yml, nothing extra.
  2. Yes, I did create my own mapping for the index:
PUT /_index_template/ibc-parsed-logs-template
{
    "index_patterns" : ["ibc-parsed-logs-*"],
    "template" : {
        "settings" : {
          "index" : {
            "number_of_shards" : "3",
            "number_of_replicas" : "2",
            "lifecycle.name" : "ibc-parsed-logs-ilm",
            "lifecycle.rollover_alias" : "ibc-parsed-logs"
          }
        },
        "mappings" : {
          "_source" : {
            "enabled" : true
          },
          "properties" : {
            "message" : {
              "properties" : {
                "event_uuid": {
                  "type" : "keyword"
                },
                "logstash_id" : {
                  "type" : "keyword"
                },
                "cid" : {
                  "type" : "keyword"
                },
                "event_timestamp_millis" : {
                  "type" : "date"
                },
                "activity_date" : {
                  "format" : "yyyy-MM-dd",
                  "type" : "date"
                },
                "remote_ip" : {
                  "type" : "ip"
                },
                "remote_ip_geo" : {
                  "properties": {
                    "location": { "type": "geo_point" }
                  }
                },
                "user_agent" : {
                  "type" : "text"
                },
                "referer" : {
                  "type" : "keyword"
                },
                "ref_param" : {
                  "type" : "keyword"
                },
                "request_status" : {
                  "type" : "keyword"
                },
                "request_method" : {
                  "type" : "keyword"
                },
                "request_size" : {
                  "type" : "integer"
                },
                "response_size" : {
                  "type" : "integer"
                },
                "latency" : {
                  "type" : "float"
                },
                "version" : {
                  "type" : "float"
                }
              }
            }
          }
        },
        "aliases" : { }
    }
}

I also have an ILM policy defined for this index type; not sure if that's important. I also had to create a bootstrap index before starting ILM, to make both the date-based index name pattern and ILM rollover work. Not sure if it's related, but I'm mentioning it to cover all of the custom setup.
Here is the bootstrap index I created when setting up this cluster:

PUT %3Cibc-parsed-logs-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "ibc-parsed-logs": {
      "is_write_index": true
    }
  }
}
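
For anyone reading along, the odd-looking path in that PUT is just the date math index name `<ibc-parsed-logs-{now/d}-000001>` with its special characters percent-encoded, which is required when date math is used in a request URI. A small Python sketch using only the standard library shows the round trip:

```python
from urllib.parse import quote, unquote

# Date math index name as it would be written unencoded.
index_name = "<ibc-parsed-logs-{now/d}-000001>"

# safe="" makes quote() encode "/" as well, which the request path needs here.
encoded = quote(index_name, safe="")
print(encoded)           # %3Cibc-parsed-logs-%7Bnow%2Fd%7D-000001%3E
print(unquote(encoded))  # <ibc-parsed-logs-{now/d}-000001>
```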

thanks!!

You did not do experiment #2

Directly post a document not through filebeat

You can post it to the write alias with the pipeline

POST ibc-parsed-logs/_doc?pipeline=geoip-info
{
  "input": {
    "type": "gcp-pubsub"
  },
  "agent": {
.....

Then get that document and look at it...

And you are generating and using your own document id _id?

So you are sure there are no collisions?

So you should also try a post with an ID

POST ibc-parsed-logs/_doc/m_id_1024_5?pipeline=geoip-info
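
When posting with an explicit ID, the index response itself tells you whether you hit a collision: a new ID comes back with `"result": "created"`, while reusing an existing ID comes back with `"result": "updated"` and a higher `_version`. A minimal Python sketch of that check follows; the function name is made up, the first sample is the response from the earlier direct POST, and the second is a hypothetical response for illustration only.

```python
def describe_index_response(response):
    """Summarize whether an index response created or overwrote a document."""
    result = response.get("result")
    version = response.get("_version")
    if result == "created":
        return "new document"
    if result == "updated":
        return f"overwrote an existing document (now version {version})"
    return f"unexpected: result={result!r}, version={version!r}"


# Response from the earlier direct POST in this thread (abbreviated).
first_post = {"_id": "IJAIDIQBnd4XCebeDrEv", "_version": 1, "result": "created"}

# Hypothetical response if the same explicit ID were posted a second time.
repeat_post = {"_id": "m_id_1024_5", "_version": 2, "result": "updated"}

print(describe_index_response(first_post))
print(describe_index_response(repeat_post))
```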

Also, since you are running the tar.gz, you can run

./filebeat -e -d "*"

And there will be quite verbose output :).

You should be able to see where the pipeline is set.

And one more thing: what exactly are all the versions? I see 8.4.3, and above on this page I see 7.15.0, so I am not clear.

Thanks, @stephenb!
Yeah, I think I posted the results of the direct POST, but now that you mention it, I'm not sure I changed all the IDs to avoid collisions. So I am inserting a fresh new doc today, into a brand new index for today (10-25), to make sure no collisions are possible.

I'm not sure whether this part of the big event payload created by Filebeat and sent to ES should be a unique ID or not:

"event": {
    "created": "2022-10-24T19:30:01.726Z",
    "id": "59279bf715-5532412804884987"
  }

so I changed it to be unique for at least today.

"event": {
            "created": "2022-10-25T10:21:28.059Z",
            "id": "10-25-id-1"
          }

So here is my direct POST command:

POST ibc-parsed-logs-2022.10.25-000002/_doc/m_id_1025_1?pipeline=geoip-info
{
          "@timestamp": "2022-10-25T10:21:26.956Z",
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "mac-lt2-mpopova.fios-router.home",
            "type": "filebeat",
            "version": "8.4.3",
            "ephemeral_id": "8f1c47c8-4a02-4e36-a4c9-fe8479ed7dae",
            "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a"
          },
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "os": {
              "type": "macos",
              "platform": "darwin",
              "version": "12.3.1",
              "family": "darwin",
              "name": "macOS",
              "kernel": "21.4.0",
              "build": "21E258"
            },
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx443",
            "ip": [
              "xxx"
            ],
            "mac": [
              "xxx"
            ],
            "hostname": "mac-lt2-mpopova.fios-router.home",
            "architecture": "x86_64"
          },
          "event": {
            "created": "2022-10-25T10:21:28.059Z",
            "id": "10-25-id-1"
          },
          "message": {
            "cid": "12345",
            "remote_ip": "165.155.130.139",
            "request_status": "500",
            "event_timestamp_millis": "1666707272000",
            "activity_date": "2022-10-25",
            "request_method": "POST",
            "response_size": "124",
            "latency": "1.3",
            "logstash_id": "m_id_1025_1",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "referer": "https://www.my.site1.com/",
            "ref_param": "https://www.nyt.com",
            "request_size": "52"
          }
        }

response:

{
  "_index": "ibc-parsed-logs-2022.10.25-000002",
  "_id": "m_id_1025_1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 3,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}

and now fetching this doc by its logstash_id:

GET ibc-parsed-logs/_search
{
  "query": {
    "term": {
      "message.logstash_id": {
        "value": "m_id_1025_1"
      }
    }
  }
}

result:

{
  "took": 888,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "ibc-parsed-logs-2022.10.25-000002",
        "_id": "m_id_1025_1",
        "_score": 0.2876821,
        "_source": {
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a",
            "type": "filebeat",
            "ephemeral_id": "8f1c47c8-4a02-4e36-a4c9-fe8479ed7dae",
            "version": "8.4.3"
          },
          "@timestamp": "2022-10-25T10:21:26.956Z",
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "hostname": "mac-lt2-mpopova.fios-router.home",
            "os": {
              "build": "21E258",
              "kernel": "21.4.0",
              "name": "macOS",
              "type": "macos",
              "family": "darwin",
              "version": "12.3.1",
              "platform": "darwin"
            },
            "ip": [
              "xxx"
            ],
            "name": "mac-lt2-mpopova.fios-router.home",
            "id": "xxx443",
            "mac": [
              "xxx"
            ],
            "architecture": "x86_64"
          },
          "event": {
            "created": "2022-10-25T10:21:28.059Z",
            "id": "10-25-id-1"
          },
          "message": {
            "request_status": "500",
            "referer": "https://www.my.site1.com/",
            "ref_param": "https://www.nyt.com",
            "remote_ip_geo": {
              "continent_name": "North America",
              "region_iso_code": "US-NY",
              "city_name": "The Bronx",
              "country_iso_code": "US",
              "country_name": "United States",
              "region_name": "New York",
              "location": {
                "lon": -73.8616,
                "lat": 40.847
              }
            },
            "latency": "1.3",
            "activity_date": "2022-10-25",
            "logstash_id": "m_id_1025_1",
            "request_method": "POST",
            "response_size": "124",
            "remote_ip": "165.155.130.139",
            "event_timestamp_millis": "1666707272000",
            "request_size": "52",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "cid": "12345"
          }
        }
      }
    ]
  }
}

Now sending a new event through Filebeat (with pipeline in input):

{
   "event_uuid":"m_id_1025_2",
   "logstash_id":"m_id_1025_2",
   "cid":"12345",
   "event_timestamp_millis":"1666707272000",
   "activity_date":"2022-10-25",
   "remote_ip":"165.155.130.139",
   "user_agent":"Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
   "referer":"https://www.my.site1.com/",
   "ref_param":"https://www.nyt.com",
   "request_status":"500",
   "request_method":"POST",
   "request_size":"52",
   "response_size":"124",
   "latency":"1.3"
}

logs from Filebeat (not sure I got everything that is of interest ....):

{"log.level":"debug","@timestamp":"2022-10-25T10:25:44.936-0400","log.logger":"processors","log.origin":{"file.name":"processing/processors.go","file.line":210},"message":"Publish event: {\n  \"@timestamp\": \"2022-10-25T14:25:44.022Z\",\n  \"@metadata\": {\n    \"beat\": \"filebeat\",\n    \"type\": \"_doc\",\n    \"version\": \"8.4.3\",\n    \"_id\": \"m_id_1025_2\"\n  },\n  \"agent\": {\n    \"version\": \"8.4.3\",\n    \"ephemeral_id\": \"710d2939-cea1-4b6b-aa12-3d8c7767606f\",\n    \"id\": \"e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a\",\n    \"name\": \"mac-lt2-mpopova.fios-router.home\",\n    \"type\": \"filebeat\"\n  },\n  \"event\": {\n    \"id\": \"59279bf715-5532469019380529\",\n    \"created\": \"2022-10-25T14:25:44.935Z\"\n  },\n  \"message\": {\n    \"referer\": \"https://www.my.site1.com/\",\n    \"request_method\": \"POST\",\n    \"logstash_id\": \"m_id_1025_2\",\n    \"ref_param\": \"https://www.nyt.com\",\n    \"request_status\": \"500\",\n    \"response_size\": \"124\",\n    \"cid\": \"12345\",\n    \"remote_ip\": \"165.155.130.139\",\n    \"user_agent\": \"Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36\",\n    \"event_timestamp_millis\": \"1666707272000\",\n    \"activity_date\": \"2022-10-25\",\n    \"request_size\": \"52\",\n    \"latency\": \"1.3\"\n  },\n  \"input\": {\n    \"type\": \"gcp-pubsub\"\n  },\n  \"ecs\": {\n    \"version\": \"8.0.0\"\n  },\n  \"host\": {\n    \"os\": {\n      \"type\": \"macos\",\n      \"platform\": \"darwin\",\n      \"version\": \"12.3.1\",\n      \"family\": \"darwin\",\n      \"name\": \"macOS\",\n      \"kernel\": \"21.4.0\",\n      \"build\": \"21E258\"\n    },\n    \"id\": \"xxx443\",\n    \"ip\": [\n      \"fe80\"\n    ],\n    \"name\": \"mac-lt2-mpopova.fios-router.home\",\n    \"mac\": [\n      \"82\"\n    ],\n    \"hostname\": \"mac-lt2-mpopova.fios-router.home\",\n    \"architecture\": \"x86_64\"\n  }\n}","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-25T10:25:45.937-0400","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":139},"message":"Connecting to backoff(elasticsearch(http://localhost:9200))","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:45.938-0400","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":267},"message":"ES Ping(url=http://localhost:9200)","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:45.940-0400","log.logger":"esclientleg","log.origin":{"file.name":"transport/logging.go","file.line":42},"message":"Completed dialing successfully","service.name":"filebeat","network":"tcp","address":"localhost:9200","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:45.948-0400","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":290},"message":"Ping status code: 200","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-25T10:25:45.948-0400","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":291},"message":"Attempting to connect to Elasticsearch version 8.4.3","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:45.948-0400","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":346},"message":"GET http://localhost:9200/_license?human=false  <nil>","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:45.957-0400","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":267},"message":"ES Ping(url=http://localhost:9200)","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:45.958-0400","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":290},"message":"Ping status code: 200","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-25T10:25:45.958-0400","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":291},"message":"Attempting to connect to Elasticsearch version 8.4.3","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:45.958-0400","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":346},"message":"HEAD http://localhost:9200/_index_template/ibc-parsed-logs-template  <nil>","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-25T10:25:45.973-0400","log.logger":"template_loader","log.origin":{"file.name":"template/load.go","file.line":115},"message":"Template \"ibc-parsed-logs-template\" already exists and will not be overwritten.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-25T10:25:45.974-0400","log.logger":"index-management","log.origin":{"file.name":"idxmgmt/std.go","file.line":267},"message":"Loaded index template.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:45.975-0400","log.logger":"esclientleg","log.origin":{"file.name":"eslegclient/connection.go","file.line":346},"message":"GET http://localhost:9200/  <nil>","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-25T10:25:45.976-0400","log.logger":"publisher_pipeline_output","log.origin":{"file.name":"pipeline/client_worker.go","file.line":147},"message":"Connection to backoff(elasticsearch(http://localhost:9200)) established","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:46.053-0400","log.logger":"elasticsearch","log.origin":{"file.name":"elasticsearch/client.go","file.line":247},"message":"PublishEvents: 1 events have been published to elasticsearch in 77.121136ms.","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:46.054-0400","log.logger":"publisher","log.origin":{"file.name":"memqueue/eventloop.go","file.line":498},"message":"broker ACK events: count=1, start-seq=1, end-seq=1\n","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:46.054-0400","log.logger":"acker","log.origin":{"file.name":"beater/acker.go","file.line":64},"message":"stateless ack","service.name":"filebeat","count":1,"ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:46.055-0400","log.logger":"publisher","log.origin":{"file.name":"memqueue/ackloop.go","file.line":95},"message":"ackloop: return ack to broker loop:1","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:46.055-0400","log.logger":"publisher","log.origin":{"file.name":"memqueue/ackloop.go","file.line":98},"message":"ackloop:  done send ack","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2022-10-25T10:25:47.601-0400","log.origin":{"file.name":"numcpu/numcpu.go","file.line":41},"message":"Accurate CPU counts not available on platform, falling back to runtime.NumCPU for metrics","service.name":"filebeat","ecs.version":"1.6.0"}
{

getting this event from ES:

GET ibc-parsed-logs/_search
{
  "query": {
    "term": {
      "message.logstash_id": {
        "value": "m_id_1025_2"
      }
    }
  }
}

result:
{
  "took": 52,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.6931471,
    "hits": [
      {
        "_index": "ibc-parsed-logs-2022.10.25-000002",
        "_id": "m_id_1025_2",
        "_score": 0.6931471,
        "_source": {
          "@timestamp": "2022-10-25T14:25:44.022Z",
          "host": {
            "os": {
              "family": "darwin",
              "name": "macOS",
              "kernel": "21.4.0",
              "build": "21E258",
              "type": "macos",
              "platform": "darwin",
              "version": "12.3.1"
            },
            "id": "xxx443",
            "ip": [
              "fe80"
            ],
            "name": "mac-lt2-mpopova.fios-router.home",
            "mac": [
              "82"
            ],
            "hostname": "mac-lt2-mpopova.fios-router.home",
            "architecture": "x86_64"
          },
          "agent": {
            "type": "filebeat",
            "version": "8.4.3",
            "ephemeral_id": "710d2939-cea1-4b6b-aa12-3d8c7767606f",
            "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a",
            "name": "mac-lt2-mpopova.fios-router.home"
          },
          "event": {
            "id": "59279bf715-5532469019380529",
            "created": "2022-10-25T14:25:44.935Z"
          },
          "message": {
            "event_timestamp_millis": "1666707272000",
            "activity_date": "2022-10-25",
            "request_size": "52",
            "latency": "1.3",
            "referer": "https://www.my.site1.com/",
            "request_method": "POST",
            "logstash_id": "m_id_1025_2",
            "ref_param": "https://www.nyt.com",
            "cid": "12345",
            "remote_ip": "165.155.130.139",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "request_status": "500",
            "response_size": "124"
          },
          "input": {
            "type": "gcp-pubsub"
          },
          "ecs": {
            "version": "8.0.0"
          }
        }
      }
    ]
  }
}

no GEOIP info ...

thanks!!

I would add another processor to the pipeline, something very simple like adding a tag or a field, just so we can validate one last thing: whether the entire pipeline is not being called, or whether it is called and just somehow not doing the geoip.

What I was looking for in the filebeat logs was a mention of the pipeline... You could probably grep for it, but you need to start filebeat with -d "*" and/or set the logging level to debug.
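If grep is awkward on the JSON log lines, the same search can be sketched in a few lines of Python, since Filebeat 8 writes one JSON object per log line (as in the logs above). This is only a sketch: the function name `mentions` is made up, the first sample line is trimmed from the log in this thread, and the second sample line is hypothetical, not real Filebeat output.

```python
import json

def mentions(log_lines, needle):
    """Return the "message" of every JSON log line that mentions needle."""
    hits = []
    for line in log_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON log line
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a lone "{" from a pretty-printed event
        if needle.lower() in line.lower():
            hits.append(entry.get("message", ""))
    return hits

sample = [
    # trimmed from the Filebeat log in this thread
    '{"log.level":"info","log.logger":"index-management","message":"Loaded index template."}',
    # hypothetical line, only here to show what a match looks like
    '{"log.level":"debug","log.logger":"elasticsearch","message":"hypothetical line naming pipeline geoip-info"}',
]
print(mentions(sample, "geoip"))
```

In practice you would feed it the captured debug output (for example a file object) instead of the `sample` list.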

You can always go back to the parameters setting ... Now I have a better understanding of what that does and why it works... Seems like that works for you, although in hundreds of pipelines I have never had to set it.

Although I have never run the GCP Pub/Sub input, if I get a chance to set that up I will. But the next couple of days are going to be pretty busy.

I have one other question. I see lots of mentions of logstash ... How are you using logstash?

Are you actually sending from filebeat to elasticsearch? Or is logstash in the middle?

@ppine7

I have another debug option

take out output elasticsearch

put in

output.console:
  enabled: true

then just run

./filebeat -e

And then you should see what would be sent to elasticsearch

Example below; note that it should show the pipeline in the @metadata section

"pipeline":"geoip-info"

{"@timestamp":"2022-10-25T17:00:43.292Z","@metadata":{"beat":"filebeat","type":"_doc","version":"8.4.1","pipeline":"geoip-info"},"input":{"type":"filestream"},"agent":{"type":"filebeat","version":"8.4.1","ephemeral_id":"b48fe8c3-456f-458e-91e2-d054ab8de3c1","id":"76231972-56f8-4242-a2fd-179ad6224ce9","name":"hyperion"},"ecs":{"version":"8.0.0"},"host":{"name":"hyperion"},"log":{"file":{"path":"/Users/sbrown/workspace/sample-data/discuss/geoip/gcloud-pubsub-geoipdata.json"},"offset":1359},"message":{"response_size":"124","event_timestamp_millis":"1666285411000","referer":"https://www.my.site2.com/","remote_ip":"165.155.130.142","ref_param":"https://www.nyt.com","latency":"1.3","user_agent":"Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36","request_method":"POST","request_size":"52","logstash_id":"m_id_1020_3","cid":"12345","activity_date":"2022-10-20","request_status":"500"}}

this goes back to my question about logstash... are you using it in the middle?
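To check the pipeline name in @metadata without reading the whole event by eye, something like the following Python sketch could help. The helper name `pipeline_of` is made up, and the event is a trimmed copy of the one above; it assumes one event per line (console output without `pretty: true`).

```python
import json

def pipeline_of(event_json):
    """Return @metadata.pipeline from one console-output event, or None."""
    event = json.loads(event_json)
    return event.get("@metadata", {}).get("pipeline")

# Trimmed copy of the event shown above; only @metadata matters here.
event = (
    '{"@timestamp":"2022-10-25T17:00:43.292Z",'
    '"@metadata":{"beat":"filebeat","type":"_doc","version":"8.4.1","pipeline":"geoip-info"},'
    '"input":{"type":"filestream"}}'
)
print(pipeline_of(event))
```

A result of `None` would mean the event carries no pipeline name at all, which points at the Filebeat config rather than at the pipeline itself.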

Thank you, Stephen, for the suggestions!
First, answers to your questions:

  1. all versions I use are 8.4.3 - I downloaded each archive (ES, Kibana, Filebeat) and installed by just unzipping...
  2. there is no Logstash in the middle. The whole pipeline is: GCP PubSub --> Filebeat 8 (local) --> ES/Kibana 8 (local). The 'logstash_id' field you are seeing is a legacy remnant that is still needed downstream in some of our processes.... the 'event_uuid' field is the successor. This whole project's goal is to get rid of the Logstash we have in our on-prem log processing pipelines - and replace it with a simple Filebeat running in GCP.
  3. I grepped for 'geoip' in the filebeat output (previous experiment) - it was not mentioned anywhere, so I think your guess is correct, the whole pipeline is just not executed for some reason
  4. I have not tried adding a new pipeline yet - will do that next

Here are the results of your next debugging suggestion/run:
Disabled output.elasticsearch, enabled console output:

# ================================== Outputs ===================================
output.console:
  enabled: true
  pretty: true

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  enabled: false
  index: "ibc-parsed-logs"
  #pipeline: "geoip-info"
  #parameters.pipeline: "geoip-info"
  hosts: "http://localhost:9200"

sent one event through PubSub:
filebeat logs:

{"log.level":"info","@timestamp":"2022-10-25T13:21:34.352-0400","log.logger":"crawler","log.origin":{"file.name":"beater/crawler.go","file.line":148},"message":"Starting input (ID: 2356973183423816217)","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-25T13:21:34.352-0400","log.logger":"crawler","log.origin":{"file.name":"beater/crawler.go","file.line":106},"message":"Loading and starting Inputs completed. Enabled inputs: 1","service.name":"filebeat","ecs.version":"1.6.0"}
{"log.level":"info","@timestamp":"2022-10-25T13:21:34.352-0400","log.logger":"gcp.pubsub","log.origin":{"file.name":"gcppubsub/input.go","file.line":142},"message":"Pub/Sub input worker has started.","service.name":"filebeat","pubsub_project":"tt-temp-2021030444","pubsub_topic":"logs-for-es-marina","pubsub_subscription":{"Name":"logs-for-es-marina-sub","NumGoroutines":1,"MaxOutstandingMessages":1000,"Create":true},"ecs.version":"1.6.0"}
{
  "@timestamp": "2022-10-25T17:22:03.050Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.4.3",
    "_id": "m_id_1025_3",
    "pipeline": "geoip-info"
  },
  "host": {
    "name": "mac-lt2-mpopova.fios-router.home",
    "os": {
      "platform": "darwin",
      "version": "12.3.1",
      "family": "darwin",
      "name": "macOS",
      "kernel": "21.4.0",
      "build": "21E258",
      "type": "macos"
    },
    "id": "xxx443",
    "ip": [
      "fe80"
    ],
    "mac": [
      "82"
    ],
    "hostname": "mac-lt2-mpopova.fios-router.home",
    "architecture": "x86_64"
  },
  "agent": {
    "version": "8.4.3",
    "ephemeral_id": "63882b38-5566-47dd-89a1-beae1ef6939b",
    "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a",
    "name": "mac-lt2-mpopova.fios-router.home",
    "type": "filebeat"
  },
  "message": {
    "request_method": "POST",
    "response_size": "124",
    "cid": "12345",
    "remote_ip": "165.155.130.139",
    "ref_param": "https://www.nyt.com",
    "request_size": "52",
    "latency": "1.3",
    "request_status": "500",
    "logstash_id": "m_id_1025_3",
    "event_timestamp_millis": "1666707272000",
    "activity_date": "2022-10-25",
    "referer": "https://www.my.site1.com/",
    "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36"
  },
  "event": {
    "id": "59279bf715-6093616684786143",
    "created": "2022-10-25T17:22:04.491Z"
  },
  "input": {
    "type": "gcp-pubsub"
  },
  "ecs": {
    "version": "8.0.0"
  }
}

looks good to me :slight_smile:
The question is - why is it not executed when the elastic output is used?

will try creating a new simple pipeline and using it instead of geo next

thanks!!
Marina

actually, it does not look good.... I see the pipeline mentioned in metadata, but the actual geo info is not added ...

  1. the geoip-info pipeline is executed on the elasticsearch side NOT the filebeat side, so the console output will never contain the geo fields; it only shows which pipeline will be requested:
{
  "@timestamp": "2022-10-25T17:22:03.050Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.4.3",
    "_id": "m_id_1025_3",
    "pipeline": "geoip-info"
  },

That looks correct!
I assume that in this case you put the pipeline info in the input section...
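Since the pipeline runs inside Elasticsearch, it can also be tried with Filebeat out of the loop, using the simulate pipeline API. Below is a minimal Python sketch that only builds the request; the helper name `simulate_request` is made up, the URL comes from the filebeat.yml in this thread, and it assumes the local cluster needs no authentication.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # from this thread's filebeat.yml; no auth assumed

def simulate_request(pipeline_id, source):
    """Build a POST to _ingest/pipeline/<id>/_simulate for one document."""
    body = {"docs": [{"_source": source}]}
    return urllib.request.Request(
        f"{ES_URL}/_ingest/pipeline/{pipeline_id}/_simulate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Same field layout and IP as the events in this thread.
req = simulate_request("geoip-info", {"message": {"remote_ip": "165.155.130.139"}})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually send it (needs a running cluster):
#   with urllib.request.urlopen(req) as resp:
#       print(json.dumps(json.load(resp), indent=2))
```

If the simulated document comes back with message.remote_ip_geo, the pipeline itself is fine and the problem is in how it gets invoked.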

Now you need to take out the

"ignore_missing": true
and add a simple set processor that adds a field

{
  "geoip-info": {
    "description": "Add geoip info",
    "processors": [
      {
        "geoip": {
          "field": "message.remote_ip",
          "target_field": "message.remote_ip_geo"
        }
      },
      {
        "set": {
          "field": "my_field",
          "value": "My Value"
        }
      }
    ]
  }
}

Thanks, Stephen!
So, instead of modifying my current pipeline - I just created one more test one:

{
  "geoip-test": {
    "description": "Add geoip info",
    "processors": [
      {
        "geoip": {
          "field": "message.remote_ip",
          "target_field": "message.remote_ip_geo"
        }
      },
      {
        "set": {
          "field": "my_field",
          "value": "My Value"
        }
      }
    ]
  }
}

and updated filebeat.yml to use it:

# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: gcp-pubsub
  enabled: true
  project_id: ${PROJECT_ID}
  topic: ${PUBSUB_INPUT_TOPIC}
  subscription.name: ${SUBSCRIPTION_NAME}
  fields_under_root: true
  pipeline: "geoip-test"

processed new message through PubSub:

{
   "event_uuid":"m_id_1025_5",
   "logstash_id":"m_id_1025_5",
   "cid":"12345",
   "event_timestamp_millis":"1666707272000",
   "activity_date":"2022-10-25",
   "remote_ip":"165.155.130.139",
   "user_agent":"Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
   "referer":"https://www.my.site1.com/",
   "ref_param":"https://www.nyt.com",
   "request_status":"500",
   "request_method":"POST",
   "request_size":"52",
   "response_size":"124",
   "latency":"1.3"
}

metadata from Filebeat logs:

{
  "@timestamp": "2022-10-25T20:44:54.172Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.4.3",
    "_id": "m_id_1025_5",
    "pipeline": "geoip-test"
  },
....

switched back to the elastic output:
processed the same message - with m_id_1025_6

result from ES:

{
  "took": 953,
  "timed_out": false,
  "_shards": {
    "total": 9,
    "successful": 9,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "ibc-parsed-logs-2022.10.25-000003",
        "_id": "m_id_1025_6",
        "_score": 0.2876821,
        "_source": {
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "dhcp-10-250-50-96.harvard.edu",
            "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a",
            "ephemeral_id": "e1a0d081-a514-4930-b2ed-e712efeef25b",
            "type": "filebeat",
            "version": "8.4.3"
          },
          "@timestamp": "2022-10-25T20:47:23.763Z",
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "hostname": "dhcp-10-250-50-96.harvard.edu",
            "os": {
              "build": "21E258",
              "kernel": "21.4.0",
              "name": "macOS",
              "family": "darwin",
              "type": "macos",
              "version": "12.3.1",
              "platform": "darwin"
            },
            "ip": [
              "fe80"
            ],
            "name": "dhcp-10-250-50-96.harvard.edu",
            "id": "xxx443",
            "mac": [
              "46"
            ],
            "architecture": "x86_64"
          },
          "my_field": "My Value",
          "event": {
            "created": "2022-10-25T20:47:24.965Z",
            "id": "59279bf715-6095194393771101"
          },
          "message": {
            "request_status": "500",
            "ref_param": "https://www.nyt.com",
            "referer": "https://www.my.site1.com/",
            "remote_ip_geo": {
              "continent_name": "North America",
              "region_iso_code": "US-NY",
              "city_name": "The Bronx",
              "country_iso_code": "US",
              "country_name": "United States",
              "region_name": "New York",
              "location": {
                "lon": -73.8616,
                "lat": 40.847
              }
            },
            "latency": "1.3",
            "logstash_id": "m_id_1025_6",
            "activity_date": "2022-10-25",
            "request_method": "POST",
            "response_size": "124",
            "remote_ip": "165.155.130.139",
            "event_timestamp_millis": "1666707272000",
            "request_size": "52",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "cid": "12345"
          }
        }
      }
    ]
  }
}

both GEO and new fields are there!!!!

is it the "ignore_missing" that is screwing things up??

will try without it next .... unless you have more ideas to try :slight_smile:
thanks!

Interesting, I don't know ... perhaps a bug.

YES!!!
Just tried with the original geoip pipeline - but without "ignore_missing" - and the geo ip info is added!!!

pipeline:

PUT _ingest/pipeline/geoip-no-missing
{
  "description": "Add geoip info",
  "processors": [
    {
      "geoip": {
        "field": "message.remote_ip",
        "target_field": "message.remote_ip_geo"
      }
    }
  ]
}

filebeat.yml:

# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: gcp-pubsub
  enabled: true
  project_id: ${PROJECT_ID}
  topic: ${PUBSUB_INPUT_TOPIC}
  subscription.name: ${SUBSCRIPTION_NAME}
  fields_under_root: true
  pipeline: "geoip-no-missing"

resulting event in ES:

{
  "took": 519,
  "timed_out": false,
  "_shards": {
    "total": 9,
    "successful": 9,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "ibc-parsed-logs-2022.10.25-000003",
        "_id": "m_id_1025_7",
        "_score": 0.2876821,
        "_source": {
          "input": {
            "type": "gcp-pubsub"
          },
          "agent": {
            "name": "dhcp-10-250-50-96.harvard.edu",
            "id": "e0b4f8e6-d0c6-4c38-a62d-ac6ff81a555a",
            "ephemeral_id": "83c397dd-ec26-4570-bad6-7a7600671487",
            "type": "filebeat",
            "version": "8.4.3"
          },
          "@timestamp": "2022-10-25T20:52:32.895Z",
          "ecs": {
            "version": "8.0.0"
          },
          "host": {
            "hostname": "dhcp-10-250-50-96.harvard.edu",
            "os": {
              "build": "21E258",
              "kernel": "21.4.0",
              "name": "macOS",
              "family": "darwin",
              "type": "macos",
              "version": "12.3.1",
              "platform": "darwin"
            },
            "ip": [
              "fe80"
            ],
            "name": "dhcp-10-250-50-96.harvard.edu",
            "id": "xxx443",
            "mac": [
              "82"
            ],
            "architecture": "x86_64"
          },
          "message": {
            "request_status": "500",
            "referer": "https://www.my.site1.com/",
            "ref_param": "https://www.nyt.com",
            "remote_ip_geo": {
              "continent_name": "North America",
              "region_iso_code": "US-NY",
              "city_name": "The Bronx",
              "country_iso_code": "US",
              "country_name": "United States",
              "region_name": "New York",
              "location": {
                "lon": -73.8616,
                "lat": 40.847
              }
            },
            "latency": "1.3",
            "logstash_id": "m_id_1025_7",
            "activity_date": "2022-10-25",
            "request_method": "POST",
            "response_size": "124",
            "remote_ip": "165.155.130.139",
            "event_timestamp_millis": "1666707272000",
            "request_size": "52",
            "user_agent": "Mozilla/5.0 (X11; CrOS aarch64 13421.102.0) AppleWebKit/537.36 (KHTML, like Gecko)Chrome/86.0.4240.199 Safari/537.36",
            "cid": "12345"
          },
          "event": {
            "created": "2022-10-25T20:52:32.972Z",
            "id": "59279bf715-6095295297789991"
          }
        }
      }
    ]
  }
}

Do you think this is a bug?
thank you!

Yeah, I suspect so.. I would need to run a few tests, BUT you can certainly open one ... you have earned it!! :slight_smile:

Darn, I should have taken that out in the beginning...
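For a bug report, a side-by-side reproduction of the two pipeline variants might be useful. Here is a rough Python sketch that prints two request bodies for the simulate API with an inline pipeline definition, one with "ignore_missing": true and one without. The function name `simulate_body` is made up, and whether simulate shows the same difference as the Filebeat run in this thread is an open question, not something confirmed here.

```python
import json

def simulate_body(ignore_missing):
    """Request body for POST _ingest/pipeline/_simulate with an inline pipeline."""
    geoip = {"field": "message.remote_ip", "target_field": "message.remote_ip_geo"}
    if ignore_missing:
        geoip["ignore_missing"] = True  # the option that was removed in this thread
    return {
        "pipeline": {"description": "Add geoip info", "processors": [{"geoip": geoip}]},
        # same field layout and IP as the events in this thread
        "docs": [{"_source": {"message": {"remote_ip": "165.155.130.139"}}}],
    }

for flag in (True, False):
    print("POST _ingest/pipeline/_simulate  # ignore_missing =", flag)
    print(json.dumps(simulate_body(flag), indent=2))
```

Each printed body can be pasted into Kibana Dev Tools under the printed request line; comparing the two responses would show whether the option alone changes the result.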