Filebeat not working with pipeline, and * glob for CSV files not working

PUT _ingest/pipeline/csv_pipeline
{
  "description": "A pipeline to parse CSV data",
  "processors": [
    {
      "csv": {
        "field": "message",
        "target_fields": ["cluster", "index", "ilm_policy", "time_since_index_creation"],
        "ignore_missing": false
      }
    }
  ]
}
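
A quick way to verify that the pipeline itself parses a row correctly, independent of Filebeat, is the _simulate API (a sketch using one of the CSV rows posted further down in this thread):

POST _ingest/pipeline/csv_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "Cluster1,.elastic-connectors-v1,None,Not available"
      }
    }
  ]
}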
###################### Filebeat Configuration Example #########################

# ============================== Filebeat modules ==============================
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: filestream
  id: my-csv-filestream-id
  enabled: true
  paths:
    - /Users/taran/stocks/ansible/indices_ilm_policies_prod1.es.us-central1.gcp.cloud.es.io.csv
    - /Users/taran/stocks/ansible/indices_ilm_policies_prod2.es.asia-south1.gcp.elastic-cloud.com.csv

- type: filestream  # Input for macOS logs
  enabled: true
  paths:
    - /var/log/system.log

# ======================= Elasticsearch template setting =======================
setup.template.settings:
  index.number_of_shards: 1
setup.template.name: "ilmprod2"  # Replace with your template name
setup.template.pattern: "ilm-*"  # Replace with your template pattern
setup.ilm.overwrite: true

# ================================== General ===================================

# The name of the shipper that publishes the network data.
#name:

# The tags of the shipper are included in their field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

# =================================== Kibana ===================================

setup.kibana:
  host: "prod1.kb.us-central1.gcp.cloud.es.io:9243"
  ssl.verification_mode: "none"

# =============================== Elastic Cloud ================================

cloud.id: "7455d8019d7e455ebf45b0704a20d83e:d"
cloud.auth: ""

# ================================== Outputs ===================================

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  hosts: ["prod1.es.us-central1.gcp.cloud.es.io"]
  protocol: "https"
  username: "elastic"
  password: ""
  indices:
    - index: "combined-csv-index"  # Single index for both CSV files
      pipeline: csv_pipeline
      when.contains:
        log.file.path: "/Users/taran/stocks/ansible/indices_ilm_policies"
    - index: "macos-log"
      when.contains:
        log.file.path: "/var/log/system.log"

# ================================= Processors =================================

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

# ================================== Logging ===================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors, use ["*"]. Examples of other selectors are "beat",
# "publisher", "service".
#logging.selectors: ["*"]

# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster. This requires xpack monitoring to be enabled in Elasticsearch.
# The reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch.
#monitoring.elasticsearch:

# ============================== Instrumentation ===============================

# Instrumentation support for the filebeat.
#instrumentation:
    # Set to true to enable instrumentation of filebeat.
    #enabled: false

    # Environment in which filebeat is running on (eg: staging, production, etc.)
    #environment: ""

    # APM Server hosts to report instrumentation results to.
    #hosts:
    #  - http://localhost:8200

    # API Key for the APM Server(s).
    #api_key:

    # Secret token for the APM Server(s).
    #secret_token:

# ================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
Sample rows from the CSV files:

Cluster1,.ds-logs-enterprise_search.api-default-2023.11.27-000002,logs-enterprise_search.api-default,1.66d
Cluster1,.internal.alerts-observability.apm.alerts-default-000001,.alerts-ilm-policy,8.67d
Cluster1,.elastic-connectors-v1,None,Not available
Cluster1,.ds-.monitoring-beats-8-mb-2023.11.26-000003,.monitoring-8-ilm-policy,2.61d
Cluster1,.ds-.monitoring-ent-search-8-mb-2023.11.26-000003,.monitoring-8-ilm-policy,2.61d
Cluster1,.ds-logs-enterprise_search.api-default-2023.11.20-000001,logs-enterprise_search.api-default,8.66d

Hi @mastinder

I formatted your post.

What version?

When you say not working what exactly does that mean?

Are there no logs in Discover?

Are they there but not parsed?

I see some issues with your filebeat.yml; your setup config and index names are probably not going to work.
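
For reference, the Elasticsearch output selects ingest pipelines conditionally through a separate pipelines array, not a pipeline key inside the indices entries; a sketch along those lines, reusing the path from the config above:

output.elasticsearch:
  hosts: ["prod1.es.us-central1.gcp.cloud.es.io"]
  pipelines:
    - pipeline: csv_pipeline
      when.contains:
        log.file.path: "/Users/taran/stocks/ansible/indices_ilm_policies"

Also note that while ILM is enabled, Filebeat may ignore custom index names unless setup.ilm.enabled: false is set, which could explain why the template and index settings appear to have no effect.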

Firstly, I have multiple CSV files in a folder, but when I use the asterisk (*) to select them, they are not being picked up. I have also defined the input type as "log", but it still doesn't work.

Secondly, the pipeline is not functioning properly. It only works if I reindex the index by copying it after ingestion; however, it fails to apply during ingestion.

To address these issues, I have created a component template and made the necessary adjustments to ILM and settings. I have even tried changing the index name to "policy" and other names, but the problem persists.

The main challenges I am facing are twofold. Firstly, I need to find a way to feed multiple files through the pipeline. Secondly, the pipeline itself is not functioning as expected when used from Filebeat. I have attempted alternative methods, but the pipeline continues to fail.

It is crucial to resolve these issues promptly to ensure smooth and efficient data processing.

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /path/to/your/csv/files/*.csv

output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: "csv_pipeline"

PUT _component_template/time-series-mappings
{
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "message": {
          "type": "text"
        }
      }
    }
  }
}

PUT _index_template/my-metrics-template
{
  "priority": 500,
  "index_patterns": [
    "my-metrics-*"
  ],
  "composed_of": [
    "time-series-mappings",
    "time-series-settings"
  ]
}
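
The time-series-settings component referenced above is not shown anywhere in this thread; a minimal sketch of what such a component template could contain (the ILM policy name is a hypothetical placeholder):

PUT _component_template/time-series-settings
{
  "template": {
    "settings": {
      "index.number_of_shards": 1,
      "index.lifecycle.name": "my-ilm-policy" // hypothetical policy name
    }
  }
}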

PUT _component_template/time-series-mappings
{
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "message": {
          "type": "text"
        },
        "Index": { "type": "keyword" },
        "ILM Policy": { "type": "keyword" },
        "Time Since Index Creation": { "type": "text" },
        "data_stream.type": {
          "type": "constant_keyword"
        }
      }
    }
  }
}
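
To double-check which mappings an index matching the pattern would actually receive from the composed templates, the simulate API can help (the index name here is hypothetical):

POST _index_template/_simulate_index/my-metrics-test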

I had also tried a decode processor, but I still don't see the new fields in Elasticsearch.
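
If the decode processor in question was Filebeat's decode_csv_fields, a sketch of how it is typically combined with extract_array to produce named fields (the target field names follow the ingest pipeline earlier in this thread):

processors:
  - decode_csv_fields:
      fields:
        message: decoded.csv   # parse the raw CSV line into an array
      separator: ","
  - extract_array:
      field: decoded.csv       # map array positions onto named fields
      mappings:
        cluster: 0
        index: 1
        ilm_policy: 2
        time_since_index_creation: 3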

Please format your code, logs, or configuration files using the </> icon as explained in this guide, and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

If you are not using markdown format, use the </> icon in the editor toolbar.

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.