Cannot seem to get ingest pipelines to work, please help!

I am trying to understand ES ingest pipelines and Filebeat.
I am using version 7.11 of Elasticsearch, Kibana, and Filebeat, all in Docker containers.
If I do not add the pipeline, I get the log lines in ES/Kibana from Filebeat, but they are not formatted very well (everything is indexed into a single 'message' keyword). So I want to create a pipeline to break the values in the log lines down into separate keywords using 'tab' separators.

I have created a pipeline in Kibana Dev Tools as:

PUT _ingest/pipeline/rcp_log_pipeline_tab
{
  "description" : "rcp log tab pattern",
  "processors" : [
    {
      "csv" : {
        "field" : "message",
        "target_fields" : [
          "timestamp",
          "relativeTime",
          "thread",
          "processName",
          "sourceName",
          "logType",
          "logMessage"
        ],
        "separator" : " "
      }
    },
    {
      "rename" : {
        "field" : "timestamp",
        "target_field" : "@timestamp"
      }
    },
    {
      "rename" : {
        "field" : "@timestamp",
        "target_field" : "index_timestamp"
      }
    }

  ]
}
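
For what it's worth, the stored pipeline can be confirmed with a plain GET by name:

GET _ingest/pipeline/rcp_log_pipeline_tab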

and I have simulated it with

POST _ingest/pipeline/rcp_log_pipeline_tab/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "2021-01-02T00:01:00.134-08:00 1047176101054 0x0017  US-W10L2.Axxion.ToolSpud.  IoProvider  Background  Performing BankReadIOPoints"
      }
    }
  ]
}

And I get the results I am expecting. Perfect!

So I updated my filebeat.yml to use this pipeline

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST_PORT}']
  username: '${ELASTIC_USERNAME}'
  password: '${ELASTIC_PASSWORD}'
  pipeline: 'rcp_log_pipeline_tab'

Then I deleted the filebeat-* index in Kibana, removed the Filebeat registry so it would resend the log files, and restarted the Filebeat container... and I get nothing from Filebeat. The index is created, but it is empty. :frowning:
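
For reference, the reset looked roughly like this (the container name filebeat and the registry path are specific to my setup and the official image defaults, so adjust as needed):

# In Kibana Dev Tools:
DELETE filebeat-*

# On the Docker host (container name and registry path assumed):
docker exec filebeat rm -rf /usr/share/filebeat/data/registry
docker restart filebeat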

Everything I find scouring the web seems to indicate this is what I should be doing... so why is it not working?
Thanks for taking a look!

What do your Filebeat logs show?

oooo... where are they on a Mac? Not /var/log as far as I can tell.

Installed via Homebrew?

I'm running in Docker containers

The docker log would be what you are after.
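
Something along these lines, substituting your own container name (filebeat here is just a placeholder):

docker logs filebeat                  # dump everything the container has logged
docker logs --tail 100 -f filebeat    # follow the last 100 lines live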


Seems there is a problem with screen and accessing the tty on the Docker VM, so I can't get at the Docker logs that way (I am still searching for alternative ways to get to the logs). I do appreciate your patience and help.

OK, so 'docker logs' is working.

I was so bent on finding the Filebeat logs that I completely forgot to actually look at the Docker logs (oh, well).
That was a big help. It seems the pipeline is not parsing the docs coming from Filebeat the same way it did in Dev Tools, but at least I am getting something I can work with now. Thank you so much for knocking me in the direction of the Docker logs.


Hi @mhare
Perhaps add some failure handling to the ingest pipeline. See here.

It will help you debug... ahh, I see you just found the logs. I was typing this up...

2nd: When you just simulate pipelines, types are not checked. It's a good first check, but it does not actually try to insert a doc, so a type mismatch will not throw an error there. If an error is thrown when the doc is actually inserted, you will not see it AND the doc will not be inserted.
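
As a quick debugging aid, simulate can at least show per-processor results with the verbose flag (same sample message as above), though it still never touches the index mapping:

POST _ingest/pipeline/rcp_log_pipeline_tab/_simulate?verbose=true
{
  "docs": [
    {
      "_source": {
        "message": "2021-01-02T00:01:00.134-08:00 1047176101054 0x0017  US-W10L2.Axxion.ToolSpud.  IoProvider  Background  Performing BankReadIOPoints"
      }
    }
  ]
}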

3rd: I am not quite sure what you are trying to accomplish.
You rename timestamp to @timestamp and then rename @timestamp to index_timestamp, which means @timestamp will not actually be set in the doc by this pipeline; these renames all happen before the doc is written.

4th: Doing a rename on the timestamp field relies on some smarts under the covers that you cannot always count on; you really should use the date processor. It turns out your date has a good format, so the rename will work (I think).

5th: Be super careful with that tab in the processor; make sure smart tabs are not turning it into a space.
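
One way to take smart tabs out of the equation is to write the separator as the JSON escape \t instead of a literal tab character; a tiny inline simulate shows the split working (the field names here are just placeholders):

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "csv": {
          "field": "message",
          "target_fields": ["first_col", "second_col"],
          "separator": "\t"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "value1\tvalue2"
      }
    }
  ]
}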

So I suggest (besides finding the logs, which will probably say the doc is not being inserted) adding error handling, something like this:

PUT _ingest/pipeline/rcp_log_pipeline_tab
{
  "description": "rcp log tab pattern",
  "processors": [
    {
      "csv": {
        "field": "message",
        "target_fields": [
          "timestamp",
          "relativeTime",
          "thread",
          "processName",
          "sourceName",
          "logType",
          "logMessage"
        ],
        "separator": " ",
        "on_failure": [
          {
            "set": {
              "field": "error.message_csv",
              "value": "error in csv processor"
            }
          }
        ]
      }
    },
    {
      "date": {
        "field": "timestamp",
        "target_field": "@timestamp",
        "formats": ["date_optional_time||strict_date_optional_time"], 
        "on_failure": [
          {
            "set": {
              "field": "error.message_date",
              "value": "error in date processor"
            }
          }
        ]
      }
    }
  ]
}


# Create a mapping with an actual @timestamp field
DELETE my-test-data

PUT my-test-data
{
  "mappings": {
    "properties": {
      "@timestamp" : {
        "type": "date"
      }
    }
  }
}

# Now actually test writing a doc
POST my-test-data/_doc?pipeline=rcp_log_pipeline_tab
{
  "message": "2021-01-02T00:01:00.134-08:00 1047176101054 0x0017  US-W10L2.Axxion.ToolSpud.  IoProvider  Background  Performing BankReadIOPoints"
}

# See what it looks like 
GET my-test-data/_search
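
And if any docs do trip the on_failure handlers above, a search like this will surface them (it just checks for the error.message_csv field the pipeline sets):

GET my-test-data/_search
{
  "query": {
    "exists": {
      "field": "error.message_csv"
    }
  }
}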

Awesome stuff. Got this too late last night to try anything, but this afternoon looks like a good time to get to it. Thank you for taking the time and offering these suggestions.

OK, finally got the cycles to get back to this. @stephenb, your response led me to a correct implementation. The only real difference was using the ISO date format. Appreciate the help!
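
In case it helps anyone else, the date processor ended up roughly like this; a quick inline simulate with the sample timestamp shows it parsing (ISO8601 is the built-in format name the date processor accepts):

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "date": {
          "field": "timestamp",
          "target_field": "@timestamp",
          "formats": ["ISO8601"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "timestamp": "2021-01-02T00:01:00.134-08:00"
      }
    }
  ]
}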

