Retrofitting Filebeat into my ELK stack

I am ingesting 8 different CSV file schemas using 8 Logstash pipelines that read from a Windows file share. IT has recommended that I stop reading across the file share and use Filebeat instead.
I'm not finding much information on this.
Here are 2 of the 'inputs' sections:

#--------------------------------------------------------------
# MemoryLeak Input

  - type: log
    enabled: false
    paths:
    - C:\MemoryLeakTest\v03*-memleak*.log
      tags: ["memleak"]
      fields: {log_type: memleak}

#--------------------------------------------------------------
# AM-300 TaskStat Input

  - type: log
    enabled: false
    paths:
    - C:\MemoryLeakTest\v03*_am-300-taskstat*.log
      tags: ["taskstat-am-300"]
      fields: {log_type: taskstat-am-300}

The output is:

output.logstash:
  # The Logstash hosts
  hosts: [":5044"]

Now, how the heck do I get this to work with my 8 pipeline files?

I put this at the top of one of the pipeline files:

input {
  beats {
    host => "0.0.0.0"
    port => "5044"
    client_inactivity_timeout => 180
  }
}

My question is, does this look right so far?
And then, how do I split the data across the 8 filters in the 8 pipeline files?

Thanks, Mike

You could use pipeline-to-pipeline communication.

You would be using a "distributor". The if logic in the output section could reference the [fields][log_type] value you have set in filebeat.
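As a rough illustration of the distributor idea, here is a minimal pipelines.yml sketch. The pipeline ids and the `memleakpl` address are placeholders, and the `stdout` output only stands in for whatever the existing memleak pipeline file already does. In practice each downstream pipeline could use `path.config` to point at its existing file, with that file's original input replaced by a `pipeline` input.

```yaml
# pipelines.yml (sketch; ids and addresses are illustrative)
- pipeline.id: beats-server
  config.string: |
    # Single entry point: only this pipeline listens on the Beats port.
    input { beats { port => 5044 } }
    output {
      # Route on the custom field set in filebeat.yml.
      if [fields][log_type] == "memleak" {
        pipeline { send_to => memleakpl }
      }
    }
- pipeline.id: memleak
  config.string: |
    # Receives events from the distributor via the virtual address.
    input { pipeline { address => memleakpl } }
    # The existing memleak filter section would go here.
    output { stdout { codec => rubydebug } }
```

The `send_to` value in the distributor has to match the `address` of exactly the pipeline that should receive those events; one such pair per log type gives you the 8-way split.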

OK, the Distributor pattern looks like what I need.
In the description in your link, they have a set of decision points like:

output {
  if [type] == "apache" {
    pipeline { send_to => weblogs }
  } else if [type] == "system" {
    pipeline { send_to => syslog }
  } else {
    pipeline { send_to => fallback }
  }
}

I am assuming I need to set [type] in my pipeline. My question now is, how?
In my example I posted above, I have:

tags: ["memleak"]
fields: {log_type: memleak}

I am guessing 'tag' is not needed and I can use a decision point as:

if [log_type] == memleak {
(do pipeline connection stuff here)
}

Nope, that didn't work. The 'fields' entry in my filebeat.yml is invalid.
So my question remains, how do I set the 'type' value for the decision point?

I think I'm getting close, but I still don't understand how to set [type] to an arbitrary value.

For filebeat.yml, I have:

- type: log
  enabled: true
  paths:
    - C:\Users\xyzzy\CommonLogs\*-memleak*.log
  fields:
    "log_type": "memleak"

My decision point in pipelines.yml is:

- pipeline.id: beats-server
  config.string: |
    input { beats { port => 5044 } }
    output { 
      if ["log_type"] == "memleak" {
        pipeline { send_to => memleakpl }
      }
    }

If I remove the decision point and just call the 'send_to', it works. So I'm just trying to get the 'if' statement to work.

From what I have read, in my filebeat yaml I can either:

tags: ["memleak"]
- or -
fields:
   log_type: "memleak"

In my pipelines.yml I need to evaluate one of these and make a decision:

output {
   if <something good goes here> {
      pipeline { send_to => memleakpl }
   }
}

I cannot find how to link either a tag or a field to the conditional.
Can you help me see the light?

Logstash syntax is:

if [fields][log_type] == "memleak" {
  # do something
}

I think you are close :)
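The [fields] prefix is needed because Filebeat by default nests custom fields under a top-level `fields` key in the event. If you would rather test [log_type] directly, Filebeat has a `fields_under_root` option. A minimal sketch, reusing the path from the post above:

```yaml
# filebeat.yml (sketch)
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - C:\Users\xyzzy\CommonLogs\*-memleak*.log
    fields:
      log_type: memleak
    # Default is false: custom fields are nested, so Logstash sees
    # [fields][log_type]. With true, the field is placed at the event
    # root and the conditional becomes: if [log_type] == "memleak" { ... }
    fields_under_root: true
```

One thing to watch with `fields_under_root: true`: a custom field that shares a name with a field Filebeat already sets will overwrite it, so pick names that do not collide.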

I will give that a try.
What I just completed was in filebeat.yml:

- type: log
  enabled: true
  tags: ["memleak"]
  paths:
    - C:\Users\xyzzy\CommonLogs\*-memleak*.log

- type: log
  enabled: true
  tags: ["rmc3_taskstat"]
  paths:
    - C:\Users\xyzzy\CommonLogs\*_rmc3-taskstat*.log

and in pipelines.yml:

- pipeline.id: beats-server
  config.string: |
    input { beats { port => 5044 } }
    output { 
      if "memleak" in [tags] {
        pipeline { send_to => memleakpl }
      } else if "rmc3_taskstat" in [tags] {
        pipeline { send_to => rmc3_taskstatpl }
      }
    }

Which seems to be working.
I'll be adding another 6 logs or so in a similar fashion. Do you see a problem with this approach?

Len, your solution also works.
Is one better than the other?

Fields vs tags? I'd say whichever is understandable, logical and maintainable in your environment. I doubt there is a measurable performance difference.

I guess the old coding technique of putting the most frequent if condition first would apply either way.
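One thing that might be worth adding as the list grows to 8 branches is a catch-all. A sketch of the tag-based distributor with the branches ordered by expected frequency and a final `else`; which log type is most frequent is assumed here, and the `fallbackpl` address and `fallback` pipeline are hypothetical names:

```yaml
# pipelines.yml (sketch)
- pipeline.id: beats-server
  config.string: |
    input { beats { port => 5044 } }
    output {
      # Most frequent log type first (assumed here to be memleak).
      if "memleak" in [tags] {
        pipeline { send_to => memleakpl }
      } else if "rmc3_taskstat" in [tags] {
        pipeline { send_to => rmc3_taskstatpl }
      } else {
        # Hypothetical catch-all so unmatched events are not silently lost.
        pipeline { send_to => fallbackpl }
      }
    }
- pipeline.id: fallback
  config.string: |
    input { pipeline { address => fallbackpl } }
    # Print unmatched events so a missing tag or typo is easy to spot.
    output { stdout { codec => rubydebug } }
```

Without the `else` branch, an event whose tag matches none of the conditions reaches no output at all, which can make a mistyped tag in filebeat.yml hard to notice.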

Thanks!