Filebeat Apache module not showing data on Dashboards when using Logstash

Good afternoon all.

I've recently installed Filebeat and enabled the system and apache modules.

After that I set filebeat.yml to point to ES and Kibana and ran
'filebeat setup -e'

Everything went as expected. With apache2 running, I started Filebeat and saw the number of documents increase in my data stream:
.ds-filebeat-8.4.3-2022.11.09-000001

In Kibana I opened both '[Filebeat System] ECS' and '[Filebeat Apache] Access and error logs ECS'. Both dashboards showed data.

I stopped Filebeat and edited filebeat.yml to point to Logstash.

In Logstash I created the following pipeline:

input {
    beats {
        port => "5044"
    }
}

output {
  if [@metadata][pipeline] {
    elasticsearch {
      hosts => ["https://192.168.0.111:9200","https://192.168.0.112:9200","https://192.168.0.113:9200"]
      manage_template => false
      index => "%{[@metadata][beat]}-%{[@metadata][version]}"
      action => "create"
      pipeline => "%{[@metadata][pipeline]}"
      cacert => '/certs/elastic/http_ca.crt'
      user => "${LS_USER}"
      password => "${LS_PWD}"
    }
  } else {
    elasticsearch {
      hosts => ["https://192.168.0.111:9200","https://192.168.0.112:9200","https://192.168.0.113:9200"]
      manage_template => false
      cacert => '/certs/elastic/http_ca.crt'
      index => "%{[@metadata][beat]}-%{[@metadata][version]}"
      action => "create"
      user => "${LS_USER}"
      password => "${LS_PWD}"
    }
  }
}

Once I saved the file I started Logstash, which ran properly, and restarted Filebeat.

After this, more data was added to the data stream, but when I checked the same dashboards, '[Filebeat System] ECS' showed data and '[Filebeat Apache] Access and error logs ECS' didn't.

As soon as I point Filebeat at ES again, I see data in both dashboards; when it points to Logstash, '[Filebeat Apache] Access and error logs ECS' stops showing data.

Any idea about what could be wrong?

Am I missing something?

Thank you in advance and best regards.

It's changed because the prebuilt dashboards use specific indices to look for the data, which you changed with your Logstash outputs.

Why did you send it via Logstash anyway?

Hi Warkolm.

Thank you for the answer.

I've checked at least four times:
Whether A) I set Filebeat to point to ES and Kibana, or B) to point to Logstash, the documents generated by Filebeat always go to the same index:
.ds-filebeat-8.4.3-2022.11.09-000001

How is it, then, that in both cases A and B the number of documents in that same index increases?

So I'm afraid my case is not the one you describe. Using Logstash didn't change the index where the documents are stored. In both cases the count increased, and I only had one FB running.

Why do I want to use Logstash? Because this is how one of our clients wants it. I guess they want to use Logstash as a focal point to enrich, filter, etc.

So, considering that using FB + Logstash should not be a problem, I can't tell the client to change the way they work just because I lack the knowledge to make it work.

Thank you again !!

Regards.

Hi again.

I've been doing more research. I've disabled the 'system' module so that I only have apache documents, which makes it easier to check what's going on.

I noticed that once Logstash sits between ES and FB, all (and I mean all) of the documents I can see in Kibana's 'Discover' tool for the 'filebeat-*' data view look like this:

{
  "@timestamp": [
    "2022-11-10T11:29:51.291Z"
  ],
  "@version": [
    "1"
  ],
  "agent.ephemeral_id": [
    "9469c918-c786-48cc-af6a-33c3771219db"
  ],
  "agent.hostname": [
    "ubuntuelk03"
  ],
  "agent.id": [
    "7da603cc-71fd-43c0-89fa-91aecfbeaeaf"
  ],
  "agent.name": [
    "ubuntuelk03"
  ],
  "agent.type": [
    "filebeat"
  ],
  "agent.version": [
    "8.4.3"
  ],
  "ecs.version": [
    "1.12.0"
  ],
  "error.message": [
    "field [event.original] already exists"
  ],

I omitted the rest of the document's info, as the error.message gives a clue:
"field [event.original] already exists"

Any idea why I am seeing this behaviour?

Thank you and regards.

Carlos T.

This happens because Logstash sits between Filebeat and Elasticsearch. All modules expect you to send the data directly to Elasticsearch; putting Logstash in between changes the message that arrives in Elasticsearch, which can break the ingest pipeline used.

In this case, the ingest pipeline used is this one.

It is failing in the second processor.

- rename:
    field: message
    target_field: event.original

You may try to remove this processor from the ingest pipeline, but you will need to remove it again every time you update Filebeat, since the update also overwrites the ingest pipeline.

Alternatively, you may remove the event.original field in Logstash, or set pipeline.ecs_compatibility to disabled so that Logstash does not create the field in the first place.
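For the Logstash-side option, a minimal sketch (an assumption about how this would look, not tested against this exact setup) is a mutate filter placed between the input and output blocks of the pipeline posted above. It drops the event.original field that Logstash adds, so the rename processor in the ingest pipeline can create it again from message:

```
filter {
  # Assumption: at this point event.original was added by Logstash itself
  # (ECS compatibility mode), not by Filebeat, so dropping it loses nothing.
  mutate {
    remove_field => ["[event][original]"]
  }
}
```

If anything else in the Logstash pipeline relies on event.original, this would need a condition around it instead of an unconditional removal.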

Is there any reason at all to use logstash? You are not doing any filtering in your Logstash pipeline and your output will send everything to the same index.

Indeed Leandro.

Right now I don't really need Logstash in the middle, but I want to learn how to filter, enrich, discard fields, etc., because I want to be ready for an environment where it could be necessary.

That said, I've just deleted the processor. I did it in the two ingest pipelines that the dashboard reads data from and, yes Sir!!, it worked.

So thank you very much for your help Leandro, and also Warkolm.

I really appreciate your time and knowledge.

Best regards.

Carlos!!

Hi @leandrojmp

If I get a chance I will look into this.

I have literally set up hundreds of Filebeat -> Logstash -> Elasticsearch demos and instances.

Specifically, I've done the Apache one many times in 7.x. I never needed to change or update the ingest pipeline.

It seems like something has changed recently, because this architecture always worked before without modifying anything.

If I get a chance I will look into the message and event.original. I did see it in a case I was just working with recently.

I don't know if this is a broad bug or something intentionally changed.

If I'm not wrong, this is a breaking change in 8.X related to ecs fields.

Since 8.0 the pipeline.ecs_compatibility setting is on by default, so Logstash outputs ECS fields and, in this case, creates the event.original field. In version 7.x the setting was disabled by default.

For example, if you test the following pipeline in Logstash:

input {
    stdin {}
}
output {
    stdout {}
}

And you set pipeline.ecs_compatibility: v8 in logstash.yml

and you run:

echo "sample message" | /opt/logstash/bin/logstash --path.settings /opt/logstash/config -f /opt/logstash/pipelines/config.conf

You will get this output:

{
         "event" => {
        "original" => "sample message"
    },
       "message" => "sample message",
          "host" => {
        "hostname" => "server"
    },
      "@version" => "1",
    "@timestamp" => 2022-11-10T21:46:56.615Z
}

If you set pipeline.ecs_compatibility as disabled you get this output:

{
      "@version" => "1",
       "message" => "sample message",
          "host" => "server",
    "@timestamp" => 2022-11-10T21:50:08.861Z
}

So if someone is using Logstash with Filebeat or Elastic Agent modules and Elasticsearch Ingest pipelines, they may have some issues because some of those pipelines will try to copy message into event.original, which may already exist.

I think that this proposed change to use @metadata.original instead of event.original would be better.

@leandrojmp Thanks!

I just put this in as well, and I just tested: 7.x works fine, 8.x is broken.
The pipeline has not changed for quite some time.
I will link to this thread as well.

Filebeat modules -> Logstash -> Elasticsearch should work in 8.x without issues.

Ahhh, I was trying the compatibility mode on the input, not in logstash.yml.

So you can only put that in logstash.yml? Or can you put it in the pipeline definition?

I confirmed it worked in logstash.yml.

But this is still not good. This should work OOTB. I think the pipeline should have a condition check or fall through on the set.

You can put it in logstash.yml to apply it to all pipelines, or in each pipeline definition inside pipelines.yml.

My temp fix would be either:

A) Drop the field in Logstash, or

B) Update the module's pipeline to

{
  "rename": {
    "field": "message",
    "target_field": "event.original",
    "if": "ctx.event?.original == null"
  }
}

Which works.

@leandrojmp Huh, what do you know...

Known issue: this plus a couple of internal ones, etc. Not sure when it will get fixed... it's one of those cross-team issues :)