Filter input data from Filebeat using Logstash?

Hello! I managed to set up the stack Filebeat -> Logstash -> Elasticsearch, but I am using journald as an input for my Filebeat logs, which means that a lot of unnecessary data ends up saved in the ES index. I thought the mappings in my Logstash template would only let through the declared properties, so my template currently looks like this:

{
  "template": "logstash",
  "index_patterns": [
    "logstash-*"
  ],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "level": {
        "type": "byte"
      },
      "request-id": {
        "type": "keyword"
      },
      "app-id": {
        "type": "keyword"
      },
      "instance-id": {
        "type": "keyword"
      },
      "route": {
        "type": "text"
      },
      "ip": {
        "type": "ip"
      },
      "@timestamp": {
        "type": "date"
      }
    }
  }
}

However, a search against ES returns a lot of fields, so I was wondering where exactly in the stack I should write some kind of filter to trim all unnecessary data before it is stored in ES. I'd appreciate a push in the right direction.

Hi,

One way of filtering out unnecessary fields from events in Logstash is by using the drop filter with the remove_field option.

This option removes the listed fields, so only the remaining fields will be mapped and stored in Elasticsearch.

    filter {
      drop {
        # remove_field takes a list of field names to delete from the event;
        # %{somefield} is a sprintf reference that expands to that field's value
        remove_field => [ "foo_%{somefield}" ]
      }
    }

You can refer to the doc below if you need additional information.

Thanks,
Asish

Yes, perfect! Thank you very much. Would filter { drop { remove_field ... } } do the same as something like filter { mutate { remove_field ... } }?

Mutate does not have a remove_field filter.

See the doc below.

So it's better to go with the format I've given.

According to your link:
Mutate filter plugin | Logstash Reference [8.5] | Elastic: "The following configuration options are supported by all filter plugins", and then it lists the common options, including remove_field :smiley: I actually tested it with mutate { remove_field ... } and it appears to be working.
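
For completeness, a minimal sketch of the variant I tested; the field names in remove_field are just examples of Beats metadata, not something to copy verbatim:

    filter {
      mutate {
        # example metadata fields to discard; substitute the field names
        # you actually see in your own documents
        remove_field => [ "agent", "ecs", "log" ]
      }
    }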

Both versions seem to work, thanks again! I will continue reading about plugins and stuff.


The default for Elasticsearch is to enable dynamic mapping, so any new field on a document creates a new field in the index. You can turn that off, so that only the fields in your template are created.
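
If you would rather have indexing fail loudly on unmapped fields than have them silently ignored, the dynamic setting also accepts "strict". A minimal fragment of the mappings object:

    "mappings": {
      "dynamic": "strict",
      "properties": {
        "level": { "type": "byte" }
      }
    }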

Sounds like a useful parameter, if only it worked... I checked your link to figure out how to use it:

From the examples on the page it seems like I need to add this to my mappings object, so something like this:

{
  "template": "logstash",
  "index_patterns": [
    "logstash-*"
  ],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "dynamic": false,
    "properties": {
      "level": {
        "type": "byte"
      }
    }
  }
}

So this should store only the declared fields, right? Of course, it doesn't work. When I check the documents in Elasticsearch, they still contain the entire list of fields sent by Logstash.

If you get the mapping, what is the value for dynamic?
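
Keep in mind that index templates only take effect when an index is created, so an index that existed before the template was updated will still carry its old mapping. A quick way to check, assuming a local node on the default port:

    # fetch the mapping of all matching indices and look for the
    # top-level "dynamic" value ("false" if the template was applied)
    curl -s 'http://localhost:9200/logstash-*/_mapping?pretty'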

Thanks, but after a discussion with my peers I learned that we should be switching to OpenSearch, and instead of sending from Logstash directly to Elasticsearch I need to send from Logstash to Graylog using the GELF format. This opened up an entirely new family of issues to figure out, so this thread should be considered closed for now.
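
(For anyone who lands here later: a minimal sketch of what that output might look like, assuming the logstash-output-gelf plugin is installed; the hostname is made up.)

    output {
      gelf {
        # hypothetical Graylog endpoint; 12201 is the default GELF UDP port
        host => "graylog.example.com"
        port => 12201
      }
    }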

OpenSearch/OpenDistro are AWS-run products and differ from the original Elasticsearch and Kibana products that Elastic builds and maintains. You may need to contact them directly for further assistance.

(This is an automated response from your friendly Elastic bot. Please report this post if you have any suggestions or concerns :elasticheart: )

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.