Logs ingestion

I’m trying to set up log ingestion for multiple systems. I have installed Filebeat on all hosts and I’m shipping system logs with the journald input and container logs with a filestream input for `/var/lib/docker/containers/*/*.log`. Everything is great up to here.
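Concretely, my inputs look roughly like this (the `id` values are just placeholders I made up):

```yaml
filebeat.inputs:
  - type: journald
    id: system-journald

  - type: filestream
    id: docker-containers
    paths:
      - /var/lib/docker/containers/*/*.log
    parsers:
      - container: ~   # decode Docker's JSON-per-line log format
```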

Now I want to start parsing those logs so that, instead of ending up in the Filebeat data streams with the message as plain text, they end up in separate indices with extracted fields.

If I set a default pipeline on the `filebeat-*` indices I can filter some logs that I want, extract fields, and reroute them to another index. After adding a new pipeline like that, how can I reindex old data through it? The issue is that when I request a manual reindex for all logs that were ingested before the pipeline existed, I have to specify a new destination index for the reindex operation. So all old logs from Filebeat will pass through the pipeline; those that match my filters will be parsed and indexed fine, but those that don’t will be copied to the reindex destination index. That’s useless for me: I want the pipeline to not bother at all with the non-matching logs. To my understanding you cannot specify a filter for the reindex operation. The whole index has to pass through it, and non-matching documents end up in a useless index.
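For context, what I have in mind is roughly this — the pipeline name, the condition, and the dataset are all made up for illustration:

```
PUT _ingest/pipeline/route-nginx
{
  "processors": [
    {
      "reroute": {
        "if": "ctx.container?.image?.name == 'nginx'",
        "dataset": "nginx"
      }
    }
  ]
}

PUT filebeat-*/_settings
{
  "index.default_pipeline": "route-nginx"
}
```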

Another issue is that I’m not sure how I could handle multiline logs with this. Is it possible to join multiple docs into a single new document and work with that? I haven’t found a way.

Generally it feels like I’m using the whole stack wrong; at every step I find that these operations don’t want to make things easy for me. So am I doing something fundamentally wrong here? Would a whole different approach be better for this?

Hi,

> Another issue is that I’m not sure how I could handle multiline logs with this. Is it possible to join multiple docs into a single new document and work with that? I haven’t found a way.

You should configure Filebeat itself to handle multiline logs and send them to Elasticsearch as single events. See the “Manage multiline messages” page in the Beats documentation.
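For example, with a filestream input the multiline handling goes into the `parsers` section. A minimal sketch — the pattern below is an assumption (it treats indented lines as continuations), so adjust it to your log format:

```yaml
filebeat.inputs:
  - type: filestream
    id: docker-containers
    paths:
      - /var/lib/docker/containers/*/*.log
    parsers:
      - container: ~      # decode Docker's JSON log format first
      - multiline:
          type: pattern
          pattern: '^\s'  # assumed: continuation lines start with whitespace
          negate: false
          match: after
```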

> I want the pipeline to not bother at all with the non-matching logs. To my understanding you cannot specify a filter for the reindex operation.

Yes, you can! It is part of the `source` property of the reindex API (see “Reindex documents” in the Elasticsearch API documentation):

```
POST _reindex
{
  "source": {
    "index": "filebeat-*",
    "query": {
      ...
    }
  }
}
```
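For instance, a full request might look like the following — the query, index names, and pipeline name are placeholders to replace with your own. Note that you can also name the ingest pipeline explicitly under `dest` instead of relying on the destination index’s default pipeline:

```
POST _reindex
{
  "source": {
    "index": "filebeat-*",
    "query": {
      "match": { "container.image.name": "nginx" }
    }
  },
  "dest": {
    "index": "logs-nginx",
    "pipeline": "route-nginx"
  }
}
```

With the `query`, only matching documents are pulled from the source, so non-matching logs never reach the destination index at all.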

Best regards

Wolfram