Best practices for reading Elasticsearch data into Logstash

Greetings to everybody,

Having only scratched the surface of the ELK stack, I find myself in the following situation: one of my Linux servers runs a few services that have built-in functionality to send logs directly to Elasticsearch. That functionality seems to remove the need for installing Filebeat to push the logs. My initial plan was to use Filebeat to feed the logs into Logstash, filter them there, and then output the results to Elasticsearch (for analysis) and to third-party alert services (for creating custom alarms).

With the built-in functionality the logs go directly into Elasticsearch. What is the best practice for filtering and outputting the logs? I see three options:

  1. use the elasticsearch input plugin for Logstash to read the logs back out of Elasticsearch and do the filtering there (rough sketch after this list),

  2. forget the built-in functionality of my services and use Filebeat instead,

  3. find a way to keep the built-in functionality but point it at Logstash instead of Elasticsearch (I guess some kind of input {} fiddling).
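
For what it's worth, here is roughly how I imagine option 1 would look on the Logstash side. This is only an untested sketch; the hosts, index names, query and the level condition are placeholders, not what my services actually produce.

```
# Option 1 (untested sketch): pull documents back out of Elasticsearch,
# filter them in Logstash, then write them to a new index.
# Hosts, index names, query and the filter condition are placeholders.
input {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mi-*"
    query => '{ "query": { "match_all": {} } }'
  }
}

filter {
  # drop whatever counts as insignificant, e.g. debug-level events
  if [level] == "debug" {
    drop { }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mi-filtered-%{+YYYY.MM.dd}"
  }
}
```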

Any suggestions will be highly appreciated!

🙂

Sorry for bumping this, but could this be the wrong place to ask this question?

I don't understand what exactly "filtering" means in your context. Do you actually want to keep only a subset of the documents? Is it something you want to do every time, e.g. never index HTTP 200 messages?

Filebeat can do that, and it is very good at it, since dropping events at the source reduces network traffic a lot.
For the third option the easiest way, IMO, is to use the http input and just post your events to Logstash with HTTP requests.
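
Something along these lines, as a rough sketch (the port and hosts are just examples):

```
# Rough sketch of the http input approach: Logstash listens for HTTP
# POSTs and turns each request into an event. Port and hosts are examples.
input {
  http {
    host => "0.0.0.0"
    port => 8080
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```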

Exactly, that's what I mean by filtering: I would like to filter out some insignificant logs that my services produce, preferably before sending them (I would need Filebeat for that) or on the log-server side (I would need Logstash for that). The problem is that with the built-in functionality the logs go directly into Elasticsearch, bypassing Logstash entirely.

Now, assuming I use the built-in functionality and the data is already in Elasticsearch, what is the best way to filter it? Is reading the data out of Elasticsearch into Logstash, filtering it, and outputting it back to Elasticsearch a good design for my case? Or should I forget the built-in functionality and just use Filebeat instead?

Well, if you can't modify the built-in behavior then yes, I'd definitely use Filebeat instead.
It comes with many nice features, including monitoring.
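
On the Logstash side, the Filebeat route is just the beats input; here is a rough sketch (5044 is the conventional Beats port, and the stdout output is only there for testing):

```
# Rough sketch of the Logstash side when Filebeat ships the logs.
input {
  beats {
    port => 5044    # conventional Beats port
  }
}

output {
  stdout { codec => rubydebug }   # for testing; swap in elasticsearch later
}
```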


After a bit of digging I saw that the built-in functionality sends HTTP POST requests to Elasticsearch with Accept-Encoding: gzip. Elasticsearch handles those nicely and everything works as expected.

Then I changed the port in the built-in functionality so that it sends the logs to Logstash and its http input plugin. This also seems to work as an input, but it seems that Logstash does not decode the message.

I assume I should continue posting in the Logstash category instead of here...

PS: Below is a snapshot of the HTTP request which I'm trying to feed into Logstash:

```
Host: localhost:5043
User-Agent: Go-http-client/1.1
Content-Length: 2508
Content-Type: application/json
Accept-Encoding: gzip

{"index":{"_index":"mi-2018.07.06","_type":"log"}}
{"fields":{"Dur":14,"ID":"40a"},"level":"info","timestamp":"2018-07-06T14:11:49.16420038Z","message":"DataReady"}
{"index":{"_index":"mi-2018.07.06","_type":"log"}}
{"fields":{"Dur":19,"ID":"40a"},"level":"info","timestamp":"2018-07-06T14:12:19.164899161Z","message":"DataReady"}
```

I moved the discussion to #logstash

Why is the application claiming to be posting application/json when the payload as a whole isn't valid JSON? You can try switching to the json_lines codec in your http input. If that doesn't work, use a split filter to split each posted HTTP payload into multiple events, then use a json filter to parse the JSON string in each such event.
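
As a rough, untested sketch of the split-plus-json fallback (assuming the whole posted payload ends up in the message field, one JSON document per line):

```
# Untested sketch: split the posted bulk payload into one event per
# line, then parse each line as JSON. Assumes the raw payload lands
# in the "message" field.
filter {
  split {
    field => "message"            # one event per line of the payload
  }
  json {
    source => "message"
    remove_field => ["message"]   # optional: drop the raw string afterwards
  }
  # with the bulk format every other line is an {"index": ...} action
  # line; you probably want to drop those after parsing
  if [index] {
    drop { }
  }
}
```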

I think they are using the bulk API to send the log files. Viewing the data directly in Elasticsearch is fine: I can see the individual log lines. But when I read the index into Logstash and then send it back to Elasticsearch (no filtering taking place, just testing the input/output), the same log data appears as one bulk request instead of individual events.

The following is the _source field of the data when sent directly to Elasticsearch, taken from Kibana:

fields.Duration:    18817
level:    info
timestamp:    July 6th 2018, 16:37:19.131
message:    Read status
_id:    QZPPb2QB0L1kP94Hr-Ep
_type:    log
_score:    -

The following is the _source field of the data when reading the index into Logstash and outputting it back to Elasticsearch:

@timestamp:    July 9th 2018, 12:09:20.380
@version:    1
host:    0:0:0:0:0:0:0:1
headers.http_user_agent:    Go-http-client/1.1
headers.request_uri:    /_bulk
headers.http_version:    HTTP/1.1
headers.http_accept_encoding:    gzip
headers.request_method:    POST
headers.content_length:    2508
headers.content_type:    application/json
headers.request_path:    /_bulk
index._type:    log
_id:    OKRNfmQB0L1kP94HbM4G
_type:    doc

Am I missing something here?

I added codec => es_bulk to my pipeline configuration, but it made no difference as far as I can tell...
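
For completeness, this is roughly what I tried. I am only guessing here, but if I read the http input docs correctly it selects a codec per Content-Type via its additional_codecs setting and maps application/json to the json codec by default, which would explain why the main codec setting alone seemed to have no effect:

```
# Roughly what I tried, plus my guess at the missing piece: overriding
# additional_codecs so that application/json payloads also go through
# the es_bulk codec (assumes the logstash-codec-es_bulk plugin is installed).
input {
  http {
    port => 5043
    codec => es_bulk
    additional_codecs => { "application/json" => "es_bulk" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```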
