Best practices for reading Elasticsearch data into Logstash

Greetings to everybody,

Having only scratched the surface of the ELK stack, I find myself in the following situation: one of my Linux servers runs a few services that have built-in functionality to send logs directly to Elasticsearch. That functionality seems to remove the need for installing Filebeat to push the logs. My initial plan was to use Filebeat to feed the logs into Logstash, filter them there, and then output the results to Elasticsearch (for analysis) and to third-party alert services (for creating custom alarms).

With the built-in functionality the logs go directly into Elasticsearch. What is the best practice for filtering and outputting the logs? I see three options:

  1. use the elasticsearch input plugin for Logstash to read the logs back out of Elasticsearch and do the filtering there (rough sketch after this list),

  2. forget the built-in functionality of my services and use Filebeat instead,

  3. find a way to keep the built-in functionality but point it at Logstash instead of Elasticsearch (I guess some kind of input {} fiddling).
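
For what it's worth, here is roughly how I imagine option 1 would look on the Logstash side. This is only an untested sketch; the hosts, index names, query and the level condition are placeholders, not what my services actually produce.

```
# Option 1 (untested sketch): pull documents back out of Elasticsearch,
# filter them in Logstash, then write them to a new index.
# Hosts, index names, query and the filter condition are placeholders.
input {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mi-*"
    query => '{ "query": { "match_all": {} } }'
  }
}

filter {
  # drop whatever counts as insignificant, e.g. debug-level events
  if [level] == "debug" {
    drop { }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mi-filtered-%{+YYYY.MM.dd}"
  }
}
```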

Any suggestions will be highly appreciated!

🙂

Sorry for bumping this, but could this be the wrong place to ask this question?

I don't understand what exactly "filtering" means in your context. Do you actually want to keep only a subset of the documents? Is it something you want to do every time, e.g. never index HTTP 200 messages?

Filebeat can do that, and it is very good at it, since dropping events at the source reduces network traffic a lot.
For the third option the easiest way, IMO, is to use the http input and just post your events to Logstash with HTTP requests.
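
Something along these lines, as a rough sketch (the port and hosts are just examples):

```
# Rough sketch of the http input approach: Logstash listens for HTTP
# POSTs and turns each request into an event. Port and hosts are examples.
input {
  http {
    host => "0.0.0.0"
    port => 8080
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```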

Exactly, that's what I mean by filtering: I would like to filter out some insignificant logs that my services produce, preferably before sending them (I would need Filebeat for that) or on the log-server side (I would need Logstash for that). The problem is that with the built-in functionality the logs go directly into Elasticsearch, bypassing Logstash entirely.

Now, assuming I use the built-in functionality and the data is already in Elasticsearch, what is the best way to filter it? Is reading the data out of Elasticsearch into Logstash, filtering it, and outputting it back to Elasticsearch a good design for my case? Or should I forget the built-in functionality and just use Filebeat instead?

Well, if you can't modify the built-in behavior then yes, I'd definitely use Filebeat instead.
It comes with many nice features, including monitoring.
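
On the Logstash side, the Filebeat route is just the beats input; here is a rough sketch (5044 is the conventional Beats port, and the stdout output is only there for testing):

```
# Rough sketch of the Logstash side when Filebeat ships the logs.
input {
  beats {
    port => 5044    # conventional Beats port
  }
}

output {
  stdout { codec => rubydebug }   # for testing; swap in elasticsearch later
}
```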


After a bit of digging I saw that the built-in functionality sends HTTP POST requests to Elasticsearch with Accept-Encoding: gzip. Elasticsearch handles those nicely and everything works as expected.

Then I changed the port in the built-in functionality so that it sends the logs to Logstash and its http input plugin. This also seems to work as an input, but it seems that Logstash does not decode the message.

I assume I should continue posting in the Logstash category instead of here...

PS: Below is a snapshot of the HTTP request which I'm trying to feed into Logstash:

```
Host: localhost:5043
User-Agent: Go-http-client/1.1
Content-Length: 2508
Content-Type: application/json
Accept-Encoding: gzip

{"index":{"_index":"mi-2018.07.06","_type":"log"}}
{"fields":{"Dur":14,"ID":"40a"},"level":"info","timestamp":"2018-07-06T14:11:49.16420038Z","message":"DataReady"}
{"index":{"_index":"mi-2018.07.06","_type":"log"}}
{"fields":{"Dur":19,"ID":"40a"},"level":"info","timestamp":"2018-07-06T14:12:19.164899161Z","message":"DataReady"}
```

I moved the discussion to #logstash

Why is the application claiming to be posting application/json when the payload as a whole isn't valid JSON? You can try switching to the json_lines codec in your http input. If that doesn't work, use a split filter to split each posted HTTP payload into multiple events, then use a json filter to parse the JSON string in each such event.
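
As a rough, untested sketch of the split-plus-json fallback (assuming the whole posted payload ends up in the message field, one JSON document per line):

```
# Untested sketch: split the posted bulk payload into one event per
# line, then parse each line as JSON. Assumes the raw payload lands
# in the "message" field.
filter {
  split {
    field => "message"            # one event per line of the payload
  }
  json {
    source => "message"
    remove_field => ["message"]   # optional: drop the raw string afterwards
  }
  # with the bulk format every other line is an {"index": ...} action
  # line; you probably want to drop those after parsing
  if [index] {
    drop { }
  }
}
```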

I think they are using the bulk API to send the log files. Viewing the data directly in Elasticsearch is fine: I can see the individual log lines. But when I read the index into Logstash and then send it back to Elasticsearch (no filtering taking place, just testing the input/output), the same log data appears as one bulk request instead of individual events.

The following is the _source field of the data when sent directly to Elasticsearch, taken from Kibana:

fields.Duration:    18817
level:    info
timestamp:    July 6th 2018, 16:37:19.131
message:    Read status
_id:    QZPPb2QB0L1kP94Hr-Ep
_type:    log
_score:    -

The following is the _source field of the data when reading the index into Logstash and outputting it back to Elasticsearch:

@timestamp:    July 9th 2018, 12:09:20.380
@version:    1
host:    0:0:0:0:0:0:0:1
headers.http_user_agent:    Go-http-client/1.1
headers.request_uri:    /_bulk
headers.http_version:    HTTP/1.1
headers.http_accept_encoding:    gzip
headers.request_method:    POST
headers.content_length:    2508
headers.content_type:    application/json
headers.request_path:    /_bulk
index._type:    log
_id:    OKRNfmQB0L1kP94HbM4G
_type:    doc

Am I missing something here?

I added codec => es_bulk to my pipeline configuration, but it made no difference as far as I can tell...
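
For completeness, this is roughly what I tried. I am only guessing here, but if I read the http input docs correctly it selects a codec per Content-Type via its additional_codecs setting and maps application/json to the json codec by default, which would explain why the main codec setting alone seemed to have no effect:

```
# Roughly what I tried, plus my guess at the missing piece: overriding
# additional_codecs so that application/json payloads also go through
# the es_bulk codec (assumes the logstash-codec-es_bulk plugin is installed).
input {
  http {
    port => 5043
    codec => es_bulk
    additional_codecs => { "application/json" => "es_bulk" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```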
