How to collect the logs from multiple machines to my server efficiently?

(Pawan Chandra) #1

I want to send logs from multiple machines to my server via some endpoint. However, the file they would be written to is a log file with a limit on the number of entries it can store (maybe a few thousand), while I expect around 10 million logs. What is the best approach to log storage in my case?

I am currently using filebeat to read the files from my local system, which is working fine.

(Jaime Soriano) #2

Hi @Pawan_Chandra,

You are reading the logs with filebeat, but where are you shipping them?

I think a good setup is to collect logs with filebeat and ship them to an Elasticsearch cluster. Depending on your needs, you may also add Logstash between your filebeats and your Elasticsearch cluster.

(Pawan Chandra) #3

Hello @jsoriano, thank you for replying so quickly.
The flow is:

    Log File --> filebeat --> logstash --> 1. elasticsearch <-- kibana
                                 |-------> 2. MongoDb

My end-to-end flow is working completely fine. I have used my logstash config file with multiple outputs.

One output is stdout, one is mongo and one is elasticsearch.
My logstash config file is:
    input {
      beats {
        port => 5044
      }
    }
    output {
      elasticsearch {
        hosts => "localhost:9200"
        manage_template => false
        index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
      }
      mongodb {
        uri => "mongodb://localhost:27017"
        database => "mycol" # name of database
        collection => "collab" # name of collection
      }
      stdout { codec => rubydebug }
    }

I just want to know what would be the most efficient way to store logs from multiple systems on mine.

(Jaime Soriano) #4

It seems that you have a good foundation for collecting logs in a scalable way. The most efficient way depends on the systems you have. Are you having any specific problem collecting logs with your current architecture?

(Pawan Chandra) #5

For now, I am trying to run multiple filebeat instances on different machines and collect their respective logs on my central machine. I am trying to figure out how to do that, and the only steps I can see are:

  1. Change the output in the filebeat config to my machine's IP address instead of the usual "localhost".
  2. Start filebeat on the remote machine.

But it's not working. Am I working in the right direction?
If yes, how do I move ahead?

If no, are there any better alternatives?

(Jaime Soriano) #6

filebeat has to be installed on every machine you want to collect logs from, and its logstash output has to be configured with the address of your logstash machine.
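
As an illustration of that setup, here is a minimal sketch that renders the relevant part of a filebeat.yml for one remote machine. The helper name `render_filebeat_config`, the address 192.0.2.10 and the log path are hypothetical placeholders, not values from this thread. The `filebeat.inputs` and `output.logstash` keys assume a reasonably recent filebeat version, so check the reference configuration shipped with your version.

```python
from __future__ import annotations


def render_filebeat_config(logstash_host: str, log_paths: list[str], port: int = 5044) -> str:
    """Build a minimal filebeat.yml that ships the given log files to Logstash.

    logstash_host is the address of the central machine running Logstash;
    port must match the port of the beats input in the Logstash config.
    """
    lines = ["filebeat.inputs:", "  - type: log", "    paths:"]
    lines += [f"      - {path}" for path in log_paths]
    lines += ["", "output.logstash:", f'  hosts: ["{logstash_host}:{port}"]']
    return "\n".join(lines) + "\n"


# Hypothetical values: replace with your central machine's address and your log paths.
print(render_filebeat_config("192.0.2.10", ["/var/log/*.log"]))
```

The same file can be used on every remote machine as long as the log paths match. Only the central machine's address has to be reachable from each of them.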

(Pawan Chandra) #7

Thank you for your patience.
Yes, I have tried that, but it's not working.
Is there any particular sequence for starting filebeat, or the machines (central and remote)?
Or should I name the instances differently, e.g. filebeat 1, filebeat 2, etc.?
Or does it have anything to do with the registry file?

(Jaime Soriano) #8

You don't need to start filebeats in any particular sequence, but logstash should be running for the filebeats to be able to ship the logs.

In logstash you need to configure the input plugin for beats, and in beats the logstash output.

Once everything is configured, if it doesn't work you can check the logs of filebeats to see if they are reporting any problem.
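
Alongside the filebeat logs, a quick reachability test from the remote machine can rule out network or firewall problems. The sketch below only checks that a TCP connection to the logstash beats port can be opened. It does not verify that logstash accepts events. The function name `can_reach` and the example address are hypothetical.

```python
import socket


def can_reach(host: str, port: int = 5044, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example (hypothetical address of the central machine):
#   can_reach("192.0.2.10")
```

If this returns False while logstash is running, the likely causes are a firewall rule or a wrong address or port in the filebeat output.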

(Pawan Chandra) #9

Thanks a lot. It worked.
Just one thing: is there any limit on the number of filebeat instances,
or on the size of the log files on any system?

(Jaime Soriano) #10

In principle there is no limit on the number of filebeat instances or the size of stored logs, but you may need to fine-tune or scale your logstash instances or the elasticsearch cluster to handle the load. You may also need to use multiple clusters if you collect a really big volume of logs.

(Pawan Chandra) #11

Thanks a lot. Can you please shed some light on this?
Suppose I have roughly 10 million logs daily,
and let's say I have 5 filebeat instances on different machines.
The logs come from multiple machines hitting an endpoint and are distributed equally across all 5 remote machines.

Roughly how many instances of elasticsearch, logstash or even filebeat would you suggest for hassle-free operation? What would be the ideal distribution?
Thanks for your patience.

(Jaime Soriano) #12

It depends; you'd need to do some tests with your specific load. To give you an idea of the performance of Elasticsearch, you can take a look at our benchmarks.
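
As a starting point for such tests, the figures mentioned in this thread (10 million logs per day, spread evenly over 5 filebeat instances) can be turned into average event rates. This is only back-of-the-envelope arithmetic with a hypothetical helper `average_rate`. Real traffic is usually bursty, so peak rates will be higher than the averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(events_per_day: int, instances: int = 1) -> float:
    """Average events per second handled by each instance, assuming an even split."""
    return events_per_day / instances / SECONDS_PER_DAY


total = average_rate(10_000_000)
per_filebeat = average_rate(10_000_000, instances=5)

print(f"total: {total:.1f} events/s")                # total: 115.7 events/s
print(f"per filebeat: {per_filebeat:.1f} events/s")  # per filebeat: 23.1 events/s
```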
