Sending logs to Logstash using HTTP

Hi,
My logs are exposed over HTTP. How do I push logs to Logstash over HTTP? Earlier I used Filebeat and it worked fine, but if I use HTTP, what should my logstash.json file look like?

input {
  http {
    host => "127.0.0.1" # default: 0.0.0.0
    port => 31311       # default: 8080
  }
}

What should I put in host and what should I put in port? (I am thinking that I need to put the hostname of the server I am getting the logs from.) Please clarify this.

Why do you want to use HTTP? Who or what is sending the logs?

The host option configures on which network interface Logstash should listen for connections. The default value is normally fine so you don't have to set this option at all.

The port option can be any available port.
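
So a minimal configuration can be as simple as this (just a sketch; the port number is arbitrary):

input {
  http {
    port => 8080   # any free port; leave host unset to listen on all interfaces
  }
}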

Hi,
We are using WebSphere servers for dev, QA, and production, and the log4j logs are exposed over HTTP (http://ausu372a.wm.com/devlogs/alvdapp002/WAS70/logs/ERLWMLTMSserver1/SystemOut.log). I need to read those logs and send them to Logstash. While reading on the internet I found that http_poller will do this job, so I configured my Logstash config file like this. On the Logstash console I can see that it is reading the log file, but in Kibana I am not seeing any data.

input {
  http_poller {
    urls => {
      test1 => "http://localhost:9200"
      test2 => {
        # Supports all options supported by ruby's Manticore HTTP client
        method => get
        url => "http://ausu372a.wm.com/devlogs/alvdapp002/WAS70/logs/ERLWMLTMSserver1/SystemOut.log"
      }
      test3 => {
        # Supports all options supported by ruby's Manticore HTTP client
        method => get
        url => "http://ausu372a.wm.com/devlogs/alvdapp003/WAS70/logs/ERLWMLTMSserver2/SystemOut.log"
      }
    }
    request_timeout => 60
    # Supports "cron", "every", "at" and "in" schedules by rufus scheduler
    schedule => { every => "1m" }
    codec => multiline {
      # Grok pattern names are valid!
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => previous
    }
    # A hash of request metadata info (timing, response headers, etc.) will be sent here
    metadata_target => "http_poller_metadata"
  }
}

filter {

  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }

  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:time} \[%{DATA:loglevel}\]\(%{DATA:method}\)%{GREEDYDATA:msgbody}" }
    # add_field => { "time" => "%{time}" }
    # add_field => { "loglevel" => "%{loglevel}" }
    break_on_match => false
  }

}

output {
  elasticsearch {
    hosts => [ "localhost:9200" ]
  }
  stdout { codec => rubydebug }
}

url => "http://ausu372a.wm.com/devlogs/alvdapp002/WAS70/logs/ERLWMLTMSserver1/SystemOut.log"

Does this URL return the full log every time? If yes, you'll get duplicate log entries since Logstash can't figure out which log entries are new and which ones it has processed before. There are ways around this but I think you should choose another method of collecting the logs.

On the Logstash console I can see that it is reading the log file, but in Kibana I am not seeing any data.

How are you seeing that it's reading the data? Are you getting output from the stdout output?

Hi,
When I run Logstash from the command prompt: C:\ELK\logstash\bin>logstash -f c:\elk\logstash\bin\logstash.json

I can see that I am getting data from "http://ausu372a.wm.com/devlogs/alvdapp003/WAS70/logs/ERLWMLTMSserver2/SystemOut.log". Here I am facing two issues:

  1. In Kibana I am seeing the error 'Saved "field" parameter is now invalid. Please select a new field.' (attached image).
  2. By using the HTTP link I will get the whole log file every time, and you are saying that I will get duplicate entries. Can you please suggest another way for me to proceed?

In Kibana I am seeing the error 'Saved "field" parameter is now invalid. Please select a new field.' (attached image).

I don't know off the top of my head what that means. Try asking in the Kibana group.

By using the HTTP link I will get the whole log file every time, and you are saying that I will get duplicate entries. Can you please suggest another way for me to proceed?

The standard method is to install an agent like Filebeat on each machine that has log files and have that agent ship the log data as it arrives.

If you insist on using HTTP you could use a fingerprint filter to compute a hash of each line of input and use that as the document id in Elasticsearch. It'll weed out duplicates but it gets quite inefficient if you increase the frequency with which you poll the log.
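
In outline it would look something like this (an untested sketch; the hash method and key are assumptions you may want to change):

filter {
  fingerprint {
    source => "message"
    target => "[@metadata][fingerprint]"
    method => "SHA1"
    key => "any-static-string"   # some plugin versions require a key for the hash methods
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    document_id => "%{[@metadata][fingerprint]}"   # identical lines get the same id, so re-polled lines overwrite instead of duplicating
  }
}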


Thank you so much for the reply.
If I want to go with the Filebeat approach, I see two options; please suggest the better of them.
My situation is that ELK is on one server and the web server that is generating the logs is on a different server.

  1. Install Filebeat on the server where the logs are generated, so Filebeat will send the logs to my Logstash server.
  2. If the web team shares the log location, can I pull the logs with Filebeat installed on my ELK server?

Please tell me whether the second option is good to follow. I know the first option is straightforward, but if my web team doesn't want to install Filebeat on the server that runs the web server generating the logs, my only option is 2.


Both options work but option 1 is preferable.
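
For option 1 the setup would look roughly like this (a sketch only; paths, hostnames, and the multiline pattern are placeholders, and it assumes a Filebeat version that uses filebeat.inputs):

filebeat.yml on the WebSphere host:

filebeat.inputs:
  - type: log
    paths:
      - /opt/WAS70/logs/ERLWMLTMSserver1/SystemOut.log   # hypothetical local path to the log file
    multiline.pattern: '^\d{4}-\d{2}-\d{2}'              # assumes lines start with a date; adjust to your log format
    multiline.negate: true
    multiline.match: after

output.logstash:
  hosts: ["your-elk-server:5044"]   # hypothetical address of the Logstash host

and on the Logstash side you would replace the http_poller input with a beats input:

input {
  beats {
    port => 5044
  }
}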

I read that Filebeat does not read the whole file every time; it knows where it left off last time. Is that true?

Yes, that's how it works.
