Logstash 2.2 Installation issue on Ubuntu 14.04 (APT) - First time

Hi All,

This is my first time installing Logstash, and I have been doing my best to follow the documentation for APT (on Ubuntu).

As an aside, I just finished installing InfluxDB, Telegraf, and Kapacitor, which had a similar process for adding the repository definition. That all seemed to go fine, but after doing more research, it looks like the ELK stack is the way to go.

Followed the procedure below:

https://www.elastic.co/guide/en/logstash/current/package-repositories.html

$ wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

$ echo "deb http://packages.elastic.co/logstash/2.2/debian stable main" | sudo tee -a /etc/apt/sources.list

$ sudo apt-get update && sudo apt-get install logstash

These all completed fine, but the next step is baffling me:
$ logstash
logstash: command not found

I can't find in the documentation which command (or daemon) is used to run Logstash.

Is it called logstash or something else?

Super keen to play and get this working, but a little frustrated with the documentation.

Gabe

The Logstash binary is installed in /opt/logstash/bin, which isn't in your PATH. It's also installed as a system service so you can use e.g. service logstash start to start it (but that's only useful once you have configuration in place).

You can use dpkg-query -L logstash to list the contents of an installed package.
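As a rough sketch of the two points above, assuming the package really did install into /opt/logstash/bin as described, the following lists the packaged files and makes the command resolvable for the current shell session only. The variable name LOGSTASH_BIN is just for illustration.

```shell
# Install location named in the reply above; adjust if your package differs.
LOGSTASH_BIN=/opt/logstash/bin

# List the package's files that live under a bin directory
# (only attempted if dpkg-query is available on this machine).
if command -v dpkg-query >/dev/null 2>&1; then
  dpkg-query -L logstash 2>/dev/null | grep '/bin/' \
    || echo "no bin files listed for the logstash package"
fi

# Add the directory to PATH for this shell session unless it is already there.
case ":$PATH:" in
  *":$LOGSTASH_BIN:"*) ;;                          # already present
  *) PATH="$PATH:$LOGSTASH_BIN"; export PATH ;;
esac

# Report whether the shell can now resolve the command.
command -v logstash || echo "logstash still not found; check $LOGSTASH_BIN"
```

To make the change permanent you could add the same PATH line to ~/.profile, or simply call the binary by its full path.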

Hi Magnus,

Thank you for getting back with those pointers, cleared up a few basic concepts.

The main (most urgent) objective for us today is to implement comprehensive collection of system performance metrics (CPU, memory, disk, network, processes, users, date, time, etc.) across our servers to monitor the resource requirements of our applications and workflows (Java, Perl, Shell).

Most of our application servers are CPU-bound rather than memory-bound. Later we want to implement lightweight data collection on small-memory servers for scaling out parallel processing (I understand lumberjack might be the way to go for the small VMs).

Looking for a relatively simple implementation in the stack, ideally, keeping the number of vendor products to a minimum to avoid unnecessary complexity and compatibility issues.

Please could you suggest / recommend one or two solutions (inputs, configurations) for collection of Ubuntu 14.04 server performance metrics?

Gabe

I like collectd myself but you might want to look into Elastic's own Topbeat.

Excellent. I've gone down the track of setting up the Beats shippers Topbeat and Filebeat for use with Logstash, and have installed the Beats input plugin for Logstash:

/opt/logstash/bin$ sudo ./plugin install logstash-input-beats

Should the config.json file be in the same location as the binary, /opt/logstash/bin?

I'm looking to use the example config.json file to get started with Beats:

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}

Do ports 5044 and 9200 need to be open?

Should the config.json file be in the same location as the binary, /opt/logstash/bin?

config.json is a poor name for that file since it isn't a JSON file (I'll file a pull request to fix the documentation). I suggest beats.conf or similar.

No, don't store any configuration files in /opt/logstash/bin. When run as a service Logstash looks for configuration files in /etc/logstash/conf.d. If you just want to play around (and start Logstash by hand) you can store the file anywhere you like.
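A minimal sketch of that workflow, under a few assumptions: the scratch path below is hypothetical, the binary lives at /opt/logstash/bin/logstash as stated earlier in the thread, and the --configtest flag is the syntax-check option as I understand it for Logstash 2.x (check `logstash --help` on your version). The script writes the pipeline from this thread to a scratch file and checks it before anything is copied into /etc/logstash/conf.d.

```shell
# Hypothetical scratch location for trying the config by hand.
CONF="${TMPDIR:-/tmp}/beats.conf"

# The pipeline from this thread, written out verbatim (quoted heredoc,
# so the %{...} references are not touched by the shell).
cat > "$CONF" <<'EOF'
input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
EOF

# Syntax-check the file if the binary is where the thread says it is.
LOGSTASH=/opt/logstash/bin/logstash
if [ -x "$LOGSTASH" ]; then
  "$LOGSTASH" --configtest -f "$CONF" || echo "config check reported a problem"
else
  echo "logstash not found at $LOGSTASH; skipping syntax check"
fi

# Once the check passes, install it for the service (left commented out on purpose):
# sudo cp "$CONF" /etc/logstash/conf.d/beats.conf && sudo service logstash restart
```

Keeping the copy and restart as a separate manual step means a broken file never reaches the directory the service reads from.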

Do ports 5044 and 9200 need to be open?

Not sure what you mean. Logstash will listen on TCP port 5044, which must be accessible to all machines from which you want to submit data via Beats. Elasticsearch should already be listening on TCP 9200, which must be accessible only to Logstash and whatever machine you run Kibana on (if any), so loopback-only access could be sufficient.
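One way to check this in practice is a small connection probe, sketched below. It assumes nc (netcat) is installed; check_port is a hypothetical helper name, not part of Logstash or Beats. Run it on the Logstash host against 127.0.0.1, or from a Beats machine against the Logstash host's address, to see whether the port is reachable from there.

```shell
# Hypothetical helper: report whether a TCP port accepts connections.
# Usage: check_port HOST PORT
check_port() {
  host=$1
  port=$2
  if ! command -v nc >/dev/null 2>&1; then
    echo "$host:$port not checked (nc is not installed)"
  elif nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
    echo "$host:$port is accepting connections"
  else
    echo "$host:$port is closed or unreachable"
  fi
}

# Ports from this thread: Beats input on 5044, Elasticsearch on 9200.
check_port 127.0.0.1 5044
check_port 127.0.0.1 9200
```

A port that answers on 127.0.0.1 but not from another machine usually points to a loopback-only bind address or a firewall between the two hosts.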

That's great again, thank you. As you can probably tell by now, it's early days for me as a Linux sysadmin, still very much earning my stripes, so I really appreciate the good explanations and patience.

Always wondered whether loopback addresses like '127.0.0.1' used TCP and UDP ports or not. I have some of these virtual machines sitting on Microsoft Azure, where there is a whole extra layer of cloud IaaS configuration around the instances: they call ports 'endpoints' and you have to specify these outside of the OS. That will present a whole bunch of additional challenges as the deployment gets more complicated. Anyway, I will specify the Azure endpoints ('ports') for TCP 5044 and TCP 9200 on each VM.

I also have InfluxDB running as another datastore; its ease of installation and simple queries make it a good test case.

I ran into a couple of challenges with the Beats configuration for Logstash to Elasticsearch, so I switched to configuring Logstash to InfluxDB and found that equally frustrating, given the documentation on both sides.

So I'm now in the process of trying to configure collectd for Logstash and collectd for InfluxDB.

Still have a bit of a learning curve, but doing it is part of the process.

I'll put that beats.conf file in the right place and see if I can fire that up.