AWS Logstash Setup

This is the second of a two-part post on getting Amazon’s version of Elasticsearch set up in AWS.  Part one covered the basics of setting up an AWS ES cluster; this post tackles supplying the cluster with data via Logstash.


This post assumes that you have already set up an Elasticsearch cluster in AWS.  By the end, you should have a working data pipeline into ES that you can explore with Kibana.  Apologies ahead of time, this is a long one…

Create EC2 AMI for Logstash

The first step is to create a box for Logstash to sit on.  Access the AMIs through the EC2 portal in the AWS Console.

Launch a new instance and select your base AMI.  I am using Amazon Linux.  Feel free to go your own way, but take note that your package management commands may differ. 

If you are trying this in the free tier, make sure you select options marked “Free tier eligible” throughout the setup.

Also note that Amazon Linux includes many of Logstash’s dependencies (e.g. Java) by default.  You will have to install these manually if you opt for a different AMI, as sketched below.
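
For example, on a CentOS or RHEL-style AMI you could install OpenJDK like this (a minimal sketch; the exact package name varies by distribution, and Logstash 2.2 needs Java 7 or later):

# Install OpenJDK 8 (the package name is distro-specific)
sudo yum install java-1.8.0-openjdk

# Confirm Java is available
java -version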

I won’t walk you through the entire process, but select security and storage options that meet your needs.  If you are not sure or are just testing, feel free to keep clicking Next; the defaults are fine.  When you are done, launch the instance.

At this point you’ll be prompted to select, or create, a PEM key for accessing your new box. 

Type in whatever name you would like; I used ‘LogstashProd’.  Ensure that you download the key before launch.  DO NOT LOSE THIS KEY, as you will need it to access your box.

Log into the box

You will need:

  • The PEM you just created
  • The IP address of the box.  This can be found in the EC2 console by selecting the instance.

Open your terminal of choice (e.g. Cygwin on Windows, Terminal on OS X).

First, we need to give the PEM file more restrictive permissions; SSH will refuse to use a key that others can read.  In your terminal, navigate to the location of your PEM file:

cd /path/to/pem/

View the permissions:

ls -lh

This should display something similar to:

-rw-r--r--@  1 adam  staff   1.7K Mar 18 09:43 LogstashProd.pem

Notice that the file is world-readable; we need to fix that:

chmod 600 LogstashProd.pem

Now view the permissions again to verify they are only readable by you:

-rw-------@  1 adam  staff   1.7K Mar 18 09:43 LogstashProd.pem

Now we’ll SSH into the newly created box:

ssh -i LogstashProd.pem ec2-user@192.168.1.1

The -i flag passes the identity file, i.e. the PEM key generated by AWS.

-i should be followed by the path (relative or absolute) to the PEM file.

The last parameter is ec2-user @ the IP address assigned by AWS (or the correct user for your AMI, e.g. centos for CentOS).  The address above is just a placeholder; use your instance’s IP.

If you are having any issues, ensure that:

  • You are passing the correct PEM path
  • Your PEM permissions are correct
  • You are using the ‘ec2-user’ (or correct equivalent)
  • You have the correct IP
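
If typing the full ssh command gets tedious, you can add a host alias to your SSH config (a minimal sketch; ‘logstash-prod’ is a hypothetical alias, and the IP and PEM path are placeholders to adjust):

# ~/.ssh/config (the alias name, IP, and key path below are placeholders)
Host logstash-prod
    HostName 192.168.1.1
    User ec2-user
    IdentityFile /path/to/LogstashProd.pem

After that, ‘ssh logstash-prod’ connects you directly.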

Install Logstash

First, we need to import Elastic’s public signing key for the Logstash package repository:

sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch

Now we need to add the Logstash package references to our yum repository:

sudo vi /etc/yum.repos.d/logstash.repo

Paste in the repository information:

[logstash-2.2]
name=Logstash repository for 2.2.x packages
baseurl=http://packages.elastic.co/logstash/2.2/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1

Save and exit vi:

[esc]

:wq

Update all software:

sudo yum update

Install Logstash:

sudo yum install logstash
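
To confirm the install worked, you can print the version (a quick sanity check; /opt/logstash is where the 2.x RPM puts the binaries by default):

/opt/logstash/bin/logstash --version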

You are done!

For the official Logstash installation instructions, see:

https://www.elastic.co/guide/en/logstash/current/package-repositories.html

Configure Logstash

For our basic setup, we are going to tail a local log file and forward its contents to an AWS ES cluster.  (Logstash can just as easily receive data from syslog or other inputs.)

Let’s create a new Logstash config file:

sudo vi /etc/logstash/conf.d/logstash.conf

Paste in the following example config:

input {
  file {
    path => "/tmp/log.txt"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

output {
  elasticsearch {
    hosts => ["YOUR ELASTICSEARCH HOST"]
    index => "logstash-%{+YYYY.MM.dd}"
    ssl => true
  }
}

So what are we looking at here?

‘input’ defines where Logstash should look for incoming data.  This can be a file, a syslog port, etc.  Here we are telling Logstash to monitor ‘/tmp/log.txt’ for any changes, starting at the beginning of the file.  Setting sincedb_path to /dev/null tells Logstash not to persist its read position, which is convenient for testing because the file is re-read on every restart.

‘output’ is the exact opposite: now that Logstash has the data, where should it send it?  The example above pushes all of our new logs into the Elasticsearch cluster we created previously.

Note that Logstash can also transform data in flight for basic ETL using filters and mutations, as sketched below.
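
For example, a filter block like the following would parse Apache-style log lines into structured fields and stamp each event (a minimal sketch; the grok pattern assumes your lines match the combined Apache log format, and the ‘environment’ field is a hypothetical addition):

filter {
  # Parse the raw line into structured fields using a built-in grok pattern
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Add a hypothetical static field to every event
  mutate {
    add_field => { "environment" => "prod" }
  }
}

A filter block goes between the input and output blocks in the same config file.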

To test it out, we need to start Logstash and then pipe some data into the log file.  Logstash should actively read the new line and send it over to Elasticsearch.
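
Validating the config and starting the service might look like this (a sketch, assuming the 2.x RPM layout and its bundled init script):

# Check the config for syntax errors before starting the service
sudo /opt/logstash/bin/logstash --configtest -f /etc/logstash/conf.d/logstash.conf

# Start Logstash via the init script installed by the RPM
sudo service logstash start

Now append a test line to the log file: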

echo "My test log" >> /tmp/log.txt

Fire up Kibana, which comes bundled with AWS ES, and you should be good to go!  The URL for your Kibana instance can be found with your Elasticsearch details in the console.
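
If you would rather check from the command line first, you can ask the cluster how many documents have landed in the logstash indices (a sketch; the endpoint below is a placeholder, and the request only succeeds if your access policy allows it):

curl -s "https://YOUR-ELASTICSEARCH-HOST/logstash-*/_count?pretty"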

Congrats!

Adam Gerhart

Assistant Director at Cognitio Corp
Adam is a technologist at Cognitio where he leads evaluations and use case development efforts in Cognitio Labs.
