Logstash

The agent

Logstash is a tool that parses logs into JSON so they can be indexed and searched. The application is packaged as a self-contained .jar file.

Currently logstash is deployed with an instance running on each client and one on a central server.

The clients read in events from files and send them to the central server. This is done because most of the applications in the cluster do not write their logs to the system syslog but rather use another logging tool such as log4j or logback.

The central point where logstash events are sent is actually a redis instance; this acts as a broker, allowing for a large amount of traffic.

The logstash installation is contained within /opt/logstash

/opt/logstash/bin/logstash-monolithic.jar
/opt/logstash/etc/logstash.conf
/opt/logstash/log/logstash.log
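
There is no separate service for the monolithic jar; the agent is started by pointing java at the jar and the config file. The init script distributed by cfengine does something along these lines (a sketch using the standard logstash agent flags, with the paths above):

java -jar /opt/logstash/bin/logstash-monolithic.jar agent \
    -f /opt/logstash/etc/logstash.conf \
    -l /opt/logstash/log/logstash.log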

All files are distributed by cfengine, which also takes care of ensuring the relevant conf file is copied to each server.

A basic config would be as follows:

input {
    file {
        type => "yum"
        path => ["/var/log/yum"]
        exclude => ["*.gz"]
        debug => true
    }
}

filter {
    grok {
        type => "yum"
        pattern => ["%{SYSLOGTIMESTAMP} %{DATA:action}\: %{GREEDYDATA:package}"]
        break_on_match => false
    }
}

output {
    stdout {
        debug => true
    }

    redis {
        host => "148.187.66.65"
        data_type => "list"
        key => "logstash"
    }
}

When events are received on the central server they are indexed by Elasticsearch:

input {
    redis {
        host => "148.187.66.65"
        type => "redisinput"
        data_type => "list"
        key => "logstash"
    }
}

output {
    elasticsearch {
        cluster => "logstash"
    }
}

The grok filter

The main use of logstash is to filter events through grok, which allows us to break each event up into fields. Take the below event from the yum log for example:

Mar 20 14:16:11 Installed: yum-utils-1.1.16-16.el5.noarch

The grok pattern to match this would be as follows:

%{SYSLOGTIMESTAMP} %{DATA:action}\: %{GREEDYDATA:package}

SYSLOGTIMESTAMP, DATA and GREEDYDATA are pre-defined regular expressions, which is what makes working with grok so easy. A full list of patterns and their regexes can be found here: https://github.com/logstash/grok-patterns/blob/master/grok-patterns

This filter splits out the time stamp.

Then it takes the text up to the ":" and marks this as "action"; in the case of the yum log this will be Installed, Updated or Erased. We can now search for updated packages, for example. Note the "\" preceding the ":": special characters need to be escaped.

Finally, the information after the action is defined as the package name. GREEDYDATA is used here as there could be special characters in the package name. This now gives us a package field we can search on.
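
Run against the sample line above, the stdout output (with debug enabled) should show the captured fields roughly as follows; in this version of logstash, custom grok captures land under @fields (the exact debug layout is approximate):

"@fields" => {
    "action"  => "Installed",
    "package" => "yum-utils-1.1.16-16.el5.noarch"
}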

http://grokdebug.herokuapp.com/ is an excellent tool to help build patterns for grok.

Redis instance

Redis is a simple in-memory key/value store and is very quick to set up:

yum install redis

vim /etc/redis.conf # Specify the IP address of the interface to bind to

service redis start
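
To check that events are actually reaching the broker, redis-cli can inspect the list the clients push to (the "logstash" key from the output config above):

redis-cli -h 148.187.66.65 ping            # should answer PONG
redis-cli -h 148.187.66.65 llen logstash   # queue depth; grows if the indexer falls behind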

Elasticsearch

At the time of writing logstash is only compatible with elasticsearch 0.20.2-x.

rpm -ivh elasticsearch-0.20.2-3.el6.x86_64.rpm

vim /etc/elasticsearch/elasticsearch.yml # we need to define the cluster name and node name.

cluster.name: logstash
node.name: "logstash1"

service elasticsearch start
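
A quick way to verify the node came up is the standard cluster health endpoint (assuming the default port 9200):

curl 'http://localhost:9200/_cluster/health?pretty'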

Elasticsearch is now running; however, we still have to tell it how the data we will be sending will be formatted.

curl -XPUT http://ppcluster.lcg.cscs.ch:9200/_template/logstash_per_index -d @/root/logstash_per_index.json

cat /root/logstash_per_index.json

{
  "template": "logstash*",
  "settings": {
    "index.query.default_field": "@message",
    "index.cache.field.type": "soft",
    "index.store.compress.stored": true
  },
  "mappings": {
    "_default_": {
      "_all": { "enabled": false },
      "properties": {
        "@message": { "type": "string", "index": "analyzed" },
        "@source": { "type": "string", "index": "not_analyzed" },
        "@source_host": { "type": "string", "index": "not_analyzed" },
        "@source_path": { "type": "string", "index": "not_analyzed" },
        "@tags": { "type": "string", "index": "not_analyzed" },
        "@timestamp": { "type": "date", "index": "not_analyzed" },
        "@type": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}
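
Once events start flowing, a quick search against the daily logstash-* indices confirms both that documents are being indexed and that the template was applied:

curl 'http://ppcluster.lcg.cscs.ch:9200/logstash-*/_search?size=1&pretty'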

A lot of the interaction with elasticsearch seems to be through curl, so I would recommend installing a few extra tools to make life easier:

wget download.elasticsearch.org/es2unix/es
chmod +x es
mv es /usr/local/bin

[root@logstash ~]# es master
-yyvjwmKSGKaKCK6xIHtMg 10.10.66.65 logstash1

[root@logstash ~]# es nodes
xFxl2peqTXGJnSm6a2GsFw 10.10.66.65 9301 c - Adversary
-yyvjwmKSGKaKCK6xIHtMg 10.10.66.65 9200 10.10.66.65 9300 d * logstash1
MY2S1Kv5SQOr4U8br8p1Xg 148.187.64.207 9200 148.187.64.207 9300 d - ppcluster

Also, for a web UI dashboard:

cd /usr/share/java/elasticsearch/bin/

./plugin -install karmi/elasticsearch-paramedic

Then browse to http://logstash.lcg.cscs.ch:9200/_plugin/paramedic/index.html

Web Interface

The web interface is provided by a Ruby application called Kibana.

First we need to install a couple of prerequisites:

yum install rubygems ruby-devel

Download Kibana from http://kibana.org/

cd Kibana

bundle install

The configuration is stored in KibanaConfig.rb; the important things to configure are the Kibana host, the elasticsearch cluster, and setting Default_fields = ['@message'].
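
For reference, the relevant settings in KibanaConfig.rb look roughly like this (constant names from the Ruby Kibana of this era; check the file itself, as they change between releases):

Elasticsearch = "localhost:9200"
KibanaHost = '0.0.0.0'
KibanaPort = 5601
Default_fields = ['@message']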

To start the server:

ruby kibana.rb # Note cfengine will distribute an init script as Kibana does not ship with one

Search syntax is @FIELD_NAME:"SEARCH STRING". For example, to find events from cream02:

@source_host:"cream02.lcg.cscs.ch"
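
Plain Lucene query syntax also works, so fields can be combined; for example, yum installs on that host (the @fields prefix assumes the grok config above and this logstash version's event schema):

@source_host:"cream02.lcg.cscs.ch" AND @fields.action:"Installed"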

Sending alerts to nagios

To send alerts to nagios we first need to have NSCA set up. Ensure that the nagios server has the daemon and the server that will be triggering the alert has the client; these are nsca and nsca-client respectively.

There is a good guide to configuring NSCA at the link below

http://nagios.sourceforge.net/download/contrib/documentation/misc/NSCA_Setup.pdf
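
With NSCA configured, the plumbing can be tested by hand before wiring it into logstash; send_nsca reads a tab-separated host, service, status and message on stdin (the service name here is made up, use one defined as a passive check in nagios):

printf 'logstash.lcg.cscs.ch\tjava_trace\t1\ttest alert\n' | \
    send_nsca -H nagios.lcg.cscs.ch -c /etc/nagios/send_nsca.cfg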

In order to have events trigger a nagios alert we need to define nsca as an output in our logstash config. Here we have an output that will take any event tagged with "java_trace" and forward it to nagios with the status code 1. The status codes are the same as those used by nagios, so 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN. A sketch of how events can pick up this tag is shown after the output below.

output {
    nagios_nsca {
        tags => ["java_trace"]
        host => "nagios.lcg.cscs.ch"
        port => "5667"
        nagios_status => "1"
        send_nsca_config => "/etc/nagios/send_nsca.cfg"
    }
}
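
For completeness, a minimal sketch of how an event could end up tagged "java_trace", using grok's add_tag on lines that look like part of a Java stack trace (the type and pattern here are illustrative, not from our production config):

filter {
    grok {
        type => "tomcat"
        pattern => ["^\s+at %{JAVACLASS:class}"]
        add_tag => ["java_trace"]
    }
}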

-- GeorgeBrown - 2013-05-07
