What is Logstash?
In this tutorial we will look at what Logstash is and when to use it. In a previous tutorial we implemented Filebeat + ELK Stack. There we saw that Filebeat is a lightweight data shipper that sends log data to Logstash, which in turn indexes it into Elasticsearch. Filebeat can also be configured to send data to Elasticsearch directly. Why, then, do we configure Filebeat to send data to Logstash instead of to Elasticsearch directly?
Logstash is a powerful data processing pipeline that can transform and enrich your data before it reaches Elasticsearch. We will see what this means in the sections below.
Log Management
Let us first have a look at the concept of log management. Log management can be defined as the processing of logs produced by our software application and the environment in which it is running. It involves the continuous collection, persistence, processing and analysis of data from different applications. It has the following use cases -
- Security
- Compliance
- Troubleshooting
- Business Insights
What is Logstash?
The easiest way to analyze logs is using the cat, tail and grep commands. However, applications usually run on multiple host machines, and on each machine logs are generated as multiple files in different locations. We cannot analyze such logs using these commands alone; what we need is a centralized log management tool.
Logstash is an integrated framework for log collection, centralization, parsing, storage and search. Using Logstash we can perform tasks like transforming unstructured data into structured data, filtering out certain types of data, and enriching the current data. We can also aggregate or summarize data before sending it to Elasticsearch. Once these tasks are done, the data is sent to a centralized destination, for example Elasticsearch. We will look at these tasks in more detail when we implement an example later; a minimal sketch of such a pipeline is shown below.
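For instance, the following is a minimal sketch of a pipeline that filters out DEBUG lines and enriches each event with an extra field. The DEBUG condition and the environment field are illustrative assumptions, not part of the example we build later.

input {
  stdin { }                                          # read events from the console for this sketch
}
filter {
  if [message] =~ "DEBUG" {
    drop { }                                         # filtering: discard DEBUG lines entirely
  }
  mutate {
    add_field => { "environment" => "production" }   # enriching: "environment" is an illustrative field
  }
}
output {
  stdout { codec => rubydebug }                      # print the resulting structured event
}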
Structure of Logstash
Logstash consists of 3 parts -
- Logstash input - Logstash has a rich collection of input plugins using which it can take input from TCP/UDP, Windows event logs, files and more.
- Logstash filtering - When Logstash processes the input, a large collection of filters can be applied to modify, manipulate and transform the events. This helps us extract only the required data from the logs and index it, which makes it possible to query useful data and draw logical conclusions.
- Logstash output - Logstash also has a rich collection of output plugins using which data can be sent to different destinations like email, files, alerting tools and storage like Elasticsearch. A sketch combining these parts is shown after this list.
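As a rough illustration of how these parts fit together, the sketch below reads from both a TCP port and log files and writes to Elasticsearch as well as a local file. The port number, file paths and host are illustrative assumptions.

input {
  tcp { port => 5000 }                            # illustrative port; accepts events over TCP
  file { path => ["/var/log/app/*.log"] }         # illustrative path; tails matching log files
}
output {
  elasticsearch { hosts => ["localhost:9200"] }   # illustrative host; index events into Elasticsearch
  file { path => "/tmp/processed-events.log" }    # also write a copy of each event to a local file
}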
Unzip the downloaded Logstash archive to a location of your choice. Go to the bin folder and create a file named logstash.conf as follows -
input {
  file {
    type => "testlogs"
    path => ["C:/logs/*.log"]
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} Component %{DATA:componentname} took %{INT:time} ms" }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

The logs we are going to analyze are as follows -
2020-03-11T17:23:34.000+00:00 INFO Component A took 5 ms
2020-03-11T18:23:34.000+00:00 INFO Component B took 15 ms
2020-03-11T19:23:34.000+00:00 INFO Component C took 35 ms

Open the command prompt and go to the Logstash bin folder. Use the command
logstash.bat -f logstash.conf
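On Linux or macOS the equivalent invocation from the bin folder would be -

./logstash -f logstash.conf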
We get the output as follows -
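The exact console output depends on your Logstash version and machine, but for the first log line the rubydebug-formatted event would look roughly like this; the host, path and @timestamp values below are illustrative:

{
          "message" => "2020-03-11T17:23:34.000+00:00 INFO Component A took 5 ms",
        "timestamp" => "2020-03-11T17:23:34.000+00:00",
        "log-level" => "INFO",
    "componentname" => "A",
             "time" => "5",
             "type" => "testlogs",
         "@version" => "1",
       "@timestamp" => 2020-03-11T17:25:00.000Z,
             "path" => "C:/logs/sample.log",
             "host" => "my-machine"
}

Note how the grok filter has turned the unstructured log line into a structured event with separate timestamp, log-level, componentname and time fields that can now be queried in Elasticsearch.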