In this example we are going to use Fluentd to read JSON logs from a log file and write them into another log file.


Flow


  1. The application writes JSON logs to the logs/application/registration.log* files.

  2. Fluentd reads (tails) the logs/application/registration.log* files.

  3. Fluentd writes each newly read log line to the logs/fluentd/registration.log* files.

As you can see above, log rotation is handled by the * wildcard at the end of the logs/application/registration.log file name, so rotated files such as registration.log.1 and registration.log.2 are tailed as well.
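
The effect of the wildcard can be tried out in a shell. This is only a rough sketch: the scratch directory and the other.log file are made up for the demonstration, and shell globbing is used as a stand-in for Fluentd's own path expansion, which may differ in details.

```shell
# Hypothetical scratch directory; only the registration.log* names mirror the flow above.
demo_dir=$(mktemp -d)
touch "$demo_dir/registration.log" \
      "$demo_dir/registration.log.1" \
      "$demo_dir/registration.log.2" \
      "$demo_dir/other.log"

# Expand the same pattern as in the flow: rotated files match, other.log does not.
matched=$(cd "$demo_dir" && printf '%s\n' registration.log*)
echo "$matched"
```

Only the three registration.log files are listed, which is why a newly rotated file needs no configuration change.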


Fluentd


Read the "Before Installing Fluentd" page first; it explains why we apply the following configurations.


Install NTP


The current time is Wed Mar 28 19:25:10 UTC 2018, but the server's clock is wrong, as seen below.


ubuntu@linux:~$ date
Sat Mar 24 22:48:10 UTC 2018

Let's correct the system time.


ubuntu@linux:~$ sudo timedatectl set-timezone Europe/London

ubuntu@linux:~$ sudo apt-get install ntp

ubuntu@linux:~$ sudo nano /etc/ntp.conf # Replace the whole file with the content below
driftfile /var/lib/ntp/drift
# Specify UK NTP servers.
server 0.uk.pool.ntp.org
server 1.uk.pool.ntp.org
server 2.uk.pool.ntp.org
server 3.uk.pool.ntp.org
# Use Ubuntu's NTP server as a fallback.
server ntp.ubuntu.com
# Local users may obtain data from NTP servers.
restrict 127.0.0.1
restrict ::1

ubuntu@linux:~$ sudo service ntp restart

As you can see below the time is now correct.


ubuntu@linux:~$ date
Wed Mar 28 19:29:38 BST 2018

Increase Max # of File Descriptors


The default value of $ ulimit -n is 1024, which we need to increase. Add the lines below to the file and restart your system; after the restart, $ ulimit -n should report 65536.


ubuntu@linux:~$ sudo nano /etc/security/limits.conf # Add these lines to the file
root soft nofile 65536
root hard nofile 65536
* soft nofile 65536
* hard nofile 65536
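
After the restart, the new limit can be checked with a few lines of shell. This is a minimal sketch; the helper name fd_limit_ok is made up for this example, and 65536 is simply the value set above.

```shell
# fd_limit_ok LIMIT REQUIRED: succeed when LIMIT (a number or "unlimited")
# is at least REQUIRED. (Hypothetical helper.)
fd_limit_ok() {
  if [ "$1" = "unlimited" ]; then
    return 0
  fi
  [ "$1" -ge "$2" ]
}

# Compare the soft limit of the current shell with the value set above.
if fd_limit_ok "$(ulimit -n)" 65536; then
  echo "file descriptor limit is high enough: $(ulimit -n)"
else
  echo "file descriptor limit is still too low: $(ulimit -n)"
fi
```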

Optimise Network Kernel Parameters


Add the lines below to the configuration file, then run $ sudo sysctl -p to apply the changes.


ubuntu@linux:~$ sudo nano /etc/sysctl.conf
net.core.somaxconn = 1024
net.core.netdev_max_backlog = 5000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_max_syn_backlog = 8096
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240 65535
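
To confirm that the values took effect, the configured values can be compared with the live ones. A minimal sketch, assuming a Linux host where the sysctl command is available; the helper name conf_value is made up for this example.

```shell
# conf_value KEY FILE: print the value set for KEY in a sysctl.conf-style file.
# (Hypothetical helper; comment lines are skipped and the last match wins.)
conf_value() {
  awk -F '=' -v key="$1" '
    /^[ \t]*#/ { next }
    {
      k = $1; gsub(/[ \t]/, "", k)
      if (k == key) {
        v = $2
        sub(/^[ \t]+/, "", v); sub(/[ \t]+$/, "", v)
        found = v
      }
    }
    END { if (found != "") print found }
  ' "$2"
}

# Print the configured and the live value side by side for two of the keys above.
if [ -r /etc/sysctl.conf ] && command -v sysctl >/dev/null 2>&1; then
  for key in net.core.somaxconn net.ipv4.tcp_tw_reuse; do
    echo "$key: configured=$(conf_value "$key" /etc/sysctl.conf) live=$(sysctl -n "$key" 2>/dev/null)"
  done
fi
```

If the two values differ for a key, the changes have most likely not been loaded yet.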

Install Fluentd td-agent


Use the commands below to install td-agent and to control the service. The configuration file is located at /etc/td-agent/td-agent.conf. For more information, read the "Installing Fluentd Using deb Package" page.


ubuntu@linux:~$ curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent2.sh | sh

ubuntu@linux:~$ sudo /etc/init.d/td-agent restart # Other actions: status, start, stop

Test


Let's test whether we can log a message to the default log file, which is located at /var/log/td-agent/td-agent.log. After running the command below, the last line of "td-agent.log" should look like 2018-03-28 20:04:44 +0100 debug.test: {"json":"hello world"}.


ubuntu@linux:~$ curl -X POST -d 'json={"json":"hello world"}' http://localhost:8888/debug.test

Preparation


In our example Fluentd writes logs to a file stored under a specific directory, so we have to create the folder and make the td-agent user its owner. This folder also contains the log "position" file, which records the last log file and line that were read so that td-agent doesn't duplicate logs.


ubuntu@linux:~$ mkdir logs

ubuntu@linux:~$ mkdir logs/application

ubuntu@linux:~$ mkdir logs/fluentd

ubuntu@linux:~$ sudo chown -R td-agent logs/fluentd

ubuntu@linux:~$ ls -l
drwxrwxr-x 4 ubuntu ubuntu 4096 Apr 6 20:32 logs

ubuntu@linux:~$ ls -l logs/
drwxrwxr-x 2 ubuntu ubuntu 4096 Apr 6 20:32 application
drwxrwxr-x 2 td-agent ubuntu 4096 Apr 6 20:32 fluentd

Let's create a dummy application log file so that we can write logs to it and td-agent can read it.


ubuntu@linux:~$ touch logs/application/registration.log.1

ubuntu@linux:~$ ls -l logs/application/
-rw-rw-r-- 1 ubuntu ubuntu 0 Apr 6 20:39 registration.log.1
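
Since there is no real application here, a small shell function can stand in for it. This is a sketch only: the function name log_registration and the scratch file are made up for this example, and the JSON shape is the one used in the test further down.

```shell
# log_registration FILE USER_ID: append one JSON log line to FILE.
# (Hypothetical helper standing in for the real application.)
log_registration() {
  printf '{"user":"%s"}\n' "$2" >> "$1"
}

# Example: write three registrations to a scratch file and show the result.
app_log=$(mktemp)
for id in 1 2 3; do
  log_registration "$app_log" "$id"
done
cat "$app_log"
```

Pointing the first argument at logs/application/registration.log.1 would produce the lines that td-agent tails.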

Configuration


Add the configuration below to the bottom of the configuration file, then save and exit.


ubuntu@linux:~$ sudo nano /etc/td-agent/td-agent.conf

# The <source> directive defines an input source.
# It watches the source and emits an event with a tag attached to it.
<source>
  @type tail # Use the tail plugin to read the log files
  format json # Parse each log line as JSON
  read_from_head true # Start reading from the head of the file, not the bottom
  tag api.user.registration # Tag each emitted event with "api.user.registration"
  path /home/ubuntu/logs/application/registration.log* # Path pattern of the files to tail
  pos_file /home/ubuntu/logs/fluentd/registration.log.pos # Path to the "position" database file
</source>

# The <match> directive defines an output destination.
# It catches events that carry the specified tag.
<match api.user.registration>
  @type file # Use the file plugin to write the logs
  path /home/ubuntu/logs/fluentd/registration.log # Path prefix of the file the logs are written to
</match>

Restart td-agent with the command below so that it picks up the new configuration.


ubuntu@linux:~$ sudo /etc/init.d/td-agent restart
[ ok ] Restarting td-agent (via systemctl): td-agent.service.

As soon as td-agent starts, it creates the "position" database file.


ubuntu@linux:~$ ls -l logs/fluentd/
total 4
-rw-r--r-- 1 td-agent td-agent 83 Apr 6 20:56 registration.log.pos
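
The idea behind the "position" file can be sketched in a few lines of shell. This is not how td-agent implements it: the helper read_new_lines and the scratch files are made up for this example, the offset is stored as a plain number, and rotation or truncation of the log is not handled.

```shell
# read_new_lines LOG POS: print what was appended to LOG since the byte offset
# stored in POS, then store the new offset. (Hypothetical helper, not Fluentd code.)
read_new_lines() {
  log=$1
  pos=$2
  offset=0
  if [ -f "$pos" ]; then
    offset=$(cat "$pos")
  fi
  size=$(wc -c < "$log" | tr -d ' ')
  if [ "$size" -gt "$offset" ]; then
    # tail -c +N starts at byte N (1-based); head -c caps the output at the measured size.
    tail -c "+$((offset + 1))" "$log" | head -c "$((size - offset))"
    echo "$size" > "$pos"
  fi
}

# Example with scratch files: the second call prints only the newly added line.
demo_log=$(mktemp)
demo_pos="$demo_log.pos"
echo '{"user":"1"}' >> "$demo_log"
read_new_lines "$demo_log" "$demo_pos"
echo '{"user":"2"}' >> "$demo_log"
read_new_lines "$demo_log" "$demo_pos"
```

Because the offset survives between calls, a line that was already read is never emitted twice, which is the property the position file provides across td-agent restarts.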

Test


ubuntu@linux:~$ echo '{"user":"1"}' >> logs/application/registration.log.1 
ubuntu@linux:~$ echo '{"user":"2"}' >> logs/application/registration.log.1
ubuntu@linux:~$ echo '{"user":"3"}' >> logs/application/registration.log.1

ubuntu@linux:~$ ls -l logs/fluentd/

-rw-r--r-- 1 td-agent td-agent 61 Apr 6 21:02 registration.log.20180406.b56933893cd87b6b8
-rw-r--r-- 1 td-agent td-agent 83 Apr 6 21:02 registration.log.pos

ubuntu@linux:~$ cat logs/fluentd/registration.log.20180406.b56933893cd87b6b8

2018-04-06T21:02:30+01:00 api.user.registration {"user":"1"}
2018-04-06T21:02:49+01:00 api.user.registration {"user":"2"}
2018-04-06T21:02:55+01:00 api.user.registration {"user":"3"}

ubuntu@linux:~$ touch logs/application/registration.log.2

ubuntu@linux:~$ echo '{"admin":"1"}' >> logs/application/registration.log.2
ubuntu@linux:~$ echo '{"admin":"2"}' >> logs/application/registration.log.2
ubuntu@linux:~$ echo '{"admin":"3"}' >> logs/application/registration.log.2

ubuntu@linux:~$ cat logs/fluentd/registration.log.20180406.b56933893cd87b6b8

2018-04-06T21:02:30+01:00 api.user.registration {"user":"1"}
2018-04-06T21:02:49+01:00 api.user.registration {"user":"2"}
2018-04-06T21:02:55+01:00 api.user.registration {"user":"3"}
2018-04-06T21:07:37+01:00 api.user.registration {"admin":"1"}
2018-04-06T21:07:37+01:00 api.user.registration {"admin":"2"}
2018-04-06T21:07:38+01:00 api.user.registration {"admin":"3"}
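
Each output line consists of a timestamp, the tag, and the original JSON record. If only the records are of interest, the first two fields can be stripped. A minimal sketch, assuming the fields are separated by whitespace as shown above; the helper name records_only is made up for this example.

```shell
# records_only [FILE...]: drop the first two fields (timestamp and tag) of each
# line and print the remaining JSON record. (Hypothetical helper.)
records_only() {
  sed -E 's/^[^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+//' "$@"
}

# Example with one line copied from the output above.
printf '%s\t%s\t%s\n' '2018-04-06T21:02:30+01:00' 'api.user.registration' '{"user":"1"}' | records_only
```

The function also accepts file names, so it can be run directly against the file written by td-agent.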

References