Tuesday, July 30, 2024

Qemu, virt-manager, and libvirt on macOS with Apple silicon M2

1. install QEMU

brew install qemu
#OR
sudo port install qemu

2. Install libvirt and virt-manager

# Install libvirt
brew install libvirt
#OR
# Time to install virt-manager, you will need macports for this
sudo port install virt-manager

3. Start libvirt service.

# Now, Start libvirt services (to start the libvirt daemon)
brew services start libvirt
# And now start virt-manager
virt-manager -c "qemu:///session" --no-fork

Source

Thursday, June 27, 2024

Docker macvlan and ipvlan network plugins

This is a continuation of my previous blog on macvlan and ipvlan Linux network drivers. Docker has added support for macvlan and ipvlan drivers and its currently in experimental mode as of Docker release 1.11.

Example used in this blog

In this example, we will use Docker macvlan and ipvlan network plugins for Container communication across hosts. To illustrate macvlan and ipvlan concepts and usage, I have created the following example.

vlan4

Following are details of the setup:

First, we need to create two Docker hosts with experimental Docker installed. The experimental Docker has support for macvlan and ipvlan. To create experimental boot2docker image, please use the procedure here.
I have used Virtualbox based environment. Macvlan network is created on top of host-only network adapter in Virtualbox. It is needed to enable promiscuous mode on the Virtualbox adapter. This allows for Container communication across hosts.
There are four Containers in each host. Two Containers are in vlan70 network and two other Containers are in vlan80 network.
We will use both macvlan and ipvlan drivers and illustrate Container network connectivity in same host and across hosts.

Following output shows the Docker experimental version running:

$ docker --version
Docker version 1.11.0-dev, build 6c2f438, experimental

Macvlan

In this section, we will illustrate macvlan based connectivity with macvlan bridge mode.

On host 1, create macvlan subinterface and Containers:

docker network  create  -d macvlan \
   --subnet=192.168.0.0/16 \
    --ip-range=192.168.2.0/24 \
	-o macvlan_mode=bridge \
    -o parent=eth2.70 macvlan70
docker run --net=macvlan70 -it --name macvlan70_1 --rm alpine /bin/sh
docker run --net=macvlan70 -it --name macvlan70_2 --rm alpine /bin/sh

docker network  create  -d macvlan \
   --subnet=192.169.0.0/16 \
    --ip-range=192.169.2.0/24 \
	-o macvlan_mode=bridge \
    -o parent=eth2.80 macvlan80
docker run --net=macvlan80 -it --name macvlan80_1 --rm alpine /bin/sh
docker run --net=macvlan80 -it --name macvlan80_2 --rm alpine /bin/sh

Containers in host 1 will get ip address in 192.168.2.0/24 network and 192.169.2.0/24 network based on the options mentioned above.

On host 2, create macvlan subinterface and Containers:

docker network  create  -d macvlan \
   --subnet=192.168.0.0/16 \
    --ip-range=192.168.3.0/24 \
	-o macvlan_mode=bridge \
    -o parent=eth2.70 macvlan70
docker run --net=macvlan70 -it --name macvlan70_3 --rm alpine /bin/sh
docker run --net=macvlan70 -it --name macvlan70_4 --rm alpine /bin/sh

docker network  create  -d macvlan \
   --subnet=192.169.0.0/16 \
    --ip-range=192.169.3.0/24 \
	-o macvlan_mode=bridge \
    -o parent=eth2.80 macvlan80
docker run --net=macvlan80 -it --name macvlan80_3 --rm alpine /bin/sh
docker run --net=macvlan80 -it --name macvlan80_4 --rm alpine /bin/sh

Containers in host 2 will get ip address in 192.168.3.0/24 network and 192.169.3.0/24 network based on the options mentioned above.

Lets look at Docker networks created in host 1, we can see the macvlan networks “macvlan70” and “macvlan80” as shown below.

$ docker network ls
NETWORK ID          NAME                DRIVER
e5f5f6add03d        bridge              bridge
a1b89ce4bd84        host                host
90b7d5ba61b9        macvlan70           macvlan
bedeca9839e1        macvlan80           macvlan

Lets check connectivity on ip subnet 192.168.x.x/16(vlan70) between Containers in same host and across hosts:

Here, we are inside macvlan70_1 Container in host1:
# ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:C0:A8:02:01
          inet addr:192.168.2.1  Bcast:0.0.0.0  Mask:255.255.0.0
# ping -c1 192.168.2.2
PING 192.168.2.2 (192.168.2.2): 56 data bytes
64 bytes from 192.168.2.2: seq=0 ttl=64 time=0.137 ms

--- 192.168.2.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.137/0.137/0.137 ms
/ # ping -c1 192.168.3.1
PING 192.168.3.1 (192.168.3.1): 56 data bytes
64 bytes from 192.168.3.1: seq=0 ttl=64 time=2.596 ms

--- 192.168.3.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 2.596/2.596/2.596 ms
/ # ping -c1 192.168.3.2
PING 192.168.3.2 (192.168.3.2): 56 data bytes
64 bytes from 192.168.3.2: seq=0 ttl=64 time=1.400 ms

--- 192.168.3.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 1.400/1.400/1.400 ms

Connectivity is also successful in ip subnet 192.169.x.x/16(vlan80) between Containers in same host and across hosts.

Connecting macvlan container to host

By default containers in macvlan network cannot directly talk to host and this is intentional. It is needed to create a macvlan interface in the host to allow the communication between host and container. Also, Containers can expose tcp/udp ports using macvlan network and it can directly be accessed from underlay network.

Lets use an example to illustrate host to container connectivity in macvlan network.

Create a macvlan interface on host sub-interface:

docker network create -d macvlan \
–subnet=192.168.0.0/16 \
–ip-range=192.168.2.0/24 \
-o macvlan_mode=bridge \
-o parent=eth2.70 macvlan70

Create container on that macvlan interface:

docker run -d –net=macvlan70 –name nginx nginx

Find ip address of Container:

docker inspect nginx | grep IPAddress
“SecondaryIPAddresses”: null,
“IPAddress”: “”,
“IPAddress”: “192.168.2.1”,

At this point, we cannot ping container IP “192.168.2.1” from host machine.

Now, let’s create macvlan interface in host with address “192.168.2.10” in same network.

sudo ip link add mymacvlan70 link eth2.70 type macvlan mode bridge
sudo ip addr add 192.168.2.10/24 dev mymacvlan70
sudo ifconfig mymacvlan70 up

Now, we should be able to ping the Container IP as well as access “nginx” container from host machine.

$ ping -c1 192.168.2.1
PING 192.168.2.1 (192.168.2.1): 56 data bytes
64 bytes from 192.168.2.1: seq=0 ttl=64 time=0.112 ms

— 192.168.2.1 ping statistics —
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.112/0.112/0.112 ms

ipvlan

In this section, we will illustrate ipvlan based connectivity with ipvlan l2 mode.

On host 1, create ipvlan sub-interface and Containers.

docker network  create  -d ipvlan \
   --subnet=192.168.0.0/16 \
    --ip-range=192.168.2.0/24 \
	-o ipvlan_mode=l2 \
    -o parent=eth2.70 ipvlan70
docker run --net=ipvlan70 -it --name ipvlan70_1 --rm alpine /bin/sh
eth0      Link encap:Ethernet  HWaddr 08:00:27:FA:9D:0C
          inet addr:192.168.2.1  Bcast:0.0.0.0  Mask:255.255.0.0
docker run --net=ipvlan70 -it --name ipvlan70_2 --rm alpine /bin/sh

docker network  create  -d ipvlan \
   --subnet=192.169.0.0/16 \
    --ip-range=192.169.2.0/24 \
	-o ipvlan_mode=l2 \
    -o parent=eth2.80 ipvlan80
docker run --net=ipvlan80 -it --name ipvlan80_1 --rm alpine /bin/sh
docker run --net=ipvlan80 -it --name ipvlan80_2 --rm alpine /bin/sh

On host 2, create ipvlan sub-interface and Containers.

docker network  create  -d ipvlan \
   --subnet=192.168.0.0/16 \
    --ip-range=192.168.3.0/24 \
	-o ipvlan_mode=l2 \
    -o parent=eth2.70 ipvlan70
docker run --net=ipvlan70 -it --name ipvlan70_3 --rm alpine /bin/sh
docker run --net=ipvlan70 -it --name ipvlan70_4 --rm alpine /bin/sh	  
		  
docker network  create  -d ipvlan \
   --subnet=192.169.0.0/16 \
    --ip-range=192.169.3.0/24 \
	-o ipvlan_mode=l2 \
    -o parent=eth2.80 ipvlan80
docker run --net=ipvlan80 -it --name ipvlan80_3 --rm alpine /bin/sh
docker run --net=ipvlan80 -it --name ipvlan80_4 --rm alpine /bin/sh

Let’s look at networks created in host 1, we can see ipvlan networks “ipvlan70” and “ipvlan80” as shown below.

$ docker network ls
NETWORK ID          NAME                DRIVER
e5f5f6add03d        bridge              bridge
a1b89ce4bd84        host                host
1a1262e008d3        ipvlan70            ipvlan
080b230b892e        ipvlan80            ipvlan

Connectivity is successful in ip subnet 192.168.x.x/16(vlan70) and ip subnet 192.169.x.x/16(vlan80) between Containers in same host and across hosts.

ipvlan l3 mode

There are some issues in getting ipvlan l3 mode to work across hosts. Following example shows setting up ipvlan l3 mode across Containers in a single host:

docker network  create  -d ipvlan \
   --subnet=192.168.2.0/24 \
   --subnet=192.169.2.0/24 \
	-o ipvlan_mode=l3 \
    -o parent=eth2.80 ipvlan
docker run --net=ipvlan --ip=192.168.2.10 -it --name ipvlan_1 --rm alpine /bin/sh
docker run --net=ipvlan --ip=192.169.2.10 -it --name ipvlan_2 --rm alpine /bin/sh

Source

Sunday, June 2, 2024

Message Broker vs. Event Broker

There comes a time in your career as a backend developer when you need to answer this question:

I need to build an asynchronous application using distributed queues, which broker can I use?

Let me stop you there!

Our natural instinct as engineers is to create a list of tools that we know or want to be familiar with (in case it is a new and known technology), and start using it.

Unfortunately, at that exact point in time, we have missed the first and most important question, which needs an answer before all others: What are our present and sometimes-future use cases/requirements, and what tool will best solve them?

This was the beginning of our story when it came to designing a major feature. Our instincts as engineers took over. Our first question was not the most important one, and from there, we found the process of selecting the right tool less effective.

Our team had a few meetings to discuss our need for a distributed queue, where different limitations/features of different technologies (from different paradigms) kept the focus away from our most important requirements, and from reaching a decision and consensus.

At that point, we decided to go back to basics and asked:

What is the use case we are trying to solve, and what are the areas in which we have no room for compromise?

As always, let’s start with the requirements.

Step 1: Fine tune the problem you are trying to solve and how the technology/tool architecture aligns with your goals and considerations

When choosing a message broker or event broker, there are many things to consider: high availability, fault tolerance, multi-tenancy, multiple cloud regions support, ability to support high throughput and low latency — and the list goes on and on.

Most of the time, when reading about the main features of either an event or message broker, we are presented with the most complex use cases, which most companies or products never fully utilize or need.

As engineers, as in life, there is a common saying that applies a lot of the time:

“God is in the details, but the devil is between the lines.”

When choosing between the two paradigms of event broker vs. message broker, the “devil” lies in more low-level technical considerations, such as:

message consumption or production acknowledgment methods, deduplication, prioritization of messages, consumer threading model, message consumption methods, message distribution/fanout support, poison pill handling, etc.

Oranges and apples (differences between concepts)

Step 2: Understand the differences between the two paradigms

Event broker
Stores a sequence of events. Events are usually appended to a log (queue or topic) in the order in which they arrived at the event broker. Events in the topic or queue are immutable and their order cannot be changed.

As events are published to the queue or topic, the broker identifies subscribers to the topic or queue, and makes the events available to multiple types of subscribers.

Producers and consumers need not be familiar with each other.

Events can potentially be stored for days or weeks, as once they are successfully consumed, they are not evicted from the queue/topic.

Message broker

Used for services or components to communicate with each other. It provides the exchange of information between applications by transmitting messages received from the producer to the consumer in an async manner.

It usually supports the concept of queue, where messages are typically stored for a short period of time. The purpose of the messages in the queue is to be consumed as soon as consumers are available for processing, and dropped after the successful consumption of said message.

Order of message processing in the queue is not guaranteed and can be altered.

Message broker vs. event broker

Normally, when dealing with a short-lived command or task-oriented processing, we would favor using a message broker.

For example, let’s say you are working in an e-commerce company and want to add a new product to your company’s website. This could mean that multiple services need to be aware of it and process this request in an async manner.

The diagram above shows the use of RabbitMQ fanout message distribution, where each service has its own queue attached to a fanout exchange.

The products service sends a message to the exchange with the new product information, and in turn, the exchange sends the message to all the attached queues.

After a message is successfully consumed from a queue, it is deleted, as the services involved do not need to retain or reprocess the message again.

When dealing with current or historical events, usually in large volumes of data, which need to be processed in either a single or bulk manner, we would favor an event broker.

For example, let’s say you are working at an entertainment rating website, and you want to add a new feature to display movie writers and directors to your users. The information is historically stored but not accessible to the services in charge of providing this data.

The diagram above shows the use of Kafka as an event broker, allowing it to extract hundreds of millions of movies from the data warehouse in order to append the necessary information to movie information stored by each service.

Kafka can accept a massive amount of data in a relatively short period of time, and consumers can have a separate consumer group to process the movie’s topic stream separately.

Important aspects to be aware of

As I previously mentioned, there are a lot of things to consider when choosing the right paradigm for you.

I would like to discuss some key differences which often can make or break your decision about technology.

For this part, I will compare the two most popular technologies to date: Kafka (event broker) & RabbitMQ (message broker), each representing said paradigms, which I have a working experience with.

I strongly encourage you to take the following points into account in your technology selection process.

Poll vs. push

The way Kafka consumers work is by polling a bulk of messages in order from a topic, which is divided by partitions. Each consumer is assigned the responsibility of consuming from one or more partitions, where partitions are used as the parallelism mechanism for consumers (implicit threading model).

This means that the producer, who is usually in charge of managing the topic, is implicitly aware of the max number of consumer instances that can subscribe to the topic.

The consumer is responsible for handling both success and failure scenarios when processing messages. As messages are being polled in bulk from a partition, the message processing order is guaranteed at the partition level.

The way RabbitMQ consumers receive their messages from the queue is by the broker pushing messages to them.

Each message is processed in a singular atomic fashion, allowing for an explicit threading model by the consumer, without the awareness of the producer of the number of consumer instances.

Successful message processing is the responsibility of the consumer, whereas failure handling is done largely by the message broker.

Message distribution is managed by the broker.

Features such as delayed messages and message prioritization come out of the box, as message processing ordering is mostly not guaranteed by the queue.

Error handling

The way Kafka handles message processing errors is by delegating the responsibility to the consumer.
In case a message was processed a few times unsuccessfully (poison pill), the consumer application will need to keep track of the amount of processing attempts and then produce a message to a separate DLQ (dead letter queue) topic, where it can be examined/re-run later on.

For error handling purposes, the consumer is the one assigned all of the responsibility.

This means that in case you would like to have either retry/DLQ capabilities, it is up to you to provide a retry mechanism and also act as a producer when sending a message to a DLQ topic, which in some edge cases, might lead to message loss.

The way RabbitMQ handles message processing errors is by keeping track of failures in processing a message. After a message is considered a poison pill, it is routed to a DLQ exchange.

This allows for either requeueing of messages or routing to a dedicated DLQ for examination.

In this manner, RabbitMQ provides a guarantee that a message which was not processed successfully will not get lost.

Consumer acknowledgment and delivery guarantees

The way Kafka handles consumer acknowledgment is by the consumer committing messages offset belonging to the bulk of messages polled from the topics partition.

Out-of-the-box, the Kafka client commits the offset automatically, regardless if the message was processed successfully or not, which may lead to message loss, as shown in the image bellow.

This behavior can be changed by the consumer code taking the responsibility of committing the offsets of the fetched messages manually, including handling failures of messages consumption as well.

The way RabbitMQ handles consumer acknowledgment is by the consumer “acking” or “nacking” a message in an atomic per message manner, allowing for a retry policy / DLQ, if needed to be managed by the message broker.

Out-of-the-box RabbitMQ client acknowledgment is done automatically regardless if the message was processed successfully or not. The acknowledgment can be controlled manually by a configuration on the consumer side, allowing the message to be pushed again the to consumer for reprocessing in case of failures/ timeout.

Both RabbitMQ and Kafka provide, for most cases, at least once guarantee for message/event processing, which means the consumer should be idempotent in order to handle multiple processing of the same message/event.

Our process

Step 3: choose the technology by your use case and not the other way around

The most important part for us, was compiling a list of technical criteria for our solution, and assigning “no go” to requirements which we couldn’t live without as a team and as a product.

In the spirit of going back to basics, I used a plain old table to compile and compare the different criteria and also mentioned some “gotcha”s. Remember, “The devil is between the lines.”

This really helped organize and put a focus around what was critical for us and what we couldn’t live without.

For example, one of our “no go” requirements was that we couldn’t afford to lose messages in case there was an error in processing.

As you might remember from the section above, when using Kafka where a DLQ is needed, the consumer is also a DLQ producer. This means that in some cases of failures in the consumer, the message will not be sent to the DLQ topic, causing potential message loss.

At this point, as you might have guessed, we decided to go with the message broker.

Our feature consisted of a command/task-oriented processing use case, and the message broker met all of our product/data volume requirements, and also our team’s needs.

Final thoughts

The messaging and event streaming ecosystems consist of many solutions, each with dozens of different aspects that are important to consider and be familiar with.

It is vital that we enter each ecosystem with our eyes wide open, and have a clear understanding of these different paradigms. They will have a great effect on our day-to-day (and sometimes night) life as engineers.

In my next blog post, I will dive into the comparison table I created between the two paradigms, and deep dive into the more technical aspects of each one of them.

source

Wednesday, February 21, 2024

A Complete Guide to Managing Log Files with Logrotate

Controlling the sizes of log files on a Linux server is crucial due to their continuous growth. As log files accumulate, they can consume valuable storage space, strain server resources, and cause performance and memory issues. To address this problem, log rotation is commonly employed. It involves renaming or compressing log files before they become too large, while also removing or archiving old logs to free up storage space. On most Linux distributions, the preferred tool for log rotation is the logrotate program, which we will be focusing on in this tutorial.

By reading through this article, you will learn how to:

Examine and modify the Logrotate configuration, including both general and application-specific settings.
Create Logrotate configurations for a custom application or service.
Choose the right log rotation strategy for your application.
Debug common log rotation problems.

Prerequisites

Before proceeding with the rest of this tutorial, please ensure that you have:

A basic knowledge of working with the Linux command line.
A Linux server that includes the non-root user with sudo access. We'll be using Ubuntu 22.04 throughout in this guide but everything should work even if you're on some other distribution.
Prior knowledge of how to work with system log files on Linux.

🔭 Want to centralize and monitor your Linux logs?

Head over to Logtail and start ingesting your logs in 5 minutes.

Why file-based logging matters

Sending your application logs to a file is the first step towards persisting them and making them available for historical analysis, auditing, and troubleshooting, although you'll likely want to aggregate them in the cloud to unlock the full potential of your log data.

Even when you've adopted a log management service like Logtail, we generally don't recommend sending the logs to the service directly from the application code for a few reasons:

If a network connection or logging endpoint becomes temporarily unavailable, the application has to attempt resource-intensive retry logic to resume log streaming, which could impact overall performance.
If there's a persistent outage, the logs could be dropped and lost forever which could impact the troubleshooting process, ability to comply with regulations, or even conduct security investigations.

Therefore, we recommend persisting your logs to a local file to provide some redundancy, then use a log forwarder like Vector to transmit them to their final destination. This approach has a few notable advantages:

It decouples the log generation process from the log transmission process. This separation of concerns allows the application to focus on its core functionality without being concerned about the intricacies of log transmission. It also simplifies application development and maintenance, as you can rely on the log forwarder to handle the complexities of log delivery.
Log forwarders can typically aggregate logs from multiple sources, such as different applications or servers, into a centralized location and this flexibility allows you to adapt your log management infrastructure as your needs evolve, without requiring changes to individual applications.
Log forwarders can handle network disruptions, retries, and buffering of log data so that the log data is delivered reliably even in the event of an extended outage.
They can support different log formats and protocols, making it easier to send logs to multiple destinations or perform transformations on the log data.

The necessity of log rotation

Once you've started persisting your logs to local files, you'll need to implement a process for keeping individual files from becoming too large, and also a way to remove or archive older logs that are no longer needed to free up disk space.

When log files get too large, they become tedious to work with and searching for the records relevant to your current tasks can take a long time due to the large volume of records.

Therefore, implementing log rotation to spread the log data over several files and to remove older items is a must. It involves renaming log files on a predefined schedule or when the file reaches a predefined size. Once the specified condition is met, the log file is renamed to preserve its contents and make way for a new file.

Typically an auto incrementing number or timestamp is appended to the filename to indicate its time of rotation which is often helpful in narrowing down your search when investigating an issue that occurred on a specific date.

After the file is renamed, a new log file with the same name is created to capture the latest entries from the application or service. A cleanup process is also initiated to prevent an accumulation of rotated log files as older logs beyond a specified retention period are removed. This process repeats indefinitely as long as the log rotation mechanism is working.

Getting started with Logrotate

The logrotate daemon is pre-installed and active by default in Ubuntu and most mainstream Linux distributions. If Logrotate is not installed on your machine, ensure to install it first through your distribution's package manager.

logrotate --version

Output

logrotate 3.19.0

    Default mail command:       /usr/bin/mail
    Default compress command:   /bin/gzip
    Default uncompress command: /bin/gunzip
    Default compress extension: .gz
    Default state file path:    /var/lib/logrotate/status
    ACL support:                yes
    SELinux support:            yes

The Logrotate daemon uses configuration files to specify all the log rotation details for an application. The default setup consists of the following aspects:

/etc/logrotate.conf: this is the main configuration file for the Logrotate utility. It defines the global settings and defaults for log rotation that are applied to all log files unless overridden by individual Logrotate configuration files in the /etc/logrotate.d/ directory.
/etc/logrotate.d: this directory includes files that configure log rotation policies specific to the log files produced by a individual applications or services.

We will examine both configuration possibilities below.

The main Logrotate configuration

First off, let's view the main Logrotate configuration file at /etc/logrotate.conf. Go ahead and print its contents with the cat utility:

cat /etc/logrotate.conf

The command above prints the entire contents of this file:

# see "man logrotate" for details

# global options do not affect preceding include directives

# rotate log files weekly
weekly

# use the adm group by default, since this is the owning group
# of /var/log/syslog.
su root adm

# keep 4 weeks worth of backlogs
rotate 4

# create new (empty) log files after rotating old ones
create

# use date as a suffix of the rotated file
#dateext

# uncomment this if you want your log files compressed
#compress

# packages drop log rotation information into this directory
include /etc/logrotate.d

# system-specific logs may also be configured here.

Here's a description of what each of the above configuration directives mean (lines that begin with # indicate a comment):

weekly: represents the frequency of log rotation. Alternatively, you can specify another time interval (hourly, daily, monthly, or yearly). Since the logrotate utility is typical run once per day, you may need to change this configuration if a if a shorter rotation frequency than daily is desired (see below).
su root adm: log rotation is performed with the root user and admin group. By using this directive, you can ensure that the rotated log files are owned by a specific user and group, which can be useful for access control and permissions management. This is particularly relevant when the log files need to be accessed or managed by a specific user or group with appropriate privileges.
rotate 4: log files are rotated four times before old ones are removed. If rotate is set to zero, then the old versions are removed immediately and not rotated. If it is set to -1, the older logs will not be remove at all except if affected by maxage.
create: immediately after rotation, create a new log file with the same name as the one just rotated.
dateext: if this option is enabled, rotated log files will be renamed by appending a date to their filenames, allowing for better organization and differentiation of log files based on the date of rotation (especially when the frequency of rotation is daily or greater). The default scheme for rotated files is logname.1, logname.2, and so on, but enabling this option changes it to logname.YYYYMMDD. You can change the date format through the dateformat and dateyesterday directives.
compress: this rule determines whether old log files should be compressed (using gzip by default) or not. Log compression is turned off by default but you can enable it to save on disk space.
include: this directive is used to include additional configuration files or snippets. It allows you to modularize and organize your Logrotate configuration by splitting it into multiple files. In this case, the files in the /etc/logrotate.d directory have been included in the configuration.

As noted earlier, the /etc/logrotate.conf file serves as a global configuration file for Logrotate, providing default settings and options for log rotation across the system. It sets the stage for log rotation but can be extended or overridden by the configuration files in the /etc/logrotate.d/ directory which typically configure the rotation policy for specific application logs.

Application-specific configuration

Next, let's view the contents of the /etc/logrotate.d directory. It typically contains additional Logrotate configuration files for various applications or services installed on your machine:

ls /etc/logrotate.d/

Output

alternatives  apt      btmp  rsyslog                 ufw                  wtmp
apport        bootlog  dpkg  ubuntu-advantage-tools  unattended-upgrades

You will observe that quite a few programs have their log rotation configuration in this directory. Each configuration file within /etc/logrotate.d/ focuses on a particular application or log file set, specifying the log file path, rotation frequency, compression settings, and any additional directives necessary for managing the logs of that specific application or service.

Having separate configuration files in this directory allows for easy customization and maintenance of log rotation settings for individual applications or services without affecting other log files. For example, let's take a look at the config file for the Rsyslog utility through the cat command:

cat /etc/logrotate.d/rsyslog

You'll see the program's output appear on the screen:

Output

/var/log/syslog
/var/log/mail.info
/var/log/mail.warn
/var/log/mail.err
/var/log/mail.log
/var/log/daemon.log
/var/log/kern.log
/var/log/auth.log
/var/log/user.log
/var/log/lpr.log
/var/log/cron.log
/var/log/debug
/var/log/messages
{
        rotate 4
        weekly
        missingok
        notifempty
        compress
        delaycompress
        sharedscripts
        postrotate
                /usr/lib/rsyslog/rsyslog-rotate
        endscript
}

The above configuration specifies the rotation rules for several log files located in the /var/log/ directory. It also includes the following directives in addition to the ones we examined in the previous section:

missingok: continue log rotation without reporting any error if any of the specified log files are missing.
notifempty: ensures that log files are not rotated if they are empty. If a log file is empty, it won't trigger rotation.
delaycompress: delays compression of the rotated log files until the next rotation cycle. This allows for the previous log file to be available for analysis before compression.
sharedscripts: ensures that the commands or scripts specified in the prerotate or postrotate directive are executed only once, regardless of the number of log files being rotated. By default, logrotate executes the commands/scripts separately for each log file being rotated.
postrotate and endscript: encloses the commands or scripts to be executed after log rotation. In this case, the /usr/lib/rsyslog/rsyslog-rotate script is executed after a successful rotation. It sends the SIGHUP signal to the Rsyslog service so that it can close and reopen the log file for writing.

Overall, this configuration ensures that the specified log files are rotated weekly, compressed, and limited to a maximum of 4 rotated log files. It also includes additional directives for handling missing or empty log files and executes a post-rotation script specific to Rsyslog.

Other useful directives to note include:

size: specifies the maximum size in bytes, kilobytes, megabytes, or gigabytes that a log file can reach before rotation is initiated. This causes the default schedule to be ignored if as long as size is specified after the time directive (hourly, daily, etc).

/etc/logrotate.d/myapp

/var/log/myapp.log {
    daily
    size 10M
    . . .
}

In this example, Logrotate will trigger rotation when myapp.log reaches 10 megabytes in size. Once the size threshold is crossed, rotation will be initiated regardless of the time schedule (daily in this case).

minsize: the log files are rotated according to the specified time schedule, but not before the specified size is reached. Therefore, when minsize is used, both file size and timestamp are considered to determine if the file should be rotated.

/etc/logrotate.d/myapp

/var/log/myapp.log {
    daily
    minsize 10M
    . . .
}

When using minsize, rotation will not occur until the file reaches a minimum of 10 megabytes even if the daily schedule is met.

maxsize: specifies that the log files are rotated once they exceed the stated file size, even when the time interval has not yet been reached.

/etc/logrotate.d/myapp

/var/log/myapp.log {
    weekly
    maxsize 10M
    . . .
}

In this snippet, rotation will occur when a size of 10 megabytes is reached. Otherwise, it will rotate weekly.

Choosing the appropriate log rotation strategy

Logrotate offers two directives that specify how the log rotation should be handled: create and copytruncate. The former is the default, and its works by renaming a log file (say myapp.log) to myapp.log.1, before creating a new myapp.log file will be created to continue logging.

/etc/logrotate.d/myapp

/var/log/myapp.log {
    rotate 7
    create
    . . .
}

In copytruncate mode, the myapp.log file is copied to a new myapp.log.1 file, then the original file is emptied (truncated), allowing the application to continue writing to it as if it were a new file. This mode is useful if your application or process does not handle log file rotation gracefully by automatically switching to the new log file after rotation.

/etc/logrotate.d/myapp

/var/log/myapp.log {
    rotate 7
    copytruncate
    . . .
}

It's worth noting that while copytruncate avoids interrupting the logging process, it may cause a brief period of log loss during the rotation process since the original file is truncated. However, this is usually acceptable for applications that don't rely on continuous log analysis and can tolerate occasional gaps in the logs.

Configuring log rotation for a custom application

So far, we've seen how Logrotate can be used to manage the log files for system services and pre-installed utilities on your Linux server. Now, let's look at how to do the same thing for custom applications or services that you've deployed to the server.

To simulate an application that writes logs continuously to a file, create the following bash script somewhere on your filesystem. It writes fictional but realistic-looking log records to a file every second:

logify.sh

#!/bin/bash

logfile="/var/log/logify/log_records.log"

# Function to generate a random log record
generate_log_record() {
    local loglevel=("INFO" "WARNING" "ERROR")
    local services=("web" "database" "app" "network")
    local timestamps=$(date +"%Y-%m-%d %H:%M:%S")
    local random_level=${loglevel[$RANDOM % ${#loglevel[@]}]}
    local random_service=${services[$RANDOM % ${#services[@]}]}
    local message="This is a sample log record for ${random_service} service."

    echo "${timestamps} [${random_level}] ${message}"
}

# Main loop to write log records every second
while true; do
    log_record=$(generate_log_record)
    echo "${log_record}" >> "${logfile}"
    sleep 1
done

Save the file, then make it executable:

chmod +x logify.sh

Afterward, create the /var/log/logify directory using elevated privileges, then change the ownership of the directory to your user so that the script can write files to the directory:

sudo mkdir /var/log/logify

sudo chown -R $USER:$USER /var/log/logify/

You can now execute the script and it should begin to write the logs to the file every second:

./logify.sh

cat /var/log/logify/log_records.log

Output

2023-04-29 08:07:25 [WARNING] This is a sample log record for database service.
2023-04-29 08:07:26 [ERROR] This is a sample log record for app service.
2023-04-29 08:07:27 [INFO] This is a sample log record for app service.
2023-04-29 08:07:28 [ERROR] This is a sample log record for app service.
2023-04-29 08:07:29 [INFO] This is a sample log record for database service.
2023-04-29 08:07:31 [ERROR] This is a sample log record for web service.
2023-04-29 08:07:32 [INFO] This is a sample log record for network service.
2023-04-29 08:07:33 [INFO] This is a sample log record for network service.
. . .

At this stage, you must set up a log rotation policy to prevent the log_records.log file from growing too large and taking up valuable disk space on the server. There are two main options for doing this:

Create a new Logrotate configuration file and place it in the /etc/logrotate.d/ directory to perform log rotation according to the system's default schedule (it runs once per day by default but you can change it.
Create the configuration file that is independent of the system's Logrotate schedule and execute Logrotate at your preferred pace using through a cronjob.

Creating a standard Logrotate configuration

In this section, you will create a standard configuration file for your application logs and place it in the /etc/logrotate.d/ directory. Go ahead and create a new logify file in the /etc/logrotate.d/ directory with your text editor:

sudo nano /etc/logrotate.d/logify

Add the following text to the file:

/var/log/logify/*.log
{
    daily
    missingok
    rotate 7
    compress
    notifempty
}

The configuration above applies to all the files ending with .log in the /var/log/logify/ directory. We've already discussed what each directive here does earlier, so we won't go over that again here.

Save the file and test the new configuration by executing the command below. The --debug option instructs logrotate to operate in test mode where only debug messages are printed.

sudo logrotate /etc/logrotate.conf --debug

You should spot an entry for the logify configuration that looks similar to what is displayed below:

Output

. . .

rotating pattern: /var/log/logify/*.log
 after 1 days (7 rotations)
empty log files are not rotated, old logs are removed
considering log /var/log/logify/log_records.log
Creating new state
  Now: 2023-04-29 10:35
  Last rotated at 2023-04-29 10:00
  log does not need rotating (log has already been rotated)

. . .

The above output indicates that the configuration file at /etc/logrotate.d/logify has been found by the logrotate program. Therefore, the log files specified therein will now be rotated according to the defined policy along with the other system and application logs.

If you want to test that the log rotation works without without waiting for the specified schedule, you can use the -f/--force option like this:

sudo logrotate -f /etc/logrotate.d/logify

You will observe that the old log file was renamed and compressed and a new one was created:

ls /var/log/logify/

Output

log_records.log  log_records.log.1.gz

Another way to verify if a particular log file is rotating or not, and to check the last date and time of its rotation, examine the /var/lib/logrotate/status file (or /var/lib/logrotate/logrotate.status on Red Hat systems) like this:

sudo cat /var/lib/logrotate/status | grep 'logify'

Output

"/var/log/logify/log_records.log" 2023-4-29-10:39:50

Creating a system-independent Logrotate configuration

As mentioned earlier, a system-independent Logrotate configuration is one that is not run on the default system schedule. Such a configuration will not be included in the /etc/logrotate.d/ directory. Instead, you place the file in some other directory and create a cron job that will execute Logrotate with the configuration file at custom time interval.

Change into your home directory, and create a logify directory therein:

cd ~

mkdir logify

Next, edit your logify.sh script and change the logfile variable to the following:

logfile="$HOME/logify/log_records.log"

Afterward, rerun the script so that the logs are now written to the ~/logify directory:

./logify

To create a system-independent Logrotate configuration for these logs, you must create your configuration file outside of /etc/logrotate.d/. Therefore, go ahead and create a logrotate.conf file within the ~/logify directory:

nano ~/logify/logrotate.conf

Populate the file with the following contents:

logify/logrotate.conf

/home/<user>/logify/*.log
{
    hourly
    missingok
    rotate 7
    compress
    notifempty
}

This configuration is the same as in the previous section, except that daily has been changed to hourly so that the log files are rotated every hour instead of once per day.

You also need to create a Logrotate state file which stores information such as the last rotation date and time, the number of rotations performed, and other relevant details. This allows Logrotate to accurately perform rotations and prevent unnecessary rotations when they are not required.

In the default Logrotate setup, the state file is located in the /var/lib/logrotate/ directory. However, we will create a custom one through the command below:

logrotate ~/logify/logrotate.conf --state ~/logify/logrotate.state

The --state option tells logrotate to use an alternative state file located at ~/logify/logrotate.state. The logrotate command will create this file if it doesn't already exist, and you can view its content with cat:

cat /home/<user>/logify/logrotate.state

You'll see the program's output appear on the screen:

Output

logrotate state -- version 2
"/home/<user>/logify/log_records.log" 2023-4-29-20:0:0

The output indicates that Logrotate identified the relevant log file and when it last considered them for rotation. The next step here is to set up a cron job to execute the logrotate file at your desired frequency (hourly in this case).

Go ahead and open the cron jobs configuration file by executing crontab -e in your terminal:

crontab -e

The -e option is used to edit the current user's cron jobs using the editor specified by the $VISUAL or $EDITOR environmental variables. The above command should open a configuration file in your preferred text editor specified by one of these variables.

At the bottom of the file, add the following line:

0 * * * * /usr/sbin/logrotate /home/<user>/logify/logrotate.conf --state /home/<user>/logify/logrotate.state

This new line specifies that the cron job will be executed every hour (at minute 0), and the logrotate command will run with your custom configuration and state file. The full path of the logrotate binary is used here just to be safe.

Save and close the modified file. You will observe the following output:

crontab: installing new crontab

Now that your log rotation policy is all set up, you can view the ~/logify directory after an hour to confirm that the log file therein are rotated according to the defined policy. For more details about cron jobs see the following tutorial or type man cron in your terminal.

Changing the system Logrotate schedule

As mentioned earlier, when using the default system configuration, Logrotate only runs once per day which means using the hourly option in a configuration will be ineffective. However, you can modify this behaviour by changing the location of the script that runs Logrotate. On Ubuntu, its located at /etc/cron.daily/logrotate which indicates that the script is run once per day by the system's daily cronjob. If you want to change the schedule to hourly, move the script to the /etc/cron.hourly/ directory using the command below:c

sudo mv /etc/cron.daily/logrotate /etc/cron.hourly

Afterward, the script should be executed by the system's hourly cronjob so that the hourly option works normally henceforth.

Running commands or scripts before or after log rotation

Logrotate provides the ability to run arbitrary commands or scripts before and after log rotation through the prerotate and postrotate directives. As their names implies, the former executes commands or scripts before log rotation while the latter does the same thing after log rotation. Both directives are closed using the endscript directive.

You can use prerotate to perform any necessary preparations or actions required prior to the rotation, while postrotate should be used to perform tasks such as restarting services, notifying stakeholders, or further processing of the rotated log files.

For example, you can monitor your log rotation configuration by pinging a monitoring service like Better Uptime so that if the rotation does not execute as scheduled, you'll get an alert to investigate the problem further.

~/logify/logrotate.conf

/home/<user>/logify/*.log
{
    hourly
    missingok
    rotate 7
    compress
    notifempty
    sharedscripts
    postrotate
      curl https://betteruptime.com/api/v1/heartbeat/<heartbeat_id>
    endscript
}

In this example, postrotate is used to report that the log rotation was successful according to the defined schedule. If this report is not received within the expected period, an incident will be created and you will receive notifications at the configured channels (such as Email, Slack, SMS, etc). Its always a good idea to set up such monitoring so that if there's an issue with the rotation, you catch and fix it quickly before it causes more severe problems.

Note that postrotate commands or scripts are only executed when at least one file that matches the specified pattern was rotated. The sharedscripts directive above is used to specify that the commands in prerotate and postrotate blocks should be run only once no matter how many log files were rotated. Normally, the commands are run once per rotated log file which is not ideal in this scenario.

If prerotate or postrotate commands or scripts are not executing as expected, ensure that they have the correct permissions and are executable. You can use the chmod +x command to make the scripts or binaries executable where applicable. Additionally, double-check that the paths to the scripts are accurate and that any dependencies required by the scripts are installed.

Modifying Logrotate access permissions

As seen earlier in /etc/logrotate.conf, Logrotate performs its duties with the privileges of the root user and the adm group. This allows the tool to perform the log rotation operation with elevated permissions, typically required to access and manage system logs.

This also means that newly created log files by the tool will be owned by the root user and group, but this may sometimes prevent the application or service producing the logs from being able to access the file. In such situations, you need to modify your settings to ensure that the right access permissions are set on the file.

Hence, the create directive provides a few additional options:

/etc/logrotate.d/myapp

/var/log/myapp.log
{
    create 644 <user> <group>
}

In this example, when Logrotate creates a new log file (myapp.log) after rotation, it will set the file permissions to 644 (read-write for the owner, and read-only for the group and others). The file will be owned by myuser and assigned to mygroup.

Debugging Logrotate problems

You need to ensure that the Logrotate utility is running correctly at all times so that your scheduled log rotation tasks are executed as expected. If log files are not rotating as expected, it could be due to incorrect configuration or permissions issues.

To fix such problems, first check the Logrotate status file at /var/lib/logrotate/status to ensure that the log file is indeed included in the rotation schedule and to confirm when it was last rotated.

sudo cat /var/lib/logrotate/status

If a pattern that matches the log file is not included here, you may need to verify if a corresponding configuration file for the application or service is present in the /etc/logrotate.d/ directory.

The logrotate command also provides a helpful -d/--debug option to test and debug configuration issues by simulating log rotation without actually rotating the logs. For example, if you notice that the rotated logs are not being compressed and you run logrotate in debug mode, you may observe the following output indicating that the compress directive was misspelled:

sudo logrotate /etc/logrotate.d/logify --debug

Output

. . .
reading config file /etc/logrotate.d/logify
error: /etc/logrotate.d/logify:7 unknown option 'compresss' -- ignoring line
. . .

Another useful option is -v/--verbose which provides detailed output and information about the log rotation process. When enabled, Logrotate displays additional messages, including the files being rotated, the actions taken, and any errors or warnings encountered during the rotation.

If you're running logrotate through a cronjob, you can specify the --verbose option and redirect its standard output and standard error to a file using the syntax below:

0 * * * * /usr/sbin/logrotate -v /home/<user>/logify/logrotate.conf --state /home/<user>/logify/logrotate.state >> </path/to/logrotate.log> 2>&1

For the system cronjob, you must edit the logrotate script that's located in /etc/cron.daily/ by default. Note that when enabling verbose mode here, it'll include information about all logs being rotated on the system which can be pretty huge and mostly irrelevant. We recommend using the cron method shown above if you only care about the logs for a specific service or application.

/etc/cron.daily/logrotate

#!/bin/sh

# skip in favour of systemd timer
if [ -d /run/systemd/system ]; then
    exit 0
fi

# this cronjob persists removals (but not purges)
if [ ! -x /usr/sbin/logrotate ]; then
    exit 0
fi

/usr/sbin/logrotate -v /etc/logrotate.conf >> </path/to/logrotate.log> 2>&1

EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
exit $EXITVALUE
~

The next time Logrotate executes, the logrotate.log file will be created in the specified directory and you'll find all the details of the log rotation. Here's some example output from a successful rotation attempt:

/path/to/logrotate.log

reading config file /home/betterstack/logify/logrotate.conf
acquired lock on state file /home/betterstack/logify/logrotate.stateReading state from file: /home/betterstack/logify/logrotate.state
Allocating hash table for state file, size 64 entries
Creating new state
Creating new state
Creating new state
Creating new state

Handling 1 logs

rotating pattern: /home/betterstack/logify/*.log
 forced from command line (7 rotations)
empty log files are not rotated, old logs are removed
considering log /home/betterstack/logify/log_records.log
  Now: 2023-04-30 09:19
  Last rotated at 2023-04-30 09:19
  log needs rotating
considering log /home/betterstack/logify/logrotate.log
  Now: 2023-04-30 09:19
  Last rotated at 2023-04-30 09:00
  log does not need rotating (log is empty)
rotating log /home/betterstack/logify/log_records.log, log->rotateCount is 7
dateext suffix '-2023043009'

. . .

Once you're collecting Logrotate logs as above, you can forward them to Logtail so that you can easily search for key events and receive alerts when an error is encountered.

Final thoughts

In this tutorial, we explored log rotation in Linux and its implementation using the Logrotate program. We began by examining the configuration files and discussing key directives commonly encountered. We then created a standard Logrotate configuration for a custom application and then transitioned to a system-independent configuration, before discussing some common problems with Logrotate and how to troubleshoot them effectively.

To further expand your knowledge of Logrotate and explore its full capabilities, I encourage you to consult its manual page. Simply run man logrotate in your terminal to access the comprehensive documentation.

Thanks for reading, and happy logging!

Source