DRBD (Distributed Replicated Block Device)
DRBD (Distributed Replicated Block Device) is a Linux software solution for replicating data between servers. It ensures data consistency and is often used in high-availability clusters to minimize downtime in case of hardware or software failures.
Preparing Disk Partition
On the sda physical disk we have 100 GB of free space. We'll create a new 50 GB partition named sda4 for DRBD.
To do that, run “fdisk /dev/sda”
The command “fdisk /dev/sda” is used to interact with the disk partitioning utility called fdisk on the /dev/sda device. This command allows us to create, modify, and manage disk partitions on the first SCSI or SATA hard drive (sda) in the system.
Type 'n' to create a new partition.
Welcome to fdisk (util-linux 2.37.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
This disk is currently in use - repartitioning is probably a bad idea.
It's recommended to umount all file systems, and swapoff all swap
partitions on this disk.
Command (m for help): n
Select the partition number
Command (m for help): n
Partition number (4-128, default 4): 4
Accept the defaults for the first and last sectors.
Next, type 't' to change the partition type, select partition 4, and for the partition type or alias enter '8e'.
The hex code '8e' is the type code for Linux LVM, which is what we want this partition to be.
Command (m for help): t
Partition number (1-4, default 4): 4
Partition type or alias (type L to list all): 8e
Type of partition 4 is unchanged: Linux filesystem.
Command (m for help): w
Lastly, type 'w' to write the changes.
Run ‘lsblk’ to see the newly created partition
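Trimmed to the relevant lines, the output should look roughly like this (the sizes and the rest of the partition layout shown here are only illustrative and will differ on your system):
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda      8:0    0  ...  0 disk
...
└─sda4   8:4    0   50G  0 part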
Do the same for the second node
Configuring DRBD
Install the DRBD utilities
sudo apt update
sudo apt install drbd-utils -y
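Optionally, verify that the DRBD kernel module is available before continuing (a quick sanity check; the version reported will depend on your kernel):
sudo modprobe drbd    # load the DRBD kernel module if it is not already loaded
cat /proc/drbd        # prints the module and protocol versions once the module is loaded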
Then create the DRBD configuration
sudo nano /etc/drbd.conf
Set the replication protocol (synchronization type) here
common {
    protocol C;
}
- Protocol A (Async): Allows the primary node to acknowledge writes before they are synchronized with the secondary node. Offers better performance but may have some data consistency delay.
- Protocol B (Semi-Sync): Acknowledges writes once they have been written to the local disk and the replication data has reached the secondary node (but not necessarily its disk), striking a balance between data consistency and performance.
- Protocol C (Sync): Ensures strict data consistency by acknowledging writes only after they have been written to both the primary and secondary nodes. Provides the highest level of data integrity but may impact performance.
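The protocol can also be set per resource instead of globally. As an illustrative sketch (a hypothetical resource r1, not part of this setup), newer DRBD releases expect the protocol inside a net block:
resource r1 {
    net {
        protocol A;    # asynchronous replication for this resource only
    }
    # ... on/device/disk/address definitions as in the r0 example below
}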
Then configure the member nodes
resource r0 {
    on storage1 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.31:7788;
        meta-disk internal;
    }
    on storage2 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.32:7788;
        meta-disk internal;
    }
}
Here’s what we end up with
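Assembled from the two snippets above, the complete /etc/drbd.conf should look roughly like this:
common {
    protocol C;
}

resource r0 {
    on storage1 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.31:7788;
        meta-disk internal;
    }
    on storage2 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.32:7788;
        meta-disk internal;
    }
}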
Next create the metadata for a DRBD resource named “r0”
drbdadm create-md r0
Then start the DRBD service
systemctl start drbd
Now if we run “lsblk”, we can see the drbd0 device created
Run “cat /proc/drbd” to see the DRBD status
Repeat the same process for the second node until we have the same state
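For reference, "cat /proc/drbd" should at this stage report something roughly like the following on both nodes (the exact version string and counters will differ):
version: 8.4.11 (api:1/proto:86-101)
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----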
As we can see here, both nodes are in the secondary state. Run this command on node1 to initiate the sync and make it the primary node
drbdadm -- --overwrite-data-of-peer primary r0/0
Running “cat /proc/drbd” shows that this node has become the primary. We can also see the syncing process
The sync process can also be seen on the secondary node
After a few minutes, the sync process should complete
node1
node2
We can also run ‘drbdadm status’ to see the status
node1
node2
Next, create an ext4 file system on the DRBD block device
mkfs.ext4 /dev/drbd0
- Ext4 is a widely used and robust file system format for Linux
Mounting the DRBD Device
Now on node1, create a directory to use as a mount point and mount the device
mkdir /opt/helenadrbd
mount /dev/drbd0 /opt/helenadrbd
Create a test file with these commands
cd /opt/helenadrbd
touch test1.txt
Now let's unmount the device from node1 and try mounting it on node2.
umount /opt/helenadrbd
Set node1 to become the secondary node
drbdadm secondary r0
Move over to node2 and set it as the primary node
drbdadm primary r0
And mount the DRBD device
mkdir /opt/helenadrbd
mount /dev/drbd0 /opt/helenadrbd
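Listing the mount point should show the file we created earlier on node1, plus the lost+found directory that mkfs.ext4 creates:
ls /opt/helenadrbd
# lost+found  test1.txt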
As we can see, the file inside the device has been replicated and still shows up intact on the second node.
Pacemaker
Pacemaker is a cluster resource manager and high-availability solution that can be used in conjunction with DRBD to create and manage highly available clusters. It provides automated failover and resource management capabilities for services and resources that use DRBD for data replication.
Pacemaker will help us automate the failover process that we had to perform manually earlier. To use Pacemaker, first disable the DRBD systemd service so that the cluster manager, not systemd, controls DRBD.
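On both nodes, that means telling systemd not to start DRBD on its own, for example:
systemctl disable drbd    # Pacemaker will manage DRBD from now on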
Then install pacemaker with these commands
apt install pacemaker pcs -y
apt install resource-agents -y
Configure the Corosync configuration file
Corosync ensures that all nodes in the cluster can communicate and synchronize their activities, making it a foundational component for Pacemaker to function effectively in a cluster environment.
nano /etc/corosync/corosync.conf
# Cluster communication settings
totem {
    version: 2                      # Corosync protocol version
    secauth: off                    # Security authentication is disabled
    cluster_name: helenacluster     # Cluster name
    transport: udpu                 # UDP unicast transport
}

# Node configuration
nodelist {
    node {
        ring0_addr: 198.18.0.31     # First node's IP address
        name: storage1              # First node's name
    }
    node {
        ring0_addr: 198.18.0.32     # Second node's IP address
        name: storage2              # Second node's name
    }
}

# Quorum settings
quorum {
    provider: corosync_votequorum   # Quorum provider
    two_node: 1                     # Allow quorum with just two nodes
}
Restart the services for the config to take effect
systemctl restart corosync
systemctl restart pacemaker
Repeat the process for the second node, and after that run ‘crm status’ to see the pacemaker status
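Optionally, you can also confirm that Corosync itself sees both nodes before looking at Pacemaker (the exact output format varies between versions):
corosync-cfgtool -s      # show the status of the local node's Corosync ring
pcs status corosync      # list the cluster members known to Corosync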
From now on, we manage the Pacemaker cluster configuration only from the primary node.
The first steps are to disable the quorum policy and STONITH, a mechanism for fencing (isolating) failed nodes in the cluster
pcs property set no-quorum-policy=ignore
pcs property set stonith-enabled=false
Next, save a working copy of the cluster configuration (CIB) to a file named "drbdconf", so the HA resources can be defined there and then pushed to the cluster in one step
pcs cluster cib drbdconf
Next, create a Pacemaker resource named "p_drbd_r0" to manage the DRBD resource "r0". This sets parameters for starting, stopping, and monitoring the resource, including timeout values for the various operations.
pcs -f drbdconf resource create p_drbd_r0 ocf:linbit:drbd \
    drbd_resource=r0 \
    op start interval=0s timeout=240s \
    stop interval=0s timeout=100s \
    monitor interval=31s timeout=20s \
    role=Unpromoted monitor interval=29s timeout=20s role=Promoted
Next, run this command to make the "p_drbd_r0" resource promotable in the Pacemaker cluster. It limits how many promoted and clone instances may run in the cluster and on each node, and enables notifications.
pcs -f drbdconf resource promotable p_drbd_r0 \
    promoted-max=1 promoted-node-max=1 clone-max=2 clone-node-max=1 notify=true
Then push the config
pcs cluster cib-push drbdconf
Next, create a Pacemaker resource named “p_fs_drbd0” to manage a filesystem on a DRBD device (/dev/drbd0). The filesystem is mounted at the “/opt/helenadrbd” directory, using the ext4 filesystem type with specific options. It sets parameters for starting, stopping, and monitoring the resource.
pcs -f drbdconf resource create p_fs_drbd0 ocf:heartbeat:Filesystem \
    device=/dev/drbd0 directory=/opt/helenadrbd fstype=ext4 \
    options=noatime,nodiratime \
    op start interval="0" timeout="60s" \
    stop interval="0" timeout="60s" \
    monitor interval="20" timeout="40s"
Then run these commands to define constraints for resource ordering and colocation in the Pacemaker cluster. They ensure that the “p_drbd_r0” resource is promoted before starting the “p_fs_drbd0” resource, and that they are colocated with the “Promoted” role.
pcs -f drbdconf constraint order promote p_drbd_r0-clone then start p_fs_drbd0
pcs -f drbdconf constraint colocation add p_fs_drbd0 with p_drbd_r0-clone INFINITY with-rsc-role=Promoted
Lastly, push the configuration
pcs cluster cib-push drbdconf
Run ‘crm status’ to see the pacemaker status
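On a healthy cluster the output should look roughly like the following (version strings, timestamps, and counts will vary):
Cluster Summary:
  * Stack: corosync
  * Current DC: storage1 (version 2.x) - partition with quorum
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ storage1 storage2 ]

Full List of Resources:
  * Clone Set: p_drbd_r0-clone [p_drbd_r0] (promotable):
    * Promoted: [ storage1 ]
    * Unpromoted: [ storage2 ]
  * p_fs_drbd0  (ocf:heartbeat:Filesystem):  Started storage1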
This cluster summary provides an overview of a Pacemaker cluster with two nodes (“storage1” and “storage2”). It is currently operational with quorum and manages several resources, including a promotable DRBD clone set and a filesystem resource. The current designated controller is “storage1” with a summary showing that the cluster is in a healthy and operational state.
Testing Pacemaker’s Auto Failover
Right now both nodes are up, with node1 as the primary node on which the DRBD device is mounted
Now we'll simulate a node failure by running the command "echo b > /proc/sysrq-trigger", which makes node1 reboot immediately
echo b > /proc/sysrq-trigger
And while node1 is down, node2 takes over as the primary node and automatically mounts the DRBD device
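To verify the failover from node2, a few quick checks (the exact output will depend on your setup):
crm status               # p_fs_drbd0 should now be Started on storage2
df -h /opt/helenadrbd    # the DRBD filesystem should be mounted here
ls /opt/helenadrbd       # test1.txt should still be present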