
DRBD (Distributed Replicated Block Device)


DRBD (Distributed Replicated Block Device) is a Linux software solution for replicating data between servers. It ensures data consistency and is often used in high-availability clusters to minimize downtime in case of hardware or software failures.




Preparing Disk Partition

On the sda physical disk, we have 100 GB of free space. We’ll create a new 50 GB partition, sda4, to use for DRBD.



To do that, run “fdisk /dev/sda”

The command “fdisk /dev/sda” opens the fdisk disk-partitioning utility on the /dev/sda device, allowing us to create, modify, and manage partitions on the first SCSI or SATA hard drive (sda) in the system.


Type ‘n’ to create a new partition

Welcome to fdisk (util-linux 2.37.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

This disk is currently in use - repartitioning is probably a bad idea.
It's recommended to umount all file systems, and swapoff all swap
partitions on this disk.


Command (m for help): n


Select partition number

Command (m for help): n
Partition number (4-128, default 4): 4


Accept the defaults for the first and last sector


Select ‘t’ to change the partition type, choose partition 4, and enter ‘8e’ as the type alias.
The hex code ‘8e’ is the code for a Linux LVM partition, which is the type we want for this one.

Command (m for help): t
Partition number (1-4, default 4): 4
Partition type or alias (type L to list all): 8e

Type of partition 4 is unchanged: Linux filesystem.

Command (m for help): w

Lastly, type ‘w’ to write the changes to disk



Run ‘lsblk’ to see the newly created partition



Do the same for the second node




Configuring DRBD

Install DRBD

sudo apt update
sudo apt install drbd-utils -y
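
Optionally, you can check that the DRBD kernel module is available after installing the userland tools; this is just a sanity check and not part of the original steps:

sudo modprobe drbd      # load the DRBD kernel module if it is not loaded yet
lsmod | grep drbd       # the drbd module should be listed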


Then create the DRBD configuration file

sudo nano /etc/drbd.conf


Set the replication protocol here

common {
    protocol C;
}
  • Protocol A (Async): The primary acknowledges a write as soon as it has reached the local disk and the local TCP send buffer, before the secondary has received it. Best write performance, but the most recent writes can be lost if the primary fails.
  • Protocol B (Semi-Sync): The primary acknowledges a write once it has reached the local disk and the secondary node’s memory (its network buffer), though not necessarily the secondary’s disk. A middle ground between performance and consistency.
  • Protocol C (Sync): The primary acknowledges a write only after it has been written to disk on both the primary and secondary nodes. Provides the highest level of data integrity but may impact write latency.


Then configure the member nodes

resource r0 {
    on storage1 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.31:7788;
        meta-disk internal;
    }
    on storage2 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.32:7788;
        meta-disk internal;
    }
}


Here’s what we end up with

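To double-check the file before going further, drbdadm can parse the configuration and echo it back; it will complain if there is a syntax error (an optional step, not in the original instructions):

drbdadm dump r0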


Next, create the metadata for the DRBD resource named “r0”

drbdadm create-md r0



Then start the DRBD service

systemctl start drbd


Now if we run “lsblk”, we can see that the drbd0 device has been created



Run “cat /proc/drbd” to see the DRBD status



Repeat the same process for the second node until we have the same state



As we can see here, both nodes are in the secondary state. Run this command on node1 to initiate the sync and make it the primary node

drbdadm -- --overwrite-data-of-peer primary r0/0


Running “cat /proc/drbd” shows that this node has become the primary. We can also see the syncing process



The sync process can also be seen on the secondary node

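To follow the progress without rerunning the command by hand, watch(1) can refresh it periodically (a small convenience, not part of the original steps; the interval is arbitrary):

watch -n 2 cat /proc/drbd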


After a few minutes, the sync process should complete on both nodes



We can also run ‘drbdadm status’ on both nodes to check the replication status



Next, create an ext4 file system on the DRBD block device

mkfs.ext4 /dev/drbd0 


  • Ext4 is a widely used and robust file system format for Linux



Mounting the DRBD Device

Now, on node1, create a directory to use as the mount point and mount the device

mkdir /opt/helenadrbd
mount /dev/drbd0 /opt/helenadrbd


Create a test file with these commands

cd /opt/helenadrbd
touch test1.txt



Now let’s unmount the device from node1 and try mounting it on node2.

umount /opt/helenadrbd


Set node1 to become the secondary node

drbdadm secondary r0



Move over to node2 and set it as the primary node

drbdadm primary r0



And mount the DRBD device

mkdir /opt/helenadrbd
mount /dev/drbd0 /opt/helenadrbd

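Listing the mount point on node2 is a quick way to confirm that the replicated data is there (paths as used earlier in this post):

ls -l /opt/helenadrbd    # test1.txt created on node1 should show up here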

As we can see, the file inside the device is replicated and still shows up intact on the second node.



Pacemaker

Pacemaker is a cluster resource manager and high-availability solution that can be used in conjunction with DRBD to create and manage highly available clusters. It provides automated failover and resource management capabilities for services and resources that use DRBD for data replication.

Pacemaker will help us automate the failover process that we had to do manually earlier. To use Pacemaker, first disable DRBD’s own service, since Pacemaker will be managing the DRBD resource from now on

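Assuming DRBD is managed by systemd (as it is on Ubuntu), disabling its service on both nodes looks roughly like this:

# Run on both nodes: keep DRBD from starting on its own,
# since Pacemaker will be managing the DRBD resource from now on
sudo systemctl disable drbd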


Then install pacemaker with these commands

apt install pacemaker pcs -y
apt install resource-agents -y
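
Since the failover test later in this post reboots a node, you may also want the cluster stack to start automatically at boot; this is an optional extra, not part of the original steps:

sudo systemctl enable corosync pacemaker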


Configure the corosync conf file

Corosync ensures that all nodes in the cluster can communicate and synchronize their activities, making it a foundational component for Pacemaker to function effectively in a cluster environment.

nano /etc/corosync/corosync.conf
# Cluster communication settings
totem {
    version: 2          # Corosync protocol version
    secauth: off        # Security authentication is disabled
    cluster_name: helenacluster   # Cluster name
    transport: udpu     # UDP unicast transport
}

# Node configuration
nodelist {
    node {
        ring0_addr: 198.18.0.31  # First node's IP address
        name: storage1           # First node's name
    }
    node {
        ring0_addr: 198.18.0.32  # Second node's IP address
        name: storage2           # Second node's name
    }
}

# Quorum settings
quorum {
    provider: corosync_votequorum   # Quorum provider
    two_node: 1                     # Allow quorum with just two nodes
}


Restart the services for the config to take effect

systemctl restart corosync
systemctl restart pacemaker
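
To confirm that Corosync can see the nodes after the restart, the standard Corosync utilities are handy (run on either node):

corosync-cfgtool -s      # local ring status
corosync-quorumtool      # membership and quorum information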


Repeat the process for the second node, and after that run ‘crm status’ to see the pacemaker status



From now on, we manage the Pacemaker cluster configuration from the primary node only.

The first steps are to disable the quorum policy and STONITH, a fencing mechanism for isolating failed nodes in the cluster

pcs property set no-quorum-policy=ignore
pcs property set stonith-enabled=false


Next, save a working copy of the cluster configuration (CIB) into a file named “drbdconf”, so the HA resources can be prepared offline and pushed in one step

pcs cluster cib drbdconf  


Next create a Pacemaker resource named “p_drbd_r0” to manage a DRBD resource “r0.” This sets parameters for starting, stopping, and monitoring the resource, including timeout values for various operations.

pcs -f drbdconf resource create p_drbd_r0 ocf:linbit:drbd \
drbd_resource=r0 \
op start interval=0s timeout=240s \
stop interval=0s timeout=100s \
monitor interval=31s timeout=20s role=Unpromoted \
monitor interval=29s timeout=20s role=Promoted


Next run this command to make the “p_drbd_r0” resource promotable in the Pacemaker cluster. It limits the clone to two instances (one per node) with at most one promoted copy, and enables clone notifications.

pcs -f drbdconf resource promotable p_drbd_r0 \
promoted-max=1 promoted-node-max=1 clone-max=2 clone-node-max=1 notify=true


Then push the config

pcs cluster cib-push drbdconf



Next, create a Pacemaker resource named “p_fs_drbd0” to manage a filesystem on a DRBD device (/dev/drbd0). The filesystem is mounted at the “/opt/helenadrbd” directory, using the ext4 filesystem type with specific options. It sets parameters for starting, stopping, and monitoring the resource.

pcs -f drbdconf resource create p_fs_drbd0 ocf:heartbeat:Filesystem \
device=/dev/drbd0 directory=/opt/helenadrbd fstype=ext4 \
options=noatime,nodiratime \
op start interval="0" timeout="60s" \
stop interval="0" timeout="60s" \
monitor interval="20" timeout="40s" 


Then run these commands to define ordering and colocation constraints in the Pacemaker cluster. They ensure that the “p_drbd_r0” resource is promoted before the “p_fs_drbd0” filesystem resource is started, and that the filesystem runs on the node where “p_drbd_r0” holds the “Promoted” role.

pcs -f drbdconf constraint order promote p_drbd_r0-clone then start p_fs_drbd0
pcs -f drbdconf constraint colocation add p_fs_drbd0 with p_drbd_r0-clone INFINITY with-rsc-role=Promoted


Lastly, push the configuration

pcs cluster cib-push drbdconf



Run ‘crm status’ to see the pacemaker status


This cluster summary provides an overview of a Pacemaker cluster with two nodes (“storage1” and “storage2”). The cluster has quorum and manages several resources, including a promotable DRBD clone set and a filesystem resource, with “storage1” as the current designated controller (DC). Overall, the cluster is reported as healthy and operational.
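
If you prefer pcs over crm, roughly the same information can be inspected with its standard subcommands:

pcs status               # cluster, node, and resource summary
pcs constraint           # the ordering and colocation constraints we configured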



Testing Pacemaker’s Auto Failover

Right now both nodes are up, with node1 as the primary node where the DRBD device is mounted



Now we’re going to simulate a node failure by running “echo b > /proc/sysrq-trigger”, which makes node1 reboot immediately

echo b > /proc/sysrq-trigger


And while node1 is down, node2 takes over as the primary node and automatically mounts the DRBD device

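To double-check the takeover from node2, the DRBD role and the mount can be inspected with standard commands:

drbdadm status r0            # node2 should now report itself as Primary
df -h /opt/helenadrbd        # /dev/drbd0 should be mounted here by Pacemaker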


This post is licensed under CC BY 4.0 by the author.