DRBD (Distributed Replicated Block Device)
DRBD (Distributed Replicated Block Device) is a Linux software solution for replicating data between servers. It ensures data consistency and is often used in high-availability clusters to minimize downtime in case of hardware or software failures.
Preparing Disk Partition
On the sda physical disk we have 100 GB of free space. We'll create a new 50 GB partition named sda4 for DRBD.
To do that, run “fdisk /dev/sda”
The command “fdisk /dev/sda” is used to interact with the disk partitioning utility called fdisk on the /dev/sda device. This command allows us to create, modify, and manage disk partitions on the first SCSI or SATA hard drive (sda) in the system.
Type 'n' to create a new partition.
Welcome to fdisk (util-linux 2.37.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
This disk is currently in use - repartitioning is probably a bad idea.
It's recommended to umount all file systems, and swapoff all swap
partitions on this disk.
Command (m for help): n
Select the partition number
Command (m for help): n
Partition number (4-128, default 4): 4
Accept the defaults for the first and last sectors.
Next, type 't' to change the partition type, select partition 4, and for the partition type or alias enter '8e'.
The hex code '8e' is the type code for Linux LVM, which is what we want this partition to be.
Command (m for help): t
Partition number (1-4, default 4): 4
Partition type or alias (type L to list all): 8e
Type of partition 4 is unchanged: Linux filesystem.
Command (m for help): w
Lastly, type 'w' to write the changes.
Run ‘lsblk’ to see the newly created partition
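Trimmed to the relevant lines, the output should look roughly like this (the sizes and the rest of the partition layout shown here are only illustrative and will differ on your system):
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda      8:0    0  ...  0 disk
...
└─sda4   8:4    0   50G  0 part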
Do the same for the second node
Configuring DRBD
Install the DRBD utilities
sudo apt update
sudo apt install drbd-utils -y
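Optionally, verify that the DRBD kernel module is available before continuing (a quick sanity check; the version reported will depend on your kernel):
sudo modprobe drbd    # load the DRBD kernel module if it is not already loaded
cat /proc/drbd        # prints the module and protocol versions once the module is loaded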
Then create the DRBD configuration
sudo nano /etc/drbd.conf
Set the replication protocol (synchronization type) here
common {
    protocol C;
}
- Protocol A (Async): Allows the primary node to acknowledge writes before they are synchronized with the secondary node. Offers better performance but may have some data consistency delay.
- Protocol B (Semi-Sync): Acknowledges writes once they have been written to the local disk and the replication data has reached the secondary node (but not necessarily its disk), striking a balance between data consistency and performance.
- Protocol C (Sync): Ensures strict data consistency by acknowledging writes only after they have been written to both the primary and secondary nodes. Provides the highest level of data integrity but may impact performance.
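The protocol can also be set per resource instead of globally. As an illustrative sketch (a hypothetical resource r1, not part of this setup), newer DRBD releases expect the protocol inside a net block:
resource r1 {
    net {
        protocol A;    # asynchronous replication for this resource only
    }
    # ... on/device/disk/address definitions as in the r0 example below
}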
Then configure the member nodes
resource r0 {
    on storage1 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.31:7788;
        meta-disk internal;
    }
    on storage2 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.32:7788;
        meta-disk internal;
    }
}
Here’s what we end up with
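Assembled from the two snippets above, the complete /etc/drbd.conf should look roughly like this:
common {
    protocol C;
}

resource r0 {
    on storage1 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.31:7788;
        meta-disk internal;
    }
    on storage2 {
        device /dev/drbd0;
        disk /dev/sda4;
        address 198.18.0.32:7788;
        meta-disk internal;
    }
}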
Next create the metadata for a DRBD resource named “r0”
drbdadm create-md r0
Then start the DRBD service
systemctl start drbd
Now if we run “lsblk”, we can see the drbd0 device created
Run “cat /proc/drbd” to see the DRBD status
Repeat the same process for the second node until we have the same state
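For reference, "cat /proc/drbd" should at this stage report something roughly like the following on both nodes (the exact version string and counters will differ):
version: 8.4.11 (api:1/proto:86-101)
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----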
As we can see here, both nodes are in the secondary state. Run this command on node1 to initiate the sync and make it the primary node
drbdadm -- --overwrite-data-of-peer primary r0/0
Running “cat /proc/drbd” shows that this node has become the primary. We can also see the syncing process
The sync process can also be seen on the secondary node
After a few minutes, the sync process should complete
node1
node2
We can also run ‘drbdadm status’ to see the status
node1
node2
Next, create an ext4 file system on the DRBD block device
mkfs.ext4 /dev/drbd0
- Ext4 is a widely used and robust file system format for Linux
Mounting the DRBD Device
Now on node1, create a directory to use as a mount point and mount the device
mkdir /opt/helenadrbd
mount /dev/drbd0 /opt/helenadrbd
Create a test file with these commands
cd /opt/helenadrbd
touch test1.txt
Now let's unmount the device from node1 and try mounting it on node2.
umount /opt/helenadrbd
Set node1 to become the secondary node
drbdadm secondary r0
Move over to node2 and set it as the primary node
drbdadm primary r0
And mount the DRBD device
mkdir /opt/helenadrbd
mount /dev/drbd0 /opt/helenadrbd
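Listing the mount point should show the file we created earlier on node1, plus the lost+found directory that mkfs.ext4 creates:
ls /opt/helenadrbd
# lost+found  test1.txt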
As we can see, the file inside the device has been replicated and still shows up intact on the second node.
Pacemaker
Pacemaker is a cluster resource manager and high-availability solution that can be used in conjunction with DRBD to create and manage highly available clusters. It provides automated failover and resource management capabilities for services and resources that use DRBD for data replication.
Pacemaker will help us automate the failover process that we had to perform manually earlier. To use Pacemaker, first disable the DRBD systemd service so that the cluster manager, not systemd, controls DRBD.
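On both nodes, that means telling systemd not to start DRBD on its own, for example:
systemctl disable drbd    # Pacemaker will manage DRBD from now on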
Then install pacemaker with these commands
apt install pacemaker pcs -y
apt install resource-agents -y
Configure the Corosync configuration file
Corosync ensures that all nodes in the cluster can communicate and synchronize their activities, making it a foundational component for Pacemaker to function effectively in a cluster environment.
nano /etc/corosync/corosync.conf
# Cluster communication settings
totem {
    version: 2                      # Corosync protocol version
    secauth: off                    # Security authentication is disabled
    cluster_name: helenacluster     # Cluster name
    transport: udpu                 # UDP unicast transport
}

# Node configuration
nodelist {
    node {
        ring0_addr: 198.18.0.31     # First node's IP address
        name: storage1              # First node's name
    }
    node {
        ring0_addr: 198.18.0.32     # Second node's IP address
        name: storage2              # Second node's name
    }
}

# Quorum settings
quorum {
    provider: corosync_votequorum   # Quorum provider
    two_node: 1                     # Allow quorum with just two nodes
}
Restart the services for the config to take effect
systemctl restart corosync
systemctl restart pacemaker
Repeat the process for the second node, and after that run ‘crm status’ to see the pacemaker status
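Optionally, you can also confirm that Corosync itself sees both nodes before looking at Pacemaker (the exact output format varies between versions):
corosync-cfgtool -s      # show the status of the local node's Corosync ring
pcs status corosync      # list the cluster members known to Corosync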
From now on, we manage the Pacemaker cluster configuration only from the primary node.
The first steps are to disable the quorum policy and STONITH, a mechanism for fencing (isolating) failed nodes in the cluster
pcs property set no-quorum-policy=ignore
pcs property set stonith-enabled=false
Next, save a working copy of the cluster configuration (CIB) to a file named "drbdconf", so the HA resources can be defined there and then pushed to the cluster in one step
pcs cluster cib drbdconf
Next, create a Pacemaker resource named "p_drbd_r0" to manage the DRBD resource "r0". This sets parameters for starting, stopping, and monitoring the resource, including timeout values for the various operations.
pcs -f drbdconf resource create p_drbd_r0 ocf:linbit:drbd \
    drbd_resource=r0 \
    op start interval=0s timeout=240s \
    stop interval=0s timeout=100s \
    monitor interval=31s timeout=20s \
    role=Unpromoted monitor interval=29s timeout=20s role=Promoted
Next, run this command to make the "p_drbd_r0" resource promotable in the Pacemaker cluster. It limits how many promoted and clone instances may run in the cluster and on each node, and enables notifications.
pcs -f drbdconf resource promotable p_drbd_r0 \
    promoted-max=1 promoted-node-max=1 clone-max=2 clone-node-max=1 notify=true
Then push the config
pcs cluster cib-push drbdconf
Next, create a Pacemaker resource named “p_fs_drbd0” to manage a filesystem on a DRBD device (/dev/drbd0). The filesystem is mounted at the “/opt/helenadrbd” directory, using the ext4 filesystem type with specific options. It sets parameters for starting, stopping, and monitoring the resource.
pcs -f drbdconf resource create p_fs_drbd0 ocf:heartbeat:Filesystem \
    device=/dev/drbd0 directory=/opt/helenadrbd fstype=ext4 \
    options=noatime,nodiratime \
    op start interval="0" timeout="60s" \
    stop interval="0" timeout="60s" \
    monitor interval="20" timeout="40s"
Then run these commands to define constraints for resource ordering and colocation in the Pacemaker cluster. They ensure that the “p_drbd_r0” resource is promoted before starting the “p_fs_drbd0” resource, and that they are colocated with the “Promoted” role.
pcs -f drbdconf constraint order promote p_drbd_r0-clone then start p_fs_drbd0
pcs -f drbdconf constraint colocation add p_fs_drbd0 with p_drbd_r0-clone INFINITY with-rsc-role=Promoted
Lastly, push the configuration
pcs cluster cib-push drbdconf
Run ‘crm status’ to see the pacemaker status
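On a healthy cluster the output should look roughly like the following (version strings, timestamps, and counts will vary):
Cluster Summary:
  * Stack: corosync
  * Current DC: storage1 (version 2.x) - partition with quorum
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ storage1 storage2 ]

Full List of Resources:
  * Clone Set: p_drbd_r0-clone [p_drbd_r0] (promotable):
    * Promoted: [ storage1 ]
    * Unpromoted: [ storage2 ]
  * p_fs_drbd0  (ocf:heartbeat:Filesystem):  Started storage1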
This cluster summary provides an overview of a Pacemaker cluster with two nodes (“storage1” and “storage2”). It is currently operational with quorum and manages several resources, including a promotable DRBD clone set and a filesystem resource. The current designated controller is “storage1” with a summary showing that the cluster is in a healthy and operational state.
Testing Pacemaker’s Auto Failover
Right now both nodes are up, with node1 as the primary node on which the DRBD device is mounted
Now we'll simulate a node failure by running the command "echo b > /proc/sysrq-trigger", which makes node1 reboot immediately
echo b > /proc/sysrq-trigger
And while node1 is down, node2 takes over as the primary node and automatically mounts the DRBD device
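To verify the failover from node2, a few quick checks (the exact output will depend on your setup):
crm status               # p_fs_drbd0 should now be Started on storage2
df -h /opt/helenadrbd    # the DRBD filesystem should be mounted here
ls /opt/helenadrbd       # test1.txt should still be present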