Elasticity Task — LVM on Hadoop
Task 7.1: Elasticity Task
A. Integrating LVM with Hadoop and providing Elasticity to DataNode Storage.
So let's start!
First, we will check the configuration files of the Hadoop NameNode and DataNode, i.e., the core-site.xml and hdfs-site.xml files.
1. For NameNode:
vim /etc/hadoop/core-site.xml
My NameNode IP is 192.168.43.97, which I have set here. Now, in the hdfs-site.xml file:
vim /etc/hadoop/hdfs-site.xml
Here I have first created a directory /nn and then allocated it to HDFS.
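For reference, the two NameNode files typically look like this (a minimal sketch, assuming Hadoop 1.x property names; the port 9001 is an assumption, so adjust it to match your setup):

```xml
<!-- /etc/hadoop/core-site.xml : tells clients where the NameNode lives -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.43.97:9001</value>  <!-- port 9001 is an assumption -->
  </property>
</configuration>

<!-- /etc/hadoop/hdfs-site.xml : where the NameNode keeps its metadata -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/nn</value>
  </property>
</configuration>
```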
Now we have to start the service:
hadoop-daemon.sh start namenode
You can confirm whether the service is running using the jps command.
2. For DataNode:
vim /etc/hadoop/core-site.xml
Here, too, you have to provide the NameNode IP.
vim /etc/hadoop/hdfs-site.xml
Here, too, I have first created a directory /dn1 and then allocated it to HDFS.
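The DataNode side is the mirror image (again a sketch with Hadoop 1.x property names; the port 9001 is an assumption and must match the NameNode's core-site.xml):

```xml
<!-- /etc/hadoop/core-site.xml : points the DataNode at the NameNode -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.43.97:9001</value>  <!-- NameNode IP; port is an assumption -->
  </property>
</configuration>

<!-- /etc/hadoop/hdfs-site.xml : where the DataNode stores its blocks -->
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/dn1</value>
  </property>
</configuration>
```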
Now we have to start the service:
hadoop-daemon.sh start datanode
You can check whether the service is running using the jps command.
Now you can check the report of the HDFS cluster:
hadoop dfsadmin -report
For now, only 1 DataNode is connected, and it is sharing around 50 GiB of storage.
Our task here is to provide elasticity to the storage this DataNode is sharing, with the help of LVM.
To accomplish this we have to follow these steps:
- Step 1: Add volume to the DataNode
- Step 2: Create a physical volume
- Step 3: Create a volume group
- Step 4: Create a logical volume
- Step 5: Format and mount the volume to the DataNode
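At a glance, steps 2 through 5 boil down to the following commands (a sketch using the device names /dev/sdb and /dev/sdc, the volume group name test, and the 15 GiB size that appear in the walkthrough; all of these need root privileges):

```shell
# Step 2: turn the newly attached disks into LVM physical volumes
pvcreate /dev/sdb /dev/sdc
# Step 3: pool both physical volumes into one volume group named "test"
vgcreate test /dev/sdb /dev/sdc
# Step 4: carve a 15 GiB logical volume named "lv1" out of the group
lvcreate --size 15G --name lv1 test
# Step 5: format the logical volume and mount it on the DataNode directory
mkfs.ext4 /dev/test/lv1
mount /dev/test/lv1 /dn1
```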
So let's do it step by step.
Step 1:
To add storage to the DataNode, I have created 2 virtual disks (of size 10 GiB and 20 GiB), Slave-1_1.vdi and Slave-1_2.vdi, and attached them to the DataNode VM.
Now we can check whether the 10 GiB and 20 GiB volumes are attached using the command:
fdisk -l
Step 2:
Now we will create physical volumes from both disks, /dev/sdb and /dev/sdc:
pvcreate /dev/sdb
pvcreate /dev/sdc
Now we can confirm they were created by displaying the physical volumes:
pvdisplay
Step 3:
Now we will create a volume group containing both physical volumes, /dev/sdb and /dev/sdc. Here I'm naming the volume group test; afterwards we can display it using the vgdisplay command.
vgcreate test /dev/sdb /dev/sdc
vgdisplay
It shows that one volume group of size 30 GiB (10 GiB + 20 GiB) has been created.
Step 4:
Now we will create a logical volume of size 15 GiB from the volume group test, which is 30 GiB in total. Here I'm naming the logical volume lv1; afterwards we can display it using the lvdisplay command.
lvcreate --size 15G --name lv1 test
lvdisplay
Step 5:
Now we will format the logical volume before mounting it:
mkfs.ext4 /dev/test/lv1
It's formatted. Now we can mount it on the DataNode directory /dn1 and then check that it is mounted using the df -lh command:
mount /dev/test/lv1 /dn1
df -lh
Almost done! Now, to make this storage available to the cluster, we have to restart the DataNode:
hadoop-daemon.sh stop datanode
hadoop-daemon.sh start datanode
jps
Now, to check the storage shared by the DataNode in the cluster, we can view the report again:
hadoop dfsadmin -report
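This is where the elasticity pays off: because /dn1 sits on a logical volume, its capacity can later be grown on the fly. A sketch, assuming the volume group test still has free extents (these commands also need root privileges):

```shell
# Grow the logical volume by 5 GiB using free space in the "test" group
lvextend --size +5G /dev/test/lv1
# Grow the ext4 filesystem to fill the volume; resize2fs can do this
# online, so /dn1 stays mounted and the DataNode keeps running
resize2fs /dev/test/lv1
# Verify the new size, then re-check the cluster report
df -lh /dn1
hadoop dfsadmin -report
```

Since the filesystem is resized while mounted, the extra capacity shows up in the HDFS report without any downtime for the DataNode.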
Done!!
Thank you!! :)