Integration of LVM With Hadoop Cluster

Shubhamkhandelwal
6 min read · Nov 4, 2020


What is LVM?

The Logical Volume Manager (LVM) introduces an extra layer between the physical disks and the file system, allowing file systems to:
* be resized and moved easily, online, without requiring a system-wide outage
* use discontiguous space on disk
* carry meaningful names, rather than the usual cryptic device names
* span multiple physical disks

Logical volume names are conventionally prefixed with “lv”.

What is Elasticity?

Do you remember what college physics taught us about elasticity? Let me define it: elasticity is the property of a substance that enables it to change its length, volume, or shape in direct response to a force, and to recover its original form once the force is removed.

What is a Physical Volume?

A physical volume (PV) can be a disk partition, a whole disk, a meta-device, or a loopback file. Use the pvcreate command to initialize storage for use by LVM; initializing a block device as a physical volume places a label at the start of the device.

Physical volume names are conventionally prefixed with “pv”.

What Is a Volume Group?

A volume group (VG) is a logical collection, or purposeful grouping, of storage space that may span multiple physical devices. It combines physical and logical volumes into a single administrative element within UNIX® and Linux® logical volume management (LVM). A physical volume may be an internal or external storage device. Logical volumes are concatenations, or chained lists, of storage space that may join portions of several physical volumes. Under Linux®, the volume group is the highest level of abstraction in the logical volume manager utility.

Volume group names are conventionally prefixed with “vg”.

Let’s Start with the Task:-

Task Description 📄

🌀7.1: Elasticity Task

✱Integrating LVM with Hadoop and providing Elasticity to Data Node Storage

✱Increase or Decrease the Size of Static Partition in Linux.

Solution Part :-

First of all, we have to connect the DataNode to the Hadoop cluster (the NameNode). Here I have connected one DataNode to the cluster, so you can see the DataNode is contributing 40 GB to the cluster. But we want to limit and control that storage contribution; let’s see how.

How to Connect the Datanode to the Namenode:

You can refer to the link given below for this prerequisite:

https://shubhamkhandelwal523.medium.com/how-to-connect-datanode-slave-to-namenode-master-b42a8ae26092

Step 1: Add physical hard disks to our DataNode.

  • First, attach the two EBS volumes to the Hadoop EC2 instance.
  • Here I have added two volumes:

The first is /dev/sdb (20 GB in size) and the second is /dev/sdc (20 GB in size).

  • We can check the attached disks using the command “fdisk -l” or “lsblk”.
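The check above can be sketched as follows. The device names /dev/sdb and /dev/sdc are the ones used in this setup; note that on some EC2 instance types the attached EBS volumes may instead appear as /dev/xvdb or /dev/nvme1n1.

```shell
# List all block devices; the two new 20 GB disks should appear
# with no partitions and no mount points.
lsblk

# Show detailed information for the new disks (requires root).
fdisk -l /dev/sdb /dev/sdc
```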

Step 2: Convert the hard disks into physical volumes (PVs). A volume group can be created from physical volumes only.

  • To convert a hard disk into a physical volume, the command is as below:

pvcreate /dev/sdb (first disk) and pvcreate /dev/sdc (second disk)

  • To check whether the physical volumes were created, we can use the command:

pvdisplay /dev/sdb (first disk) and pvdisplay /dev/sdc (second disk)
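As a compact sketch of this step (both commands accept several devices at once, so the two disks can be handled in a single invocation; requires root):

```shell
# Initialize both disks as LVM physical volumes in one command.
pvcreate /dev/sdb /dev/sdc

# Verify: pvdisplay shows each PV's size, UUID, and (once assigned)
# its volume group name.
pvdisplay /dev/sdb /dev/sdc

# pvs prints the same information as a one-line-per-PV summary table.
pvs
```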

Step 3: Create a volume group (VG) from the physical volumes.

A VG pools the capacity of the different hard disks into a single unit.

  • For creating VG use the below command :-

vgcreate vg_name /dev/sdb (first disk) /dev/sdc (second disk)

And to check whether the VG was created, use the command below:

vgdisplay vg_name (or just vgdisplay)
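A concrete sketch of this step; “vg_hadoop” is a hypothetical name standing in for vg_name (requires root):

```shell
# Create a volume group spanning both physical volumes.
vgcreate vg_hadoop /dev/sdb /dev/sdc

# Verify: VG Size should be close to 40 GB (2 x 20 GB),
# minus a small amount of LVM metadata.
vgdisplay vg_hadoop
```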

Step 4: Create a partition (logical volume) in the volume group, of the size you want to contribute to the NameNode.

To create the partition, use the command below:

lvcreate --size 10G --name lv_name vg_name (vg_name is the name we gave the volume group)

And to check whether the partition was created, use the command below:

lvdisplay vg_name/lv_name

Here the partition is 10 GB.
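Sketched with concrete values; “vg_hadoop” and “lv_dn” are hypothetical names standing in for vg_name and lv_name (requires root):

```shell
# Carve a 10 GB logical volume out of the volume group.
lvcreate --size 10G --name lv_dn vg_hadoop

# Verify the LV; it is also exposed as /dev/<vg>/<lv>
# and /dev/mapper/<vg>-<lv>.
lvdisplay vg_hadoop/lv_dn
```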

After creating the new partition, you have to format it.

Step 5: Format the partition

So, use the command below to format the volume:

mkfs.ext4 /dev/vg_name/lv_name

Here I have formatted the partition.
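Using the same hypothetical names as above, the formatting step looks like this (requires root, and destroys any data already on the volume):

```shell
# Lay down an ext4 file system on the new logical volume.
mkfs.ext4 /dev/vg_hadoop/lv_dn
```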

After formatting the partition, we have to mount the volume on the DataNode directory.

Step 6: Mount the partition on the DataNode folder

  • To mount the partition, we use the folder that we have configured as the DataNode directory, or we can create our own folder using “mkdir dir_name”.

To mount the partition on the folder “/dn6”, use the command below:

mount /dev/vg_name/lv_name /folder_name

We can check whether the partition is mounted using the command “df -h”.
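Putting this step together (same hypothetical vg_hadoop/lv_dn names; “/dn6” is the DataNode directory used in this walkthrough; requires root):

```shell
# Create the DataNode directory if it does not already exist,
# then mount the logical volume on it.
mkdir -p /dn6
mount /dev/vg_hadoop/lv_dn /dn6

# Confirm the mount point and its size in human-readable units.
df -h /dn6
```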

Step 7: Start the DataNode service and check the volume contributed to the cluster.

To start the DataNode service, use the command:

hadoop-daemon.sh start datanode

And to check the contribution, use the command:

hadoop dfsadmin -report

Here we can see the DataNode contributes a volume of 10 GB.
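As a sketch, assuming the Hadoop 1.x-style scripts are on the PATH and hdfs-site.xml points the DataNode data directory at /dn6:

```shell
# Start the DataNode daemon on this node.
hadoop-daemon.sh start datanode

# Report per-DataNode configured capacity; can be run from any
# cluster node. On Hadoop 2+ the equivalent is: hdfs dfsadmin -report
hadoop dfsadmin -report
```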
  • We can extend the DataNode’s storage contribution to the NameNode on the fly, i.e. without unmounting the volume or stopping any services.
  • We can increase the size only while free space is still available in the volume group.

Step 8: Extend the volume

To extend the volume, use the command below:

lvextend --size +5G /dev/vg_name/lv_name

and check again whether the volume was extended.

  • So here I have given 5 GB more to the volume; before that it was 10 GB, and after extending, the size of the volume is 15 GB.
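With the same hypothetical names, the online extension looks like this (requires root; the `+5G` means “grow by 5 GB”, whereas `--size 15G` would set an absolute size):

```shell
# Grow the logical volume by 5 GB while it stays mounted.
lvextend --size +5G /dev/vg_hadoop/lv_dn

# Verify: LV Size should now read ~15 GiB.
lvdisplay vg_hadoop/lv_dn
```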

Step 9: Resize the file system

To format the extended part, use the command below:

resize2fs /dev/vg_name/lv_name

and check again using the command “df -h”.
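A sketch of the resize (requires root). For a mounted ext4 volume this is an online operation, so nothing needs to be unmounted; note that `lvextend -r` would have performed steps 8 and 9 in a single command.

```shell
# Grow the ext4 file system to fill the newly extended logical volume.
resize2fs /dev/vg_hadoop/lv_dn

# The mount should now show ~15 GB of capacity.
df -h /dn6
```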

Step 10: Now, again check the size of the volume contributed by the DataNode to the NameNode.

hadoop dfsadmin -report

We can also check the DataNode report from the NameNode side, using the same command “hadoop dfsadmin -report”.

So here you can see the partition has been created and mounted successfully on the DataNode folder.

Conclusion:

As you can see above, you can contribute a chosen amount of DataNode storage to the Hadoop cluster dynamically and extend it on the fly, by as much as you want. You should also now have a good idea of Logical Volume Management, and we came to know that LVM helps provide elasticity to storage devices through dynamic partitioning.

So, the Task is successfully done .

Thank you for reading.

LinkedIn :- linkedin.com/in/shubham-khandelwal-a04613144

Instagram:-https://www.instagram.com/shubham.006pvt/
