How To Connect DataNode (Slave) To NameNode (Master)

Shubham Khandelwal
5 min read · Nov 4, 2020

Master node

This node manages all cluster services and operations. One Master node is enough for a cluster, but adding a secondary one improves scalability and high availability. The Master node's main job is running the NameNode process, which coordinates Hadoop storage operations.

Slave node

This node provides the required infrastructure, such as CPU, memory, and local disk, for storing and processing data. It runs all the slave processes, the main one being the DataNode process. A cluster generally comprises at least three Slave nodes, but it can easily be scaled up by adding more Slave nodes.

Namenode

This process runs on the Master node and is responsible for coordinating HDFS functions. For example, when the location of a file block is requested, the Master node gets the location from the NameNode process.

Datanode

DataNode is a process that handles the actual reading and writing of data blocks from/to storage. It runs on a Slave node and acts as a slave to the NameNode.

Steps :-

First we have to configure the HDFS cluster on both the DataNode and the NameNode.

  • For this we need the Hadoop and Java software to set up the Hadoop cluster.
  • Both the Hadoop and Java packages can be downloaded from the link below.

Step 1: Installation of Hadoop and Java software

To install the Hadoop and Java software on each node of the cluster, use the commands below:

Note: the installation is the same on both the DataNode and the NameNode.

rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force

Installation of hadoop software

rpm -ivh jdk-8u171-linux-x86.rpm --force

Installation of java software
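
As a quick sanity check before configuring anything (a minimal sketch; it assumes the two RPMs above installed a package named hadoop and put the hadoop and java binaries on the PATH), you can confirm both installations:

rpm -q hadoop     # prints the installed Hadoop package version
java -version     # prints the installed JDK version
hadoop version    # confirms the hadoop CLI itself works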

Inside the NameNode:

Step 2: Configuration of “hdfs-site.xml” and “core-site.xml”.

Now we will configure the “hdfs-site.xml” and “core-site.xml” files inside the /etc/hadoop directory.

  • Inside “hdfs-site.xml” we will add the code below:

<property>
<name>dfs.name.dir</name>
<value>/nn3</value>
</property>

The property name inside <name></name> must be the same on every NameNode, but we can give any directory inside <value></value>. I have given “/nn3”; you can give any path, e.g. /nn, /nn1, etc.

namenode hdfs-site.xml configuration
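
For reference, the complete file might look like this (a minimal sketch: the <configuration> root element is the standard wrapper for Hadoop config files, and /nn3 is just the example directory chosen above):

<?xml version="1.0"?>
<!-- /etc/hadoop/hdfs-site.xml on the NameNode -->
<configuration>
<property>
<name>dfs.name.dir</name>
<!-- local directory where the NameNode keeps its metadata -->
<value>/nn3</value>
</property>
</configuration>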
  • Inside “core-site.xml” we will add the code below:

<property>
<name>fs.default.name</name>
<value>hdfs://0.0.0.0:9001</value>
</property>

The same rule applies to “core-site.xml”: the property name must stay exactly as given, and inside <value></value> we give the address and port the NameNode will listen on.

Inside the NameNode we gave the neutral IP 0.0.0.0. This is a wildcard address: it makes the NameNode accept connections on all of its network interfaces, so other systems can reach it over either its private or its public IP.

In my case I am using port 9001; you can give any free port number, as long as the DataNodes use the same one.

namenode core-site.xml configuration
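
Putting it together, the NameNode's core-site.xml might look like this (again a sketch; only the fs.default.name property is needed for this setup):

<?xml version="1.0"?>
<!-- /etc/hadoop/core-site.xml on the NameNode -->
<configuration>
<property>
<name>fs.default.name</name>
<!-- listen on all interfaces, port 9001 -->
<value>hdfs://0.0.0.0:9001</value>
</property>
</configuration>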
  • Before connecting any DataNode or using any storage, we need to format the NameNode directory using:

hadoop namenode -format
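
After formatting (a sketch; /nn3 is the example directory configured above), the NameNode metadata structure should exist on disk:

ls /nn3/current    # should show files such as fsimage, edits, and VERSION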

Step 3: Start NameNode services

  • To start or stop the NameNode services, use the commands below:

hadoop-daemon.sh start namenode

hadoop-daemon.sh stop namenode

We can verify whether the NameNode services have started by using the “jps” command.

We can check on which port it is listening through the “netstat -tnlp” command, as shown below.
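
For example (a sketch; the grep pattern assumes the port 9001 chosen above):

jps                          # a “NameNode” entry should appear in the list
netstat -tnlp | grep 9001    # shows the NameNode process listening on port 9001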

  • Here, all the NameNode services are started.
  • Now we will move on to the DataNode side.

Inside the DataNode:

Step 4: Configuration of “hdfs-site.xml” and “core-site.xml”.

In the same way, we will configure the “hdfs-site.xml” and “core-site.xml” files inside the /etc/hadoop directory on the DataNode.

  • Inside “hdfs-site.xml” we will add the code below:

<property>
<name>dfs.data.dir</name>
<value>/dn6</value>
</property>

The property name inside <name></name> must be the same on every DataNode, but we can give any directory inside <value></value>. I have given “/dn6”; you can give any path, e.g. /dn, /dn1, etc.

datanode hdfs-site.xml configuration
  • Inside “core-site.xml” we will add the code below:

<property>
<name>fs.default.name</name>
<value>hdfs://192.168.1.2:9001</value>
</property>

In the <value></value> we give the IP of the NameNode and the port number 9001 that we configured on the NameNode.

datanode core-site.xml configuration
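
Before starting the DataNode, it can help to confirm the NameNode is reachable from this machine (a sketch; 192.168.1.2 and 9001 are the example IP and port from above, and nc may need to be installed separately):

ping -c 2 192.168.1.2     # basic network reachability
nc -zv 192.168.1.2 9001   # checks that the NameNode port accepts connections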

Note: unlike the NameNode, the DataNode directory does not need a separate format step; it is initialized automatically the first time the DataNode daemon starts and registers with the NameNode.

Step 5: Start DataNode services.

  • To start or stop the DataNode services, use the commands below:

hadoop-daemon.sh start datanode

hadoop-daemon.sh stop datanode

We can verify whether the DataNode services have started by using the “jps” command.

  • Here, all the DataNode services are started.
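
A quick check on the DataNode side (a sketch; the exact log directory and file name depend on the RPM install, the user, and the hostname, so adjust the path as needed):

jps                                                 # a “DataNode” entry should appear
tail -f /var/log/hadoop/*/hadoop-*-datanode-*.log   # watch for successful registration with the NameNode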

Hence, HDFS, i.e. the Hadoop cluster, is configured successfully 👍🏻. We can verify this by using the command:

hadoop dfsadmin -report

Here, you can see that the “datanode” is connected to the “namenode”.

cluster is configured

We can check the report on both the NameNode and the DataNode using the same “hadoop dfsadmin -report” command.
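
As a final smoke test (a sketch using the standard HDFS shell commands; test.txt is just a throwaway example file), you can write a file into HDFS and read it back:

echo "hello hdfs" > test.txt
hadoop fs -put test.txt /    # upload the file into HDFS (blocks land on the DataNode)
hadoop fs -ls /              # the uploaded file should be listed
hadoop fs -cat /test.txt     # read it back through the cluster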

Thank you for reading.😃

  • Here I have explained every concept needed to connect the nodes of a Hadoop cluster.

LinkedIn :- linkedin.com/in/shubham-khandelwal-a04613144

Instagram:-https://www.instagram.com/shubham.006pvt
