Step 1 : Create a new user for Hadoop, e.g. hduser, on Ubuntu/Red Hat:
useradd hduser
passwd hduser
# type the password when prompted
Step 2 : Create a new group hadoop and add hduser to it.
addgroup hadoop
adduser --ingroup hadoop hduser
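Note: addgroup and adduser --ingroup are Debian/Ubuntu commands. On Red Hat, a rough equivalent (adjust to your distribution) would be:
groupadd hadoop
usermod -aG hadoop hduser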
Perform all of the following steps as hduser from /home/hduser, otherwise permission-denied problems can occur at some steps.
Step 3 : Download the Hadoop tar file.
Step 4 : Extract it in /home/hduser/.
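As an example, assuming the Hadoop 1.2.1 release is being used (adjust the version and URL to whatever release you actually downloaded):
wget https://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/hadoop-1.2.1.tar.gz
tar -xzf hadoop-1.2.1.tar.gz -C /home/hduser/
mv /home/hduser/hadoop-1.2.1 /home/hduser/hadoop   # so the later paths read /home/hduser/hadoop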
Step 5 : Disable IPv6 as follows:
Open /etc/sysctl.conf and add these lines to it:
# disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
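To apply the change without rebooting, reload the sysctl settings (run as root):
sysctl -p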
Step 6 : Check whether IPv6 is disabled on your machine with the following command (a value of 1 means IPv6 is disabled):
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
Step 7 : Add entries for all machines in /etc/hosts. If you are not able to edit this file, change its permissions so that everyone has access
(chmod 777 /etc/hosts)
for example:
152.144.198.245 tarunrhels1
152.144.198.246 tarunrhels2
152.144.198.247 tarunrhels3
Here tarunrhels1, tarunrhels2 and tarunrhels3 are the machine names we are using for the Hadoop cluster; the list includes both the Namenode and the Datanodes.
Steps for the Namenode
Do all of the above 7 steps on each machine that will be used for Hadoop.
For the Hadoop setup, we have to create one Namenode and make the other machines Datanodes.
Step 8 : If ssh is not running on the machine, first install it.
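If ssh is not installed at all, it can usually be installed with the distribution's package manager, for example (package names may vary):
apt-get install openssh-server   # Ubuntu
yum install openssh-server       # Red Hat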
Generate an ssh key for hduser as:
ssh-keygen -t rsa -P ""
Step 9 : Enable SSH access to the local machine with this newly created key as:
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
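To verify that key-based login works, try:
ssh localhost
It should connect without asking for a password (the first connection may ask you to confirm the host key).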
Step 10 : Hadoop needs temporary directories, both on the local file system and in HDFS, where it generates its data files.
For the local system, create a directory as:
mkdir -p /home/hduser/hdfs
Step 11 : Change the JAVA_HOME path in the conf/hadoop-env.sh file according to the Java installation on the Linux machine.
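For example, if Java is installed under /usr/java/jdk1.7.0_15 (the path used in tip T1 below), the line in conf/hadoop-env.sh would be:
export JAVA_HOME=/usr/java/jdk1.7.0_15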
Step 12 : Change the conf/core-site.xml as:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://152.144.198.245</value>
    <description>The name of the default file system. Either the
    literal string "local" or a host:port for NDFS.
    </description>
    <final>true</final>
  </property>
</configuration>
Step 13 : Change conf/mapred-site.xml as:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>152.144.198.245:50300</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/home/marvin1/mapred/system</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/home/marvin1/cache/mapred/local</value>
    <final>true</final>
  </property>
</configuration>
Step 14 : Change the conf/hdfs-site.xml as:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hduser/hdfs/name</value>
    <description>Determines where on the local filesystem the DFS name
    node should store the name table. If this is a comma-delimited list
    of directories then the name table is replicated in all of the
    directories, for redundancy.
    </description>
    <final>true</final>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hduser/hdfs/data</value>
    <description>Determines where on the local filesystem a DFS data node
    should store its blocks. If this is a comma-delimited list of directories,
    then data will be stored in all named directories, typically on different
    devices. Directories that do not exist are ignored.
    </description>
    <final>true</final>
  </property>
</configuration>
Step 15 : Add a new file conf/masters.
In the masters file, add the entry for the Namenode. The entry will be the IP address of the machine, or localhost.
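For this example cluster the masters file would contain the Namenode address:
152.144.198.245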
Step 16 : Add a new file conf/slaves.
In this slaves file, add the entries for the slaves/Datanodes, i.e. the IP of each slave machine. If we want to treat the master node as a Datanode as well, then add the entry of the master node in slaves too.
The slaves file looks like this:
152.144.198.245 tarunrhels1
152.144.198.246 tarunrhels2
152.144.198.247 tarunrhels3
Step 17 : To format the filesystem for HDFS, run the command:
/home/hduser/hadoop/bin/hadoop namenode -format
###############################################################################
Do the following steps at each Datanode.
Log in to the slave/Datanode (152.144.198.246) and do these steps:
Step 1 to Step 7 as mentioned above.
Step A : Copy conf/core-site.xml, conf/mapred-site.xml and conf/hdfs-site.xml from the Namenode/Master machine to the Datanode/Slave machine, to the same conf/core-site.xml, conf/mapred-site.xml and conf/hdfs-site.xml paths, with the scp command.
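For example, run from the master node (a sketch; adjust the paths if your Hadoop directory is located elsewhere):
scp /home/hduser/hadoop/conf/core-site.xml /home/hduser/hadoop/conf/mapred-site.xml /home/hduser/hadoop/conf/hdfs-site.xml hduser@152.144.198.246:/home/hduser/hadoop/conf/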
Step B : Copy id_rsa.pub from the master node to the slave node with scp as:
scp hduser@152.144.198.245:/home/hduser/.ssh/id_rsa.pub /home/hduser/.ssh/
Step C : Append the master's public key to authorized_keys:
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
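You can verify the setup from the master node; this should log in to the slave without asking for a password:
ssh hduser@152.144.198.246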
Log in to the master node again and do these steps:
Step X_1 : Start the Hadoop processes from the master node as:
/home/hduser/hadoop/bin/start-all.sh
Step X_2 : Check the processes through jps command, it should list down processes as:
50682 TaskTracker
50471 JobTracker
49554 Jps
50084 DataNode
50281 SecondaryNameNode
49881 NameNode
Step X_3 : Run the same jps command at each slave/Datanode machine. It should display results as:
62216 Jps
8122 TaskTracker
Tips :
T1) If the jps command is not working, add your Java installation to the PATH variable as:
export PATH=$PATH:/usr/java/jdk1.7.0_15/bin
jps helps to check the status of hadoop processes.
T2) Check with netstat whether Hadoop is listening on the configured port on the master machine:
netstat -plten | grep java
If the port you added in conf/mapred-site.xml shows up as conflicting in the output of the above command, change the port number in conf/mapred-site.xml, start the Hadoop processes again, and then re-check the port status with the same command.
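For example, to check only the JobTracker port configured above (50300 in this setup):
netstat -plten | grep 50300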
T3) Sometimes Hadoop is not able to write to HDFS and gives a permission denied error or a java.io.IOException; in that case do these steps on the master machine.
Step a) Stop the Hadoop processes:
/home/hduser/hadoop/bin/stop-all.sh
Step b) Delete the data and name folders from the /home/hduser/hdfs folder. (After deleting the name folder you will normally need to re-run the namenode -format command from Step 17.)
Step c) Start Hadoop again through /home/hduser/hadoop/bin/start-all.sh
Hopefully it will start working after this.
T4) Sometimes you may face the problem of safe mode. If safe mode is on, Hadoop starts giving errors. Turn safe mode off with:
/home/hduser/hadoop/bin/hadoop dfsadmin -safemode leave