Hadoop
What Is Apache Hadoop?
The Apache™ Hadoop™ project develops open-source software for reliable, scalable, distributed computing.
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.
1. Download and install Hadoop from the Apache website.
To start or stop Hadoop, run the following scripts from the Hadoop installation directory:
a. bin/start-dfs.sh
b. bin/start-mapred.sh
c. bin/stop-mapred.sh
d. bin/stop-dfs.sh
2. Check http://localhost:50030/jobtracker.jsp for the Hadoop services. One node should be available for executing MapReduce jobs.
Check http://localhost:50070/dfshealth.jsp for the NameNode service.
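The two checks above can also be scripted. The following is a minimal sketch, assuming the single-node localhost URLs given in the text; the names SERVICE_URLS, is_reachable, and check_services are hypothetical helpers invented for illustration, not part of Hadoop.

```python
import http.client
import urllib.error
import urllib.request

# URLs copied from the text; they assume a single-node setup on localhost.
SERVICE_URLS = {
    "JobTracker": "http://localhost:50030/jobtracker.jsp",
    "NameNode": "http://localhost:50070/dfshealth.jsp",
}


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP GET to url answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or malformed URL.
        return False


def check_services(urls=SERVICE_URLS, probe=is_reachable):
    """Map each service name to True or False depending on whether its page loads."""
    return {name: probe(url) for name, url in urls.items()}
```

Calling check_services() after running bin/start-dfs.sh and bin/start-mapred.sh should report True for both services if the web interfaces are up on those ports.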
3. Start PuTTY and log in to the Hadoop server as haddop-user/hadoop.
4. Configuration file: hadoop-site.xml. Set properties there if required.
The following settings are necessary to configure HDFS:
Property          Value format                 Example
fs.default.name   protocol://servername:port   hdfs://master.ora.org:8000
dfs.data.dir      pathname                     /home/username/hdfs/data
dfs.name.dir      pathname                     /home/username/hdfs/name
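Hadoop configuration files are XML documents with a configuration root element that holds one property element (with a name and a value) per setting. As a rough sketch of what the three settings above look like in hadoop-site.xml, the script below generates the file contents from the example values in the text. The helper name build_site_xml is hypothetical, and the host, port, and paths are only the sample values shown above.

```python
import xml.etree.ElementTree as ET

# Example values copied from the text; adjust host, port, and paths for a real cluster.
HDFS_SETTINGS = {
    "fs.default.name": "hdfs://master.ora.org:8000",
    "dfs.data.dir": "/home/username/hdfs/data",
    "dfs.name.dir": "/home/username/hdfs/name",
}


def build_site_xml(settings):
    """Return a <configuration> document with one <property> per setting."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.indent(root)  # pretty-print the tree; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


print(build_site_xml(HDFS_SETTINGS))
```

The printed XML can be saved as hadoop-site.xml in the Hadoop configuration directory, or the same property elements can simply be typed by hand.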
5. A one-time format of HDFS is required before Hadoop can use the file system:
$ bin/hadoop namenode -format
What Is MapReduce?
MapReduce is the Hadoop component that lets the programmer split large files into chunks, process the chunks in parallel, group the intermediate results by key, and store the output.
As a loose SQL analogy, the map phase together with the sort behaves like a GROUP BY on the key, and the reduce phase is like an aggregate function (for example SUM or COUNT) applied to each group.
A MapReduce job splits the input data into independent chunks, which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically both the input and the output of the job are stored in a file system. The framework takes care of scheduling tasks, monitoring them, and re-executing failed tasks.
Let's take a word-count example to understand a MapReduce job:
File 1:
Hello World Bye World
File 2:
Hello World Bye Data
Each map task emits a key-value pair <word, 1> for every word it reads. The first map emits:
< Hello, 1>
< World, 1>
< Bye, 1>
< World, 1>
The second map emits:
< Hello, 1>
< World, 1>
< Bye, 1>
< Data, 1>
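If a combiner is configured, the output of each map is sorted by key and aggregated locally before it is sent to the reducer. The output of the first map:
< Bye, 1>
< Hello, 1>
< World, 2>
The output of the second map:
< Bye, 1>
< Data, 1>
< Hello, 1>
< World, 1>
The reducer sums the values for each key, so the final output of the job is:
< Bye, 2>
< Data, 1>
< Hello, 2>
< World, 3>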
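The word-count flow can be simulated in a few lines of plain Python. This is only an in-memory sketch of the map, combine, and reduce stages, not Hadoop's Java API. The function names map_words, combine, and reduce_counts are hypothetical. Matching is case-sensitive in this sketch, so both inputs are written with a capital "Bye" (an assumption about the intended example).

```python
from collections import Counter
from itertools import chain


def map_words(text):
    """Map step: emit a (word, 1) pair for every word in the input split."""
    return [(word, 1) for word in text.split()]


def combine(pairs):
    """Combiner: sum the counts per key locally and return them sorted by key."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


def reduce_counts(combined_outputs):
    """Reduce step: merge the per-map outputs and sum the counts for each key."""
    totals = Counter()
    for word, count in chain.from_iterable(combined_outputs):
        totals[word] += count
    return sorted(totals.items())


# One input split per file, as in the example.
files = ["Hello World Bye World", "Hello World Bye Data"]
combined = [combine(map_words(text)) for text in files]
for word, count in reduce_counts(combined):
    print(f"<{word}, {count}>")
```

Running it prints <Bye, 2>, <Data, 1>, <Hello, 2>, and <World, 3>, one pair per line. In a real cluster the map and combine calls run on different nodes and the framework performs the sort and the transfer to the reducers.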