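MapReduce is a distributed computing model proposed by Google. It was used mainly in the search domain, to solve the problem of computing over massive data sets. For the industry's big-data storage and distributed processing systems, the new MapReduce introduced in Hadoop 2 is built on YARN: a framework for job scheduling and cluster resource management.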
1. Basic concepts of MapReduce
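Baidu Baike: MapReduce is a programming model for parallel computation over large data sets (larger than 1 TB). The concepts "Map" and "Reduce", and the main ideas behind them, are borrowed from functional programming languages, along with features borrowed from vector programming languages. It makes it much easier for programmers who know nothing about distributed parallel programming to run their programs on a distributed system.
In current implementations you specify a Map function, which maps a set of key-value pairs to a new set of key-value pairs, and a concurrent Reduce function, which merges all the mapped values that share the same key. As for what functional and vector programming languages actually are, I am not too clear on that myself; see this link for an explanation: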
http://www.cnblogs.com/kym/archive/2011/03/07/1976519.html.
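To make the quoted definition a little more concrete, here is a minimal single-machine sketch in plain Java. It does not use Hadoop at all, and the names `MapReduceIdea`, `map`, `reduce` and `run` are made up for illustration: `map` turns one record into key-value pairs, the driver groups the pairs by key, and `reduce` merges the values that share a key.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-machine illustration of the map/reduce idea; not Hadoop code.
class MapReduceIdea {

    // map: one input record (a line of text) -> a list of intermediate (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // reduce: one key plus all the values that share it -> one merged value.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: apply map to every record, group the pairs by key, then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(run(List.of("hello world", "hello hadoop")));
    }
}
```

The point of the model is that only `map` and `reduce` contain application logic; everything in `run` is what a framework would do for you, spread over many machines.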
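My own understanding: when you submit a computing job to the MapReduce framework, it first splits the job into a number of Map tasks and assigns them to different nodes for execution. Each Map task processes part of the input data. When a Map task finishes, it produces some intermediate files, and these become the input of the Reduce tasks. The main goal of a Reduce task is to gather the output of the preceding Map tasks and write out the combined result.
In other words, HDFS already gives us a high-performance, highly concurrent service, but parallel programming is not something every programmer can pull off, and if our application itself cannot run concurrently, Hadoop's HDFS does not buy us much. The great thing about MapReduce is that it lets programmers who are unfamiliar with parallel programming (like me) still tap the full power of a distributed system.
A few notes here. Hadoop is a framework written in Java and based on Google's three famous papers: GFS, BigTable and MapReduce (the programming model). Google's own implementation is in C++. The MapReduce programming model is highly abstract, but it essentially comes down to the figure below. Spark, another parallel computing framework, differs from Hadoop MapReduce in that it keeps intermediate results (the output of the map functions) in memory rather than writing them to local disk. None of that is the point, though; the point is the flow in the figure below: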

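The figure above is the flow chart given in the paper. Everything starts from the user program at the top. The user program links in the MapReduce library and implements the basic Map and Reduce functions. The order of execution is marked with numbers in the figure.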
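1. The MapReduce library first splits the input files of the user program into M pieces (M is defined by the user), typically 16 MB to 64 MB each; the left side of the figure shows them as split0 to split4. It then starts copies of the user program on other machines in the cluster (the "fork" step in the figure).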
2. One copy of the user program is called the master and the rest are called workers. The master is responsible for scheduling: it assigns a task (a Map task or a Reduce task) to each idle worker. The number of workers can also be specified by the user.
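3. A worker that has been assigned a Map task starts reading the input data of the corresponding split. The number of Map tasks is determined by M, one per split. The Map task extracts key-value pairs from the input data and passes each pair as an argument to the map function; the intermediate key-value pairs produced by the map function are buffered in memory.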
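4. The buffered intermediate key-value pairs are periodically written to local disk, divided into R partitions. R is defined by the user, and each partition will later correspond to one Reduce task. The locations of these intermediate pairs are reported to the master, which is responsible for forwarding them to the Reduce workers.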
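5. The master tells each worker that has been assigned a Reduce task where the partition it is responsible for is located (certainly in more than one place, because the intermediate pairs produced by every Map task may fall into any of the R partitions). Once the Reduce worker has read all the intermediate pairs it is responsible for, it sorts them so that pairs with the same key end up together. Sorting is necessary because different keys can map to the same partition, that is, to the same Reduce task (there are only so many partitions, after all).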
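6. The Reduce worker iterates over the sorted intermediate pairs. For each unique key it passes the key and the associated values to the reduce function, and the output of the reduce function is appended to the output file of this partition.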
7. When all Map and Reduce tasks have finished, the master wakes up the original user program, and the MapReduce call returns to the user program's code.
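When everything has finished, the MapReduce output sits in R output files, one per partition (each corresponding to one Reduce task). Users usually do not need to merge these R files; instead they hand them as input to another MapReduce program. Throughout the process, the input data comes from the underlying distributed file system (GFS), the intermediate data lives on the local file system, and the final output is written back to the underlying distributed file system (GFS).
We should also note the difference between Map/Reduce tasks and map/reduce functions. A Map task processes one split of the input data and may call the map function many times, once for each input key-value pair. A Reduce task processes the intermediate pairs of one partition, calling the reduce function once for each distinct key, and each Reduce task ends up with one output file.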
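The numbered steps can be walked through in memory with a short plain-Java sketch. This is only an illustration under stated assumptions: there is no real master, worker process or GFS, the names `MapReduceFlow`, `mapTask`, `reduceTask` and `run` are hypothetical, word counting stands in for the user's map and reduce functions, and the partition is chosen by hashing the key modulo R.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory walk-through of the numbered steps; no real cluster or GFS involved.
class MapReduceFlow {

    // An intermediate key-value pair produced by the map function.
    static final class Pair {
        final String key;
        final int value;

        Pair(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    // Decide which of the R partitions a key belongs to (hash of the key modulo R).
    static int partition(String key, int r) {
        return (key.hashCode() & Integer.MAX_VALUE) % r;
    }

    // One Map task: process one split, emitting (word, 1) for every word in every record,
    // and bucket the intermediate pairs into R partitions (steps 3 and 4).
    static List<List<Pair>> mapTask(List<String> split, int r) {
        List<List<Pair>> buckets = new ArrayList<>();
        for (int i = 0; i < r; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String line : split) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    buckets.get(partition(word, r)).add(new Pair(word, 1));
                }
            }
        }
        return buckets;
    }

    // One Reduce task: gather its partition from every Map task, sort by key (step 5),
    // then sum the values once per distinct key and append to its output "file" (step 6).
    static List<String> reduceTask(int partitionId, List<List<List<Pair>>> allMapOutputs) {
        TreeMap<String, List<Integer>> sorted = new TreeMap<>();
        for (List<List<Pair>> oneMapOutput : allMapOutputs) {
            for (Pair pair : oneMapOutput.get(partitionId)) {
                sorted.computeIfAbsent(pair.key, k -> new ArrayList<>()).add(pair.value);
            }
        }
        List<String> outputFile = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> e : sorted.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
            }
            outputFile.add(e.getKey() + "\t" + sum);
        }
        return outputFile;
    }

    // Stand-in for the master: one Map task per split, then one Reduce task per partition.
    static List<List<String>> run(List<List<String>> splits, int r) {
        List<List<List<Pair>>> allMapOutputs = new ArrayList<>();
        for (List<String> split : splits) {
            allMapOutputs.add(mapTask(split, r));
        }
        List<List<String>> outputFiles = new ArrayList<>();
        for (int p = 0; p < r; p++) {
            outputFiles.add(reduceTask(p, allMapOutputs));
        }
        return outputFiles;
    }

    public static void main(String[] args) {
        List<List<String>> splits = List.of(
                List.of("a b a", "c"), // split 0 (M = 2 splits in total)
                List.of("b a"));       // split 1
        List<List<String>> out = run(splits, 2); // R = 2 partitions
        for (int p = 0; p < out.size(); p++) {
            System.out.println("part-" + p + ": " + out.get(p));
        }
    }
}
```

With this sample input, key b lands in partition 0 and keys a and c land in partition 1, so the two output files hold disjoint sets of keys. That is exactly why a following MapReduce program can read the R files as they are, without merging them.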
As for Hadoop MapReduce, its implementation of the model is shown in the (colored) figure below. I did not draw this one either, of course; I am just passing it along:

(input) <k1, v1> -> map -> <k2, v2> -> combine -> <k2, v2> -> reduce -> <k3, v3> (output)
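The combine stage in this pipeline keeps the <k2, v2> types unchanged and only shrinks the number of pairs before they leave the Map side. A minimal plain-Java sketch, assuming word counting as the job (the names `CombineSketch`, `map` and `combine` are hypothetical and this is not the Hadoop API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the combine stage in <k2, v2> -> combine -> <k2, v2>.
class CombineSketch {

    // map: emit (word, 1) for every word in the split.
    static List<Map.Entry<String, Integer>> map(String split) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // combine: the same summing logic as reduce, but applied locally to the
    // output of a single Map task, before anything is sent to the Reduce side.
    static List<Map.Entry<String, Integer>> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : mapOutput) {
            sums.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : sums.entrySet()) {
            out.add(Map.entry(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> raw = map("a a a b");
        List<Map.Entry<String, Integer>> combined = combine(raw);
        // prints: 4 pairs before combine, 2 after: [a=3, b=1]
        System.out.println(raw.size() + " pairs before combine, "
                + combined.size() + " after: " + combined);
    }
}
```

Reusing the reduce logic as the combiner only works because addition is associative and commutative; a job that computes an average, for instance, could not combine this way without changing the intermediate value type.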
2. MapReduce in Hadoop 1.x
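MapReduce in Hadoop has gone through two distinct periods. The original Hadoop MapReduce implementation took on a great many responsibilities, and the core of that framework, the JobTracker, had to play both father and mother, so to speak. See the figure below: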
The flow and design of the original MapReduce framework:
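1. First, the user program (JobClient) submits a job, and the job information is sent to the JobTracker. The JobTracker is the center of the MapReduce framework: it communicates periodically with the machines in the cluster (heartbeat), decides which programs should run on which machines, and handles failures, restarts and similar operations for all jobs.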
2. A TaskTracker runs on every machine in the MapReduce cluster. Its main job is to monitor the resources of the machine it runs on.
3. The TaskTracker also monitors the tasks running on its machine. It sends this information to the JobTracker via heartbeat, and the JobTracker collects it in order to decide which machines newly submitted jobs should run on.
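The heartbeat exchange in the three steps above can be pictured with a toy model in plain Java. This is a sketch under assumptions, not Hadoop's actual RPC interface: `ToyJobTracker`, `submitJob` and `heartbeat` are hypothetical names, and resources are reduced to a single count of free slots.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy JobTracker; method names are illustrative, not Hadoop's RPC interface.
class ToyJobTracker {

    // Tasks of all submitted jobs that have not been handed to a TaskTracker yet.
    private final Deque<String> pendingTasks = new ArrayDeque<>();

    // Last free-slot count reported by each TaskTracker (the "collected information").
    private final Map<String, Integer> lastReportedFreeSlots = new HashMap<>();

    // Called by the client: a job is broken into tasks that wait for a free slot.
    void submitJob(String jobName, int taskCount) {
        for (int i = 0; i < taskCount; i++) {
            pendingTasks.add(jobName + "-task" + i);
        }
    }

    // Called by a TaskTracker on every heartbeat: it reports how many slots are free
    // and receives at most that many pending tasks to launch.
    List<String> heartbeat(String trackerName, int freeSlots) {
        lastReportedFreeSlots.put(trackerName, freeSlots);
        List<String> assigned = new ArrayList<>();
        while (assigned.size() < freeSlots && !pendingTasks.isEmpty()) {
            assigned.add(pendingTasks.poll());
        }
        return assigned;
    }

    public static void main(String[] args) {
        ToyJobTracker jobTracker = new ToyJobTracker();
        jobTracker.submitJob("wc", 3);
        System.out.println("tt1 gets " + jobTracker.heartbeat("tt1", 2)); // [wc-task0, wc-task1]
        System.out.println("tt2 gets " + jobTracker.heartbeat("tt2", 2)); // [wc-task2]
    }
}
```

Even in this toy version, every job, every task and every tracker report passes through one object, which is the design property the rest of this section is concerned with.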
Since Hadoop 2 set out to improve on this design, it obviously had some problems. The main ones are:
1. The JobTracker is the central point of MapReduce processing, so it is a single point of failure.
2. The JobTracker does too much work and consumes too many resources. When there are very many MapReduce jobs, the memory overhead becomes large, which in turn raises the risk of a JobTracker failure. This is why the industry generally concluded that the old Hadoop MapReduce tops out at about 4,000 nodes.
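3. On the TaskTracker side, representing resources simply as a number of map/reduce tasks is too crude, because CPU and memory usage are not taken into account. If two memory-hungry tasks are scheduled onto the same machine, an OOM (out of memory) error is likely.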
4. On the TaskTracker side, resources are rigidly divided into map task slots and reduce task slots. When the system has only map tasks or only reduce tasks, resources are wasted; this is the cluster utilization problem mentioned earlier.
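5. At the source level, the code is very hard to read. A single class often does too many things and runs to more than 3,000 lines, so its responsibilities are unclear, which makes bug fixes and version maintenance harder.
6. From an operations point of view, the existing Hadoop MapReduce framework forces a system-wide upgrade for any change, important or not (bug fixes, performance improvements, new features). Worse, it forces every client of the distributed cluster to upgrade at the same time, regardless of what users want, and users then waste a lot of time verifying that their existing applications still work with the new Hadoop version.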
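The slot problem in the list above comes down to simple arithmetic. The sketch below is a hypothetical model (the class `SlotWaste` and both method names are invented), and it assumes that every task needs exactly one unit of capacity:

```java
// Hypothetical arithmetic model of fixed map/reduce slots versus shared capacity.
// Assumes every task needs exactly one unit of capacity.
class SlotWaste {

    // MRv1 style: a map task may only use a map slot, a reduce task only a reduce slot.
    static int runnableWithSlots(int mapSlots, int reduceSlots, int pendingMaps, int pendingReduces) {
        return Math.min(mapSlots, pendingMaps) + Math.min(reduceSlots, pendingReduces);
    }

    // Shared capacity: any task may use any free unit on the machine.
    static int runnableWithSharedCapacity(int totalCapacity, int pendingMaps, int pendingReduces) {
        return Math.min(totalCapacity, pendingMaps + pendingReduces);
    }

    public static void main(String[] args) {
        // A machine with 2 map slots and 2 reduce slots, and a workload of 4 map tasks only.
        System.out.println("fixed slots:     " + runnableWithSlots(2, 2, 4, 0));        // 2
        System.out.println("shared capacity: " + runnableWithSharedCapacity(4, 4, 0)); // 4
    }
}
```

With a map-only workload, half of the machine sits idle under fixed slots even though tasks are waiting, which is the waste the list describes.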
3. The new approach in Hadoop 2.x: YARN + MapReduce
Ê×ÏȵIJ»Òª±»YARN¸øÃÔ»óסÁË,ËüÖ»ÊǸºÔð×ÊÔ´µ÷¶È¹ÜÀí,¶øMapReduce²ÅÊǸºÔðÔËËãµÄ¼Ò»ï,ËùÒÔYARN
!= MapReduce2.ÕâÊÇ´óʦ˵µÄ:
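YARN is not the next-generation MapReduce (MRv2). The next-generation MapReduce is exactly the same as the first generation (MRv1) in its programming interface and its data processing engine (MapTask and ReduceTask); you can think of MRv2 as reusing those MRv1 modules. What differs is the resource management and job management system. In MRv1 both were implemented by the JobTracker, which combined the two functions. MRv2 separates them: job management is handled by the ApplicationMaster, and resource management by the newly added YARN system. Because YARN is general-purpose, it can also serve as the resource management system for other computing frameworks such as Spark, not just MapReduce.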

From the figure above we can see that MapReduce in Hadoop 1 did pretty much everything, whereas MapReduce in Hadoop 2 is dedicated to data processing, with YARN acting as the resource manager.
With YARN in place, the official site calls it Apache Hadoop NextGen MapReduce (YARN). Its architecture is shown below:

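Hadoop 2 splits the two main functions of the JobTracker, resource management and job scheduling/monitoring, into separate components. The new ResourceManager globally manages the allocation of computing resources to all applications, and each application's ApplicationMaster is responsible for the corresponding scheduling and coordination. An application is either a single traditional MapReduce job or a DAG (directed acyclic graph) of jobs. The ResourceManager and the NodeManager on each machine manage the user processes on that machine and organize the computation.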
1. Each application's ApplicationMaster is in fact a framework-specific library. It negotiates resources from the ResourceManager and works with the NodeManagers to run and monitor tasks.
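2. In the figure above, the ResourceManager supports hierarchical application queues, each of which is entitled to a certain share of the cluster's resources. In a sense it is a pure scheduler: it does not monitor or track the status of applications while they run, and it does not restart tasks that fail because of application errors or hardware faults.
The ResourceManager schedules based on each application's resource requirements. Each application needs different types of resources and therefore different containers. Resources include memory, CPU, disk, network and so on. This is clearly different from the fixed-type resource model of the existing MapReduce, a model that hurts cluster utilization. The ResourceManager provides a pluggable scheduling policy that is responsible for dividing cluster resources among multiple queues and applications. The scheduler plug-in can be based on the existing capacity scheduling and fair scheduling models.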
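3. In the figure above, the NodeManager is the per-machine framework agent. It is responsible for the containers in which applications run, monitors their resource usage (CPU, memory, disk, network) and reports it to the scheduler.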
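4. In the figure above, each application's ApplicationMaster is responsible for requesting appropriate resource containers from the scheduler, running tasks, tracking the application's status, monitoring progress, and handling task failures.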
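To see how requirement-based scheduling differs from counting slots, consider the following toy model in plain Java. It is a sketch under loud assumptions: `ToyResourceManager`, `Node`, `register` and `allocate` are hypothetical names rather than YARN APIs, only memory and virtual cores are modeled, and the first-fit policy is a stand-in chosen for brevity, not how the real Capacity or Fair schedulers decide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of requirement-based allocation. None of these names are
// YARN APIs, and first-fit is a stand-in for the real (pluggable) scheduling policy.
class ToyResourceManager {

    // Free capacity that a NodeManager has reported for its machine.
    static final class Node {
        final String name;
        int freeMemoryMb;
        int freeVcores;

        Node(String name, int freeMemoryMb, int freeVcores) {
            this.name = name;
            this.freeMemoryMb = freeMemoryMb;
            this.freeVcores = freeVcores;
        }
    }

    private final List<Node> nodes = new ArrayList<>();

    void register(Node node) {
        nodes.add(node);
    }

    // An ApplicationMaster asks for one container of the given size. The request is
    // granted on the first node that can hold it; null means nothing fits right now.
    String allocate(int memoryMb, int vcores) {
        for (Node node : nodes) {
            if (node.freeMemoryMb >= memoryMb && node.freeVcores >= vcores) {
                node.freeMemoryMb -= memoryMb;
                node.freeVcores -= vcores;
                return node.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ToyResourceManager rm = new ToyResourceManager();
        rm.register(new Node("nm1", 4096, 4));
        rm.register(new Node("nm2", 8192, 8));
        System.out.println(rm.allocate(3072, 1)); // nm1
        System.out.println(rm.allocate(3072, 1)); // nm2, because nm1 has only 1024 MB left
        System.out.println(rm.allocate(6144, 1)); // null, no node has 6144 MB free
    }
}
```

Because the request carries its memory size, two memory-hungry containers are never packed onto a machine that cannot hold both, which addresses the OOM risk of the task-count model.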
To sum up once more, here is the complete flow of a client submitting a job in a Hadoop 2 cluster:

1. The client submits its MapReduce program to the Hadoop cluster through the hadoop shell.
2. The program uses RPC to send information about itself (it is packaged as a jar) to the RM (ResourceManager) in the Hadoop cluster. You can think of this as the step of obtaining a job ID.
3. Based on the submitted information, the RM assigns the job a unique ID and sends the client the HDFS path where run.jar should be stored.
4. Once the client has that path, it builds the final storage directory from it and stores run.jar in that HDFS directory with multiple replicas. The default number of replicas is 10, and it is configurable.
5. The client submits some configuration information, such as the final storage path and the job ID.
6. The RM puts this configuration information into a queue, the so-called scheduler. The scheduling algorithm itself is not something we need to dig into here.
7. The NMs (NodeManagers) and the RM stay in touch through a heartbeat mechanism, and each NM periodically asks the RM for work.
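8. The RM starts the ApplicationMaster, the process that monitors the job, on an NM of its choosing (one ApplicationMaster per job). It monitors the execution of the YarnChild processes on the other NMs.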
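9. After an NM has picked up a task and received the information, it downloads run.jar from HDFS and then starts a YarnChild process on the local machine to execute the map or reduce function. The intermediate results produced by the map function are stored in the local file system.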
10. When the program finishes, the result data is written back to HDFS. That is roughly how the whole flow works.
4. Why YARN matters (quoted)
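With the arrival of YARN, you are no longer constrained by the simpler MapReduce development model and can create more complex distributed applications instead. In fact, you can regard the MapReduce model as just one of the applications that the YARN architecture can run, with more of the underlying framework exposed for custom development. This is very powerful, because YARN's usage model has almost no limits and no longer needs to be segregated from other, more complex distributed application frameworks that may exist on the same cluster, as MRv1 did. It could even be said that as YARN matures it may be able to replace some of those other distributed processing frameworks, completely eliminating the resource overhead dedicated to them and simplifying the overall system.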
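To demonstrate the efficiency gain of YARN over MRv1, consider the parallel problem of brute-forcing the old LAN Manager Hash, the typical method that older versions of Windows used for password hashing. In this scenario the MapReduce approach makes little sense, because the mapping and reducing stages involve too much overhead. A more reasonable approach is to abstract the distribution of work so that each container owns a part of the password search space, enumerates over it, and notifies you if it finds the correct password. The point here is that the candidate passwords are determined dynamically by a function (which is admittedly a little tricky), without having to map every possibility into a data structure, and that makes the MapReduce style unnecessary and impractical.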
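To boil it down, problems under the MRv1 framework were limited to those that need an associative array, and they tended to evolve specifically around big-data manipulation. But problems no longer have to stay confined to this paradigm, because you can now abstract them more simply by writing custom clients, application masters, and applications that fit whatever design you want.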
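The search-space idea from the quoted example can be sketched in plain Java. Several things here are assumptions made for illustration: SHA-256 from the JDK stands in for the LAN Manager hash (which the JDK does not provide), candidates are fixed-length lowercase strings, the names `KeyspaceSearch`, `candidate`, `searchSlice` and `search` are hypothetical, and the "containers" are simply run one after another in a loop instead of being launched through YARN.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch: SHA-256 stands in for the LAN Manager hash, and each
// "container" is just one call to searchSlice over its own part of the search space.
class KeyspaceSearch {

    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    // Generate the index-th candidate of the given length on the fly (base-26 digits),
    // so the candidates never have to be materialized in a data structure.
    static String candidate(long index, int length) {
        char[] chars = new char[length];
        for (int i = length - 1; i >= 0; i--) {
            chars[i] = ALPHABET.charAt((int) (index % ALPHABET.length()));
            index /= ALPHABET.length();
        }
        return new String(chars);
    }

    static byte[] hash(String text) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    // Work of one container: enumerate indices [start, end) and return the match, or null.
    static String searchSlice(long start, long end, int length, byte[] target) {
        for (long i = start; i < end; i++) {
            String text = candidate(i, length);
            if (Arrays.equals(hash(text), target)) {
                return text;
            }
        }
        return null;
    }

    // Split the whole search space into equal slices, one per container.
    static String search(int length, int containers, byte[] target) {
        long total = 1;
        for (int i = 0; i < length; i++) {
            total *= ALPHABET.length();
        }
        long sliceSize = (total + containers - 1) / containers;
        for (int c = 0; c < containers; c++) {
            long start = c * sliceSize;
            long end = Math.min(total, start + sliceSize);
            String found = searchSlice(start, end, length, target);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(search(3, 4, hash("cat"))); // prints cat
    }
}
```

Nothing in this computation is a key-value pair that needs grouping or sorting. Each slice only needs a start index, an end index and the target hash, which is why a custom application that hands out slices to containers fits the problem better than a map stage followed by a reduce stage.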
5. Writing a simple MapReduce application for YARN
We will use the WordCount example from the official Apache Hadoop site to show how a MapReduce program is written.
Source Code
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Our own Mapper; it must extend org.apache.hadoop.mapreduce.Mapper.
  // Type parameters: input <key, value> types, then output <key, value> types.
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    // Kept as member variables so they can be reused across calls.
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    // key     : offset of the line in the input; can mostly be ignored
    // value   : one line of input text
    // context : the context of the computation, used to emit output pairs
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Our own Reducer; it must extend org.apache.hadoop.mapreduce.Reducer.
  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // The main function configures the job and starts it.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // Wait for the job to finish, then exit the JVM: 0 on success, 1 on failure.
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
6. Source code analysis of job submission in Hadoop 2.0







At this point, communication with the RM server has been established.


Next comes the actual submission of the job.





7. Notes (PS)
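Writing a MapReduce program is not hard as far as the code itself goes, because the Hadoop framework has already wrapped up the hard parts. All we have to do is use its APIs to implement our actual business requirements. That said, MapReduce programs can be extended in many ways, including custom Partitioners, custom sort order, Combiners, and the common MapReduce algorithms. In real development, the distributed environment also makes development and debugging harder. There are many details and pieces of knowledge to pick up, and it can feel messy and scattered. Remote debugging from Eclipse on Windows, for example, is something you will need often. So this is only the beginning of learning MapReduce.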