Getting Started with the MapReduce Framework in Hadoop
 
×÷ÕߣºÖì¾ü»ªRonzhu À´Ô´£º²úÆ·Öйú ·¢²¼ÓÚ£º2015-2-4
 

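MapReduce is a distributed computing model proposed by Google. It was used mainly in search, to solve the problem of computing over massive data sets. For the big-data storage and distributed processing systems used in industry, the new MapReduce introduced in Hadoop 2 is built on YARN, which the project describes as: A framework for job scheduling and cluster resource management.

1. Basic concepts of MapReduce

Baidu Baike: MapReduce is a programming model for parallel computation over large data sets (larger than 1 TB). The concepts "Map" and "Reduce", and their main ideas, are borrowed from functional programming languages, along with features borrowed from vector programming languages. It makes it much easier for programmers who do not know distributed parallel programming to run their programs on a distributed system. The current software implementation specifies a Map function, which maps a set of key-value pairs to a new set of key-value pairs, and a concurrent Reduce function, which is applied to all mapped key-value pairs that share the same key. As for what functional and vector programming languages actually are, I am not too clear on that myself; see the explanation at this link: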

http://www.cnblogs.com/kym/archive/2011/03/07/1976519.html.
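To make the functional-programming origin of "Map" and "Reduce" a little more concrete, here is a minimal sketch using plain Java streams. It has nothing to do with Hadoop itself; the class name FunctionalMapReduce and its two helper methods are made up for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FunctionalMapReduce {
    // "map": apply the same function to every element independently.
    static List<Integer> squares(List<Integer> xs) {
        return xs.stream().map(x -> x * x).collect(Collectors.toList());
    }

    // "reduce": fold a whole collection into one value with an associative operator.
    static int sum(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4);
        System.out.println(squares(input));      // [1, 4, 9, 16]
        System.out.println(sum(squares(input))); // 30
    }
}
```

Because each element is mapped independently and the reduce operator is associative, both steps can be spread over many machines, which is the property MapReduce builds on.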

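My own understanding: MapReduce is a distributed computing model proposed by Google, used mainly in search to compute over massive data sets. When you submit a job to the MapReduce framework, it first splits the job into a number of Map tasks and assigns them to different nodes. Each Map task processes part of the input data and, when it finishes, produces intermediate files that become the input of the Reduce tasks. The main goal of a Reduce task is to gather the output of the preceding Map tasks and write out the combined result.

In other words, HDFS already gives us a high-performance, highly concurrent storage service, but parallel programming is not something every programmer can handle, and if our application itself cannot run in parallel, HDFS alone does not buy us much. The great thing about MapReduce is that it lets programmers who are unfamiliar with parallel programming (like me) still make full use of a distributed system.

A few notes: Hadoop itself is a framework written in Java, based on three Google papers: GFS, BigTable, and MapReduce (the programming model). Google's own implementation is in C++. The MapReduce programming model is highly abstract, and it largely comes down to the figure below. Spark, as a parallel computing framework, differs from Hadoop MapReduce in that it keeps intermediate results (the output of the map functions) in memory instead of writing them to local disk. None of that is the main point, though; the main point is the flow in the figure below:

The figure above is the flow chart from the paper. Everything starts from the user program at the top, which links against the MapReduce library and implements the basic Map and Reduce functions. The execution order is marked with numbers in the figure.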

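1. The MapReduce library first splits the user program's input files into M pieces (M is user-defined), typically 16 MB to 64 MB each, shown as split0 to split4 on the left of the figure. It then uses fork to copy the user process onto other machines in the cluster.

2. One copy of the user program is called the master and the rest are workers. The master does the scheduling: it assigns a job (a Map job or a Reduce job) to each idle worker. The number of workers can also be specified by the user.

3. A worker that is assigned a Map job reads the input data of its split. The number of Map jobs is determined by M, one per split. The Map job extracts key-value pairs from the input, passes each pair to the map function, and buffers the intermediate key-value pairs that map produces in memory.

4. The buffered intermediate pairs are periodically written to local disk, divided into R partitions. R is user-defined, and each partition will later correspond to one Reduce job. The locations of these intermediate pairs are reported to the master, which forwards them to the Reduce workers.

5. The master tells each worker assigned a Reduce job where its partition is located (certainly in more than one place, since the intermediate pairs of every Map job may fall into all R partitions). Once the Reduce worker has read all the intermediate pairs it is responsible for, it sorts them so that pairs with the same key are grouped together. Sorting is required because different keys can map to the same partition, that is, to the same Reduce job (there are only so many partitions).

6. The Reduce worker iterates over the sorted intermediate pairs and, for each unique key, passes the key and its associated values to the reduce function. The output of reduce is appended to the output file for that partition.

7. When all Map and Reduce jobs have finished, the master wakes up the original user program, and the MapReduce call returns to the user program's code.

When everything has finished, the MapReduce output sits in R output files, one per Reduce job. Users usually do not need to merge these R files; instead they pass them as input to another MapReduce program. Throughout the process, input data comes from the underlying distributed file system (GFS), intermediate data lives on the local file system, and the final output is written back to the underlying distributed file system (GFS). Note also the difference between a Map/Reduce job and the map/reduce function: a Map job processes one input split and may call the map function many times, once per input key-value pair; a Reduce job processes the intermediate pairs of one partition, calling the reduce function once for each distinct key, and produces one output file.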
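The seven steps above can be imitated on a single machine. The following is only a toy sketch under my own assumptions: everything lives in memory, there is no master, no fork, and no disk, and the class name MiniMapReduce is invented. The partition formula (hash of the key modulo R) mirrors what Hadoop's default hash partitioner does; the paper's implementation may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MiniMapReduce {
    // Runs word count over the given splits with r reduce partitions.
    // Returns one output map per partition (standing in for the R output files).
    static List<Map<String, Integer>> run(List<String> splits, int r) {
        // One bucket per partition. TreeMap keeps keys sorted, which stands in
        // for the sort step that groups equal keys on the reduce side.
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < r; i++) {
            partitions.add(new TreeMap<>());
        }

        // Map phase: one Map job per split; the map function emits <word, 1>.
        for (String split : splits) {
            for (String word : split.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                int p = (word.hashCode() & Integer.MAX_VALUE) % r;
                partitions.get(p).computeIfAbsent(word, k -> new ArrayList<>()).add(1);
            }
        }

        // Reduce phase: one Reduce job per partition; reduce runs once per distinct key.
        List<Map<String, Integer>> outputs = new ArrayList<>();
        for (TreeMap<String, List<Integer>> partition : partitions) {
            Map<String, Integer> out = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> e : partition.entrySet()) {
                int sum = 0;
                for (int v : e.getValue()) {
                    sum += v;
                }
                out.put(e.getKey(), sum);
            }
            outputs.add(out);
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList("a b a", "b c", "a");
        System.out.println(run(splits, 2));
    }
}
```

Here M is simply the number of strings in the input list and R is the second argument. Every key ends up in exactly one partition, which is why the R output files never need to be merged to get correct per-key totals.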

As for Hadoop MapReduce, its implementation of the model is shown in the (color) figure below. I did not draw this one either; I am just passing it along:

(input) <k1, v1> -> map -> <k2, v2> -> combine -> <k2, v2> -> reduce -> <k3, v3> (output)
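The combine stage in the pipeline above is easy to overlook: it runs on the map side and pre-aggregates the <k2, v2> pairs before they are sent to the reducers. A rough sketch of the idea in plain Java, not using the Hadoop API (the class CombinerSketch and its methods are hypothetical names of mine):

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CombinerSketch {
    // map: one line of text -> a list of <word, 1> pairs (the <k2, v2> stage).
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : line.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(w, 1));
            }
        }
        return out;
    }

    // combine: sum the values per key locally, so fewer pairs leave the map side.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            out.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapped = map("to be or not to be");
        Map<String, Integer> combined = combine(mapped);
        System.out.println(mapped.size() + " pairs before combine");  // 6 pairs before combine
        System.out.println(combined.size() + " pairs after combine"); // 4 pairs after combine
        System.out.println(combined);                                 // {be=2, not=1, or=1, to=2}
    }
}
```

This only works when the reduce operation is associative and commutative, as summation is. That is also why a word count job can reuse its reducer class as the combiner.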

2. MapReduce in Hadoop 1.x

MapReduce in Hadoop has gone through two distinct periods. The original Hadoop MapReduce implementation did a great many things, and its core component, the JobTracker, had to play both father and mother. See the figure below:

Flow and design of the original MapReduce:

1. The user program (JobClient) submits a job, and the job information is sent to the JobTracker. The JobTracker is the center of the MapReduce framework: it communicates with the machines in the cluster at regular intervals (heartbeat), decides which programs run on which machines, and manages failure and restart handling for all jobs.

2. The TaskTracker is a component present on every machine in the MapReduce cluster. Its main job is to monitor the resources of its own machine.

3. The TaskTracker also monitors the tasks running on its machine. It sends this information to the JobTracker via heartbeat, and the JobTracker collects it to decide which machines newly submitted jobs should run on.

Since Hadoop 2 set out to improve on this design, it evidently had some problems. The main ones are:

1. The JobTracker is the central processing point of MapReduce and is a single point of failure.

2. The JobTracker does too much and consumes too many resources. When there are many MapReduce jobs, memory overhead grows large, which also raises the risk of JobTracker failure. This is why the industry generally concludes that the old Hadoop MapReduce tops out at about 4000 nodes.

3. On the TaskTracker side, representing resources simply as a number of map/reduce tasks is too crude: it ignores CPU and memory usage. If two memory-hungry tasks are scheduled onto the same machine, an OOM is likely.

4. On the TaskTracker side, resources are rigidly divided into map task slots and reduce task slots. When the system has only map tasks or only reduce tasks, resources are wasted; this is the cluster utilization problem mentioned earlier.

5. At the source level, the code is very hard to read. A single class often does too many things and runs to more than 3000 lines, so its responsibilities are unclear, which makes bug fixing and version maintenance harder.

6. From an operations standpoint, any change to the existing Hadoop MapReduce framework, important or not (bug fixes, performance improvements, new features), forces a system-level upgrade. Worse, it forces every client of the distributed cluster to upgrade at the same time regardless of user preference, and users waste a lot of time verifying that their existing applications still work on the new Hadoop version.
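The slot problem in the list above (resources rigidly split into map slots and reduce slots) comes down to simple arithmetic. The sketch below is my own simplification, not Hadoop code: it assumes every task needs exactly one unit of capacity, and the class and method names are invented.

```java
class SlotUtilization {
    // Fixed slots: map tasks may only use map slots, reduce tasks only reduce slots.
    static int runningWithFixedSlots(int pendingMaps, int pendingReduces,
                                     int mapSlots, int reduceSlots) {
        return Math.min(pendingMaps, mapSlots) + Math.min(pendingReduces, reduceSlots);
    }

    // Shared pool: any task may use any free unit of capacity.
    static int runningWithSharedPool(int pendingMaps, int pendingReduces, int totalUnits) {
        return Math.min(pendingMaps + pendingReduces, totalUnits);
    }

    public static void main(String[] args) {
        // A map-only workload on a node configured with 4 map slots and 4 reduce slots.
        System.out.println(runningWithFixedSlots(10, 0, 4, 4)); // 4
        System.out.println(runningWithSharedPool(10, 0, 8));    // 8
    }
}
```

With a map-only workload, half of the node sits idle under fixed slots, while a shared pool keeps all eight units busy.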

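3. The new approach in Hadoop 2.x: YARN + MapReduce

First, do not be misled by YARN. It is only responsible for resource scheduling and management; MapReduce is what does the computing. So YARN != MapReduce 2. In the words of an expert:

YARN is not the next-generation MapReduce (MRv2). The next-generation MapReduce is exactly the same as the first generation (MRv1) in its programming interface and data processing engine (MapTask and ReduceTask); MRv2 can be regarded as reusing those modules from MRv1. What differs is the resource management and job management system. In MRv1 both were implemented by the JobTracker, which combined the two functions. In MRv2 the two are separated: job management is implemented by the ApplicationMaster, and resource management is handled by the new system, YARN. Because YARN is general-purpose, it can also serve as the resource management system for other computing frameworks, such as Spark, and is not limited to MapReduce.

From the figure above we can see that MapReduce in Hadoop 1 did just about everything, whereas MapReduce in Hadoop 2 is dedicated to data processing, and YARN serves as the resource manager.

With YARN in place, the official site calls it Apache Hadoop NextGen MapReduce (YARN). Its architecture is shown below:

Hadoop 2 splits the two main functions of the JobTracker, resource management and job scheduling/monitoring, into separate components. The new ResourceManager globally manages the allocation of computing resources to all applications, and each application's ApplicationMaster handles the corresponding scheduling and coordination. An application is either a single job in the classic MapReduce sense or a DAG (directed acyclic graph) of jobs. The ResourceManager and the per-machine NodeManager manage the user processes on each machine and organize the computation.

1. Each application's ApplicationMaster is in effect a framework-specific library. It negotiates resources from the ResourceManager and works with the NodeManagers to run and monitor tasks.

2. In the figure above, the ResourceManager supports hierarchical application queues, each entitled to a share of the cluster's resources. In a sense it is a pure scheduler: it does not monitor or track the status of applications while they run, and it does not restart tasks that fail because of application errors or hardware faults.

The ResourceManager schedules according to each application's resource requirements. Each application needs different types of resources and therefore different containers. Resources include memory, CPU, disk, network, and so on. This differs markedly from the fixed-type resource model of the existing MapReduce, which hurts cluster utilization. The ResourceManager provides a pluggable scheduling policy that is responsible for dividing cluster resources among queues and applications. The scheduler plugin can be based on the existing capacity scheduling and fair scheduling models.

3. In the figure above, the NodeManager is the per-machine framework agent. It is responsible for the containers that execute applications, monitors their resource usage (CPU, memory, disk, network), and reports it to the scheduler.

4. In the figure above, each application's ApplicationMaster is responsible for requesting appropriate resource containers from the scheduler, running tasks, tracking application status and monitoring progress, and handling task failures.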
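To see why scheduling by actual resource requirements helps with the OOM problem of the old slot model, here is a small sketch of a node that grants a container only when both memory and CPU fit. This is not the YARN API; NodeResources and its methods are hypothetical names, and a real scheduler tracks far more than these two numbers.

```java
class NodeResources {
    private int freeMemoryMb;
    private int freeVcores;

    NodeResources(int memoryMb, int vcores) {
        this.freeMemoryMb = memoryMb;
        this.freeVcores = vcores;
    }

    // Grants a container only if the request fits in both dimensions.
    boolean tryAllocate(int memoryMb, int vcores) {
        if (memoryMb > freeMemoryMb || vcores > freeVcores) {
            return false;
        }
        freeMemoryMb -= memoryMb;
        freeVcores -= vcores;
        return true;
    }

    // Returns a finished container's resources to the node.
    void release(int memoryMb, int vcores) {
        freeMemoryMb += memoryMb;
        freeVcores += vcores;
    }

    int freeMemoryMb() {
        return freeMemoryMb;
    }

    public static void main(String[] args) {
        NodeResources node = new NodeResources(8192, 4);
        System.out.println(node.tryAllocate(6144, 1)); // true
        System.out.println(node.tryAllocate(6144, 1)); // false: a second large task does not fit
        System.out.println(node.tryAllocate(1024, 1)); // true: a small task still fits
    }
}
```

Under the slot model both large tasks would have been placed on the node as long as two slots were free; here the second request is refused and can go to another machine.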

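To summarize once more, here is the complete flow when a client submits a job in a Hadoop 2 cluster:

1. The client's MapReduce program is submitted to the Hadoop cluster through the hadoop shell.

2. The program sends information about the packaged jar to the cluster's RM (ResourceManager) over RPC. This can be thought of as requesting a job ID.

3. Based on the submitted information, the RM assigns the job a unique ID and sends the client the HDFS path under which run.jar is to be stored.

4. With that path, the client builds the final storage directory and stores run.jar in that HDFS directory with multiple replicas. The default number of replicas is 10, and it is configurable.

5. The client submits some configuration information, for example the final storage path and the job ID.

6. The RM places this configuration information in a queue, the so-called scheduler. The scheduling algorithm itself need not concern us here.

7. The NMs (NodeManagers) and the RM stay in contact through a heartbeat mechanism, and each NM periodically asks the RM for tasks.

8. The RM starts the job-monitoring process, the ApplicationMaster, on an arbitrary NM (or several). It monitors the execution of the YARN Child processes on the other NMs.

9. After an NM picks up a task and its information, it downloads run.jar from HDFS and starts a YARN Child process on the local machine to run the map or reduce function. The intermediate results produced by map are stored on the local file system.

10. When the program finishes, the result data is written back to HDFS. That is roughly the whole flow.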
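The heartbeat-and-queue part of the flow above (the RM queues work, and the NMs pick it up when they check in) can be sketched as follows. This is a deliberately naive model under my own assumptions: a single FIFO queue, one task handed out per heartbeat, and no resource checks. HeartbeatSketch and its methods are made-up names, not YARN classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class HeartbeatSketch {
    // Pending work held by the (simulated) ResourceManager, in submission order.
    private final Queue<String> pending = new ArrayDeque<>();

    void submit(String taskId) {
        pending.add(taskId);
    }

    // Called by a node on each heartbeat; returns a task id, or null if nothing is pending.
    String heartbeat() {
        return pending.poll();
    }

    // Lets the nodes heartbeat in turn until the queue is drained.
    static Map<String, List<String>> simulate(List<String> tasks, List<String> nodes) {
        HeartbeatSketch rm = new HeartbeatSketch();
        for (String t : tasks) {
            rm.submit(t);
        }
        Map<String, List<String>> assigned = new LinkedHashMap<>();
        for (String n : nodes) {
            assigned.put(n, new ArrayList<>());
        }
        boolean progress = true;
        while (progress) {
            progress = false;
            for (String n : nodes) {
                String task = rm.heartbeat();
                if (task != null) {
                    assigned.get(n).add(task);
                    progress = true;
                }
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        System.out.println(simulate(Arrays.asList("t1", "t2", "t3"), Arrays.asList("nm1", "nm2")));
        // {nm1=[t1, t3], nm2=[t2]}
    }
}
```

The point of the pull style is that the central component never has to push work to a node that may be down: a node that stops sending heartbeats simply stops receiving tasks.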

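4. What YARN means (quoted)

With the arrival of YARN, you are no longer constrained by the simpler MapReduce development model and can build more complex distributed applications. In fact, you can view the MapReduce model as just one of the applications the YARN architecture can run, with YARN exposing more of the underlying framework for custom development. This is very powerful, because the YARN usage model is almost unrestricted and, unlike MRv1, no longer needs to be isolated from other, more complex distributed application frameworks that may exist on the same cluster. It can even be said that as YARN matures it will be able to replace some other distributed processing frameworks, eliminating the resource overhead dedicated to them and simplifying the whole system.

To illustrate the efficiency gain of YARN over MRv1, consider the parallel problem of brute-forcing the old LAN Manager hash, the typical method older versions of Windows used for password hashing. In this scenario the MapReduce approach makes little sense, because the mapping and reducing phases carry too much overhead. A more reasonable approach is to abstract the work distribution so that each container owns a portion of the password search space, enumerates it, and reports whether it found the correct password. The point is that the candidate passwords are generated dynamically by a function (which is admittedly a bit tricky) rather than by mapping every possibility into a data structure, which makes the MapReduce style unnecessary and impractical.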
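The idea of giving each container its own slice of the search space, with candidates generated by a function rather than stored, might look like the sketch below. Several things here are assumptions of mine: it uses SHA-256 from the JDK instead of the LAN Manager hash, the passwords are three lowercase letters, the "containers" are just loop iterations in one JVM, and all names are invented.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class SearchSpaceSplit {
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";
    static final int LENGTH = 3; // 26^3 = 17576 candidates

    // Builds the candidate for a given index on the fly (base-26 digits),
    // so the search space never has to be materialized as key/value data.
    static String candidate(long index) {
        char[] chars = new char[LENGTH];
        for (int i = LENGTH - 1; i >= 0; i--) {
            chars[i] = ALPHABET.charAt((int) (index % ALPHABET.length()));
            index /= ALPHABET.length();
        }
        return new String(chars);
    }

    // Stand-in hash function (SHA-256), not the LAN Manager hash.
    static byte[] hash(String s) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Work of one container: enumerate indexes [from, to) and return a match, or null.
    static String searchSlice(byte[] target, long from, long to) {
        for (long i = from; i < to; i++) {
            String c = candidate(i);
            if (Arrays.equals(hash(c), target)) {
                return c;
            }
        }
        return null;
    }

    // Splits the space evenly across the containers and runs the slices one after another.
    static String search(byte[] target, int containers) {
        long total = 1;
        for (int i = 0; i < LENGTH; i++) {
            total *= ALPHABET.length();
        }
        long chunk = (total + containers - 1) / containers;
        for (int c = 0; c < containers; c++) {
            long from = c * chunk;
            long to = Math.min(total, from + chunk);
            String found = searchSlice(target, from, to);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(search(hash("map"), 4)); // map
    }
}
```

Each slice needs only two numbers as input and reports a single answer, so there is nothing to shuffle or reduce. On a real cluster each call to searchSlice would run in its own container.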

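In short, the problems that fit the MRv1 framework are those that need only an associative array, and they tend to evolve specifically toward big-data operations. But problems will certainly not stay confined to that paradigm forever, because you can now abstract them more easily and write custom clients, application masters, and applications that fit whatever design you want.

5. Writing a simple MapReduce application on YARN

We take the WordCount example from the official Apache Hadoop site to show how a MapReduce program is written.

Source Code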

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Write your own Mapper by extending org.apache.hadoop.mapreduce.Mapper.
    // Type parameters: input <Key, Value> types, then output <Key, Value> types.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {

        // Member variables, reused across calls to map().
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        // key     : offset of the line in the input; can mostly be ignored
        // value   : one line of input text
        // context : the job context, used to emit output pairs
        public void map(Object key, Text value, Context context
                        ) throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one); // emit <word, 1>
            }
        }
    }

    // Write your own Reducer by extending org.apache.hadoop.mapreduce.Reducer.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        // Called once per distinct key with all of that key's values.
        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context
                           ) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result); // emit <word, total count>
        }
    }

    // The main function configures and runs the job.
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // input path
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // output path
        // Submit the job and wait for it; exit the JVM with 0 on success, 1 on failure.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

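6. Source analysis of job submission in Hadoop 2.0

 

At this point, communication with the RM server has been established.

Next comes the submission of the job itself.

7. Remarks (PS)

Writing a MapReduce program is not hard as far as the code itself goes, because the Hadoop framework has already encapsulated everything; all we do is use its APIs to meet our actual business requirements. But MapReduce programs can be extended in many ways, including Partitioner programming, custom sort programming, Combiner programming, and common MapReduce algorithms. In real development, the distributed environment also makes developing and debugging harder. There are many details and points to learn, and it can feel scattered and messy; remote debugging from Eclipse on Windows, for example, is something you will need often. So learning MapReduce has only just begun.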

   