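1 Overview
1.1 Background
As automation work on a product deepens, the number of automated cases keeps growing. Most cases have no dependencies on one another, yet because they run serially, test execution time reaches the scale of hours and is hard to optimize. In addition, problems such as contention for the machine resources that CI runs need, and the instability of the machines themselves, keep growing.
The Hadoop distributed test execution scheme was created to solve these problems. Distributed execution lets cases run in parallel and so improves execution efficiency. Hadoop also provides mechanisms such as scheduling and retry, which give users a relatively transparent pool of computing resources and reduce their dependence on the environment of any particular machine.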
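1.2 Choice of distributed platform
This scheme uses Hadoop as the distributed platform, for three reasons. First, Hadoop is an open-source project with very good technical support. Second, Hadoop has mature distributed scheduling algorithms that make good use of each machine's CPU and memory, with the aim of allocating computing resources optimally. Third, Hadoop programs are easy to write and maintain.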
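1.3 Glossary
Hadoop: an open-source distributed framework from the Apache foundation.
MapReduce: Hadoop's computing model, made up of map tasks and reduce tasks.
JobTracker: the master controller of the Hadoop computing system.
TaskTracker: a worker node of the Hadoop computing system.
Slot: the smallest unit of compute allocation on a TaskTracker. One slot can hold one map task. Each machine starts one TaskTracker, and its slots are allocated according to the machine's CPU core count, typically "cores - 1".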
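2 Distributed test execution scheme
2.1 The traditional single-machine test execution flow
A typical single-machine test flow has five steps, as shown in the figure below:
1. Install libs. This covers the test framework's libs and the product business-layer libs built on that framework.
2. Install the test environment. This mainly means the environment of the system under test, including the database, the server side, and so on.
3. Download cases. Fetch the cases to execute from SVN or the case repository.
4. Run the cases.
5. Send the report.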

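Single-machine execution has simple logic and is easy to implement. Its drawback is that cases must run serially, so the machine's CPU and memory cannot be used effectively. For example, take a test machine with 8 cores and 16 GB of memory, where each case averages 10% CPU usage and consumes 1 GB of memory. Under these conditions at least 6 cases can generally run in parallel, and the efficiency gain is obvious.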
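The parallelism figure in this example can be sanity-checked with a short calculation. The sketch below assumes the 10% CPU figure is a share of the whole machine (the text does not say whether it is per core), and the function name is purely illustrative.

```python
def max_parallel_cases(cpu_percent_per_case, mem_gb_per_case, total_mem_gb):
    """Upper bound on concurrent cases from per-case CPU share and memory use."""
    # Assumption: cpu_percent_per_case is a percentage of the whole machine.
    by_cpu = 100 // cpu_percent_per_case
    by_mem = total_mem_gb // mem_gb_per_case
    # The scarcer resource determines the bound.
    return min(by_cpu, by_mem)


# 16 GB machine, each case uses 10% CPU and 1 GB of memory.
print(max_parallel_cases(10, 1, 16))
```

Under that assumption CPU is the limiting resource and the bound is 10 cases, so a figure of at least 6 leaves headroom for the operating system and for cases that exceed the average.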
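2.2 From single-machine to distributed test execution
Given the five steps and the analysis above, the step that can be optimized through parallel execution is case execution. The others, such as lib installation and test environment installation, are essentially atomic units and hard to split.
Moving from a single machine to a distributed setup is therefore mainly a matter of splitting the set of cases to execute. Put simply, the only difference between the two is the input set of cases; the rest of the single-machine flow stays essentially unchanged. For test engineers the process is transparent: only the environment in which cases run switches from one machine to many.
The figure below outlines how cases move from a single machine to multiple machines (the 6-digit numbers are case IDs).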

2.3 Distributed execution logic
The logic has two parts: a local part and a part that runs on the distributed node machines. We wrap the distributed test execution process in a single Hadoop job.
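Local part:
1. Obtain computing resources. Here, computing resources means the number of available TaskTracker slots; this slot count is the denominator used when splitting cases.
2. Generate case lists from the computing resources. Given the slot count, the simplest splitting algorithm is: cases per node = total cases / slot count.
3. Run business-layer custom operations, for example fetching the programs or configuration the business layer needs during test execution, or pushing large dependent data to HDFS.
4. Configure the Hadoop job, including the input, the output, and the files or tar packages the job needs. The input here is the case list.
5. Run the test execution job. This is in fact a Hadoop job.
6. Send the report. Aggregate the results from every node and send out the report.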
The input to each TaskTracker's map task is a case list produced by the split. This is how the whole test execution step gets distributed to the TaskTrackers.
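The text does not say how the job is submitted. One possibility, shown here purely as an assumption, is a map-only Hadoop Streaming job whose input directory holds the case lists. The sketch only builds the command line; the jar location, HDFS paths, and script name are hypothetical placeholders.

```python
import os


def build_streaming_command(streaming_jar, input_dir, output_dir, mapper_script):
    """Return the argv list for a map-only Hadoop Streaming job."""
    return [
        "hadoop", "jar", streaming_jar,
        "-input", input_dir,      # HDFS directory holding the case-list files
        "-output", output_dir,    # HDFS directory for per-task output
        "-mapper", os.path.basename(mapper_script),
        "-file", mapper_script,   # ship the mapper script to the nodes
        "-numReduceTasks", "0",   # map-only job: no reduce phase
    ]


cmd = build_streaming_command(
    "/path/to/hadoop-streaming.jar",  # hypothetical; depends on the installation
    "/tests/case_lists",              # hypothetical HDFS input path
    "/tests/results",                 # hypothetical HDFS output path
    "run_cases.py",                   # hypothetical mapper script
)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run` as is, which avoids shell quoting issues.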
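Node part:
1. Prepare the case list. It is read from the map input.
2. Download the cases named in the case list. This works like case fetching in the local single-machine version; the source is still SVN or the case repository.
3. Install libs. Same as the single-machine version.
4. Install the test environment. Same as the single-machine version.
5. Execute the cases. Same as the single-machine version.
6. Push the report.
Hadoop also schedules retries according to the return value of each map task.
As the description above shows, on the Hadoop cluster nodes (TaskTrackers) the test execution logic is essentially the same as in the single-machine version, so the whole migration is fairly simple.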

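2.4 Architecture of the distributed test cluster
The whole distributed test execution relies on a shared computing cluster made up of two parts. One part is Hadoop-related: the Hadoop master and the TaskTracker service on the worker nodes. The other part is the shared environment, including the test framework and shared tools such as valgrind. The former is managed through the JobTracker; the latter is managed through a unified operations system whose job is essentially to install and maintain the shared environment.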

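3 Benefits
In our actual project practice, the benefits show up mainly in two ways:
1. Test execution time drops sharply. With 15 machines, every module whose tests used to take 1-2 hours now finishes within 10 minutes.
2. Machine resources are saved. Maintaining a shared cluster keeps every machine's CPU fully loaded and avoids the CPU waste of single-machine test execution.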
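4 Admission criteria and future directions
4.1 Admission criteria for distributed migration
Not every test execution can be distributed. From our hands-on experience we have summarized the following admission criteria, for reference:
1. It can run on a blank machine. A single master script can prepare the dependent environment, install libs, execute the test cases, and so on.
2. The test framework allows cases to run in parallel.
3. Business-layer cases have no fixed external dependencies, such as a hard-coded directory.
4. The server ports that business-layer cases depend on should preferably be random.
5. The business layer is not allowed to modify the shared environment.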
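For criterion 4, one common way to obtain a random server port is to bind to port 0 and let the operating system choose an unused one. The sketch below illustrates that general technique and is not taken from the scheme itself; the function name is made up.

```python
import socket


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # getsockname() returns (host, port); the port is the one the OS picked.
        return sock.getsockname()[1]


port = pick_free_port()
print(port)
```

The port is released when the socket closes, so another process could in principle claim it before the server under test starts. Where that matters, having the server itself bind to port 0 and report its port is more robust.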
4.2 Possible future technical directions
1. Split cases by execution time, replacing the split by case count.
2. Move from distributed test execution toward a cloud testing service.
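Splitting by execution time, the first direction above, could look like the following sketch. It uses a greedy longest-first heuristic chosen here for illustration; the text does not name an algorithm. It also assumes historical per-case durations are available, and the durations and names shown are invented.

```python
import heapq


def split_by_duration(durations, slot_count):
    """Greedy longest-first assignment: each case goes to the least-loaded slot."""
    heap = [(0.0, i) for i in range(slot_count)]  # (total seconds, slot index)
    heapq.heapify(heap)
    buckets = [[] for _ in range(slot_count)]
    # Longest cases first; ties are broken by case ID for a stable result.
    ordered = sorted(durations.items(), key=lambda kv: (-kv[1], kv[0]))
    for case_id, seconds in ordered:
        load, i = heapq.heappop(heap)
        buckets[i].append(case_id)
        heapq.heappush(heap, (load + seconds, i))
    return buckets


# Hypothetical historical durations in seconds.
durations = {"100001": 300, "100002": 120, "100003": 90, "100004": 60, "100005": 30}
print(split_by_duration(durations, 2))
```

With these numbers one slot gets the single 300-second case and the other gets the remaining four, which also total 300 seconds. A split by case count would instead put the three longest cases (510 seconds) in one slot and the two shortest (90 seconds) in the other.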