Understanding HDFS in Depth: The Hadoop Distributed File System
 
 2019-9-26
 
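Editor's note:
This article comes from CSDN. It describes in detail many of the concepts behind HDFS and how a cluster stores its data, and is a useful aid to understanding the Hadoop distributed file system.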

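1. Introduction

In a modern enterprise environment, a single machine often cannot hold the volume of data involved, so data has to be stored across machines. A file system that manages storage spread over a cluster in a unified way is called a distributed file system. Once a network is introduced, however, all the complexity of network programming comes with it; one challenge, for example, is how to guarantee that no data is lost when a node becomes unavailable.

The traditional Network File System (NFS) is also called a distributed file system, but it has limitations. Because NFS keeps files on a single machine, it cannot offer reliability guarantees, and when many clients access the NFS server at once the server easily becomes a performance bottleneck. In addition, to operate on a file in NFS a client must first sync it locally, and its changes are invisible to other clients until they are synced back to the server. In a sense NFS is not a typical distributed system, even though its files do live on a remote (single) server.

The NFS protocol stack shows that it is in fact an implementation of VFS (the operating system's abstraction over files).

HDFS, short for Hadoop Distributed File System, is one implementation of Hadoop's abstract file system. The Hadoop abstract file system can integrate with the local file system, Amazon S3, and others, and can even be operated over a web protocol (webhdfs). HDFS files are distributed across the machines of a cluster, with replicas providing fault tolerance and reliability. Client reads and writes, for example, go directly to machines spread across the cluster, so there is no single point of performance pressure.

If you are building a complete cluster from scratch, see [Detailed Steps for Setting Up a Hadoop Cluster (2.6.0)]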

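2. HDFS Design Principles

From the outset HDFS was designed with clear application scenarios in mind: there is fairly explicit guidance on which kinds of applications it suits and which it does not.

2.1 Design Goals

Storing very large files: "very large" here means hundreds of megabytes, gigabytes, or terabytes. Many clusters in production already store petabytes of data. According to the Hadoop website, Yahoo!'s Hadoop clusters have about 100,000 CPUs running on 40,000 machine nodes. For more on how Hadoop clusters are used around the world, see the Hadoop website.

Streaming data access: HDFS is built on the assumption that the most efficient data processing pattern is write once, read many times. A dataset is typically generated or copied from a source once, and then many analyses are run over it.

Each analysis usually reads most of the dataset, if not all of it, so the time to read the whole dataset matters more than the latency of reading the first record.

Running on commodity hardware: Hadoop does not need especially expensive, highly reliable machines; it runs on ordinary commodity machines (which can be bought from multiple vendors). Commodity does not mean low-end. In a cluster, especially a large one, the node failure rate is fairly high, and the goal of HDFS is to keep the cluster working through node failures without a noticeable interruption to users.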

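2.2 Applications HDFS Is Not Suited For

Some scenarios are a poor fit for storing data in HDFS. A few examples:

1) Low-latency data access

Applications that need latencies in the millisecond range are not a good fit for HDFS. HDFS is designed for high-throughput data transfer, possibly at the expense of latency. HBase is better suited to low-latency data access.

2) Lots of small files

File metadata (such as the directory structure, the list of nodes holding each file block, and the block-to-node mapping) is kept in the NameNode's memory, so the number of files in the whole file system is limited by the NameNode's memory size.

As a rule of thumb, each file, directory, and block takes about 150 bytes of metadata memory. One million files, each occupying one block, amount to two million objects and therefore need about 300 MB of memory. File counts in the billions are therefore hard to support on current commodity machines.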
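The rule of thumb above can be turned into a quick estimate. The following is a minimal sketch, assuming exactly 150 bytes per object and counting one object per file plus one per block; the class and method names are hypothetical and nothing here is part of Hadoop.

```java
/** Back-of-the-envelope NameNode memory estimate for file metadata. */
final class NameNodeMemoryEstimate {
    // Rule of thumb quoted in the text: about 150 bytes per file, directory, or block.
    static final long BYTES_PER_OBJECT = 150L;

    /** Estimated bytes for `files` files that each occupy `blocksPerFile` blocks. */
    static long estimateBytes(long files, long blocksPerFile) {
        long objects = files + files * blocksPerFile; // one file object plus its block objects
        return objects * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        long million = estimateBytes(1_000_000L, 1);
        long billion = estimateBytes(1_000_000_000L, 1);
        System.out.println(million / 1_000_000 + " MB for one million single-block files");
        System.out.println(billion / 1_000_000_000 + " GB for one billion single-block files");
    }
}
```

Under these assumptions a billion single-block files would need on the order of hundreds of gigabytes of heap, which is why such file counts are described as impractical.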

3) Multiple writers or arbitrary file modifications

HDFS writes data in append-only fashion. It does not support modifications at arbitrary offsets in a file, nor does it support multiple writers.

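3. HDFS Core Concepts

3.1 Blocks

Physical disks have the concept of a block: the disk's physical block is the smallest unit of disk operation, all reads and writes are made in units of blocks, and a block is typically 512 bytes. A file system adds another layer of abstraction on top of physical blocks: a file system block is an integer multiple of the physical disk block, usually a few kilobytes. Maintenance tools such as df and fsck operate at the file system block level.

HDFS blocks are much larger than those of an ordinary single-machine file system: 128 MB by default. Files in HDFS are split into block-sized chunks, and each chunk is stored as an independent unit. A file smaller than a block does not occupy the whole block, only its actual size. For example, a 1 MB file takes up only 1 MB of space in HDFS, not 128 MB.

Why are HDFS blocks so large?

To minimize the cost of seeks, by controlling the ratio of the time spent locating a block to the time spent transferring it. Suppose it takes 10 ms to locate a block and the disk transfer rate is 100 MB/s. To keep the seek time at 1% of the transfer time, the transfer must take 1 s, so the block size needs to be about 100 MB.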
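The arithmetic behind that figure can be written out as a small helper. This is only an illustrative sketch: the class name, method name, and the choice of integer units (milliseconds, bytes per second, percent) are assumptions made for the example.

```java
/** Computes how large a block must be for seek time to stay a small share of transfer time. */
final class BlockSizing {
    /**
     * Smallest block size in bytes for which the seek time is at most
     * seekPercent percent of the time needed to transfer the block.
     */
    static long minBlockBytes(long seekMillis, long bytesPerSecond, long seekPercent) {
        long transferMillis = seekMillis * 100 / seekPercent; // transfer must last at least this long
        return bytesPerSecond * transferMillis / 1000;
    }

    public static void main(String[] args) {
        // Values from the text: 10 ms seek, 100 MB/s transfer rate, 1% target ratio.
        long bytes = minBlockBytes(10, 100_000_000L, 1);
        System.out.println(bytes / 1_000_000 + " MB");
    }
}
```

With faster disks the same ratio calls for proportionally larger blocks, which is one way to read the 128 MB default.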

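However, if blocks are too large, a MapReduce job may end up with fewer map or reduce tasks than there are machines in the cluster, and the job will run inefficiently.

Benefits of the block abstraction

Splitting files into blocks lets a single file be larger than any one disk. The blocks that make up a file can be spread across the whole cluster; in theory, a single file could fill the disks of every machine in the cluster.

The block abstraction also simplifies the storage system: blocks need not carry permissions, owners, and so on (these are all handled at the file level).

Blocks are the unit of replication in the fault-tolerance and high-availability mechanisms; that is, data is replicated block by block.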

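3.2 Namenode & Datanode

An HDFS cluster consists of a Namenode and Datanodes in a master-worker pattern. The Namenode is responsible for building the namespace and managing file metadata, while the Datanodes store the actual data and serve reads and writes.

Namenode

The Namenode holds the file system tree and the metadata for all files and directories. The metadata is persisted in two forms:

namespace image
edit log

The persisted data does not, however, include the list of nodes on which each block is located, that is, which nodes in the cluster hold the blocks of a file. This information is rebuilt when the system restarts (from the block information reported by the Datanodes).

In HDFS the Namenode can become the cluster's single point of failure: when the Namenode is unavailable, the whole file system is unavailable. HDFS offers two mechanisms to deal with this:

1) Backing up the persisted metadata

The file system metadata is written to several file systems at once, for example to the local file system and to NFS. These backup writes are synchronous and atomic.

2) Secondary Namenode

The Secondary Namenode periodically merges the primary Namenode's namespace image and edit log by creating a checkpoint, which keeps the edit log from growing too large. It keeps a copy of the merged namespace image, which can be used to recover data if the Namenode fails completely.

[Figure: Secondary Namenode administration page]

The Secondary Namenode usually runs on a separate machine, because merging takes a lot of CPU and memory. Its state lags behind the Namenode's, so some data loss occurs when the Namenode fails completely. The usual course of action is to copy the metadata backup from NFS to the Secondary Namenode and run it as the new primary Namenode.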
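The idea of a checkpoint, replaying the edit log on top of the last namespace image to get a new image, can be illustrated with a toy model. This sketch is not Hadoop's on-disk format or code: the image is reduced to a set of paths, only two hypothetical operations (CREATE and DELETE) are modelled, and all names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of a checkpoint: old namespace image + edit log = new namespace image. */
final class CheckpointSketch {
    /** One logged namespace change; only CREATE and DELETE are modelled here. */
    static final class Edit {
        final String op;
        final String path;

        Edit(String op, String path) {
            this.op = op;
            this.path = path;
        }
    }

    /** Replays the edit log on a copy of the image and returns the merged image. */
    static Set<String> checkpoint(Set<String> image, List<Edit> editLog) {
        Set<String> merged = new TreeSet<>(image); // the old image is left untouched
        for (Edit e : editLog) {
            if (e.op.equals("CREATE")) {
                merged.add(e.path);
            } else if (e.op.equals("DELETE")) {
                merged.remove(e.path);
            } else {
                throw new IllegalArgumentException("unknown op: " + e.op);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Set<String> image = new TreeSet<>(List.of("/user/a", "/user/b"));
        List<Edit> log = new ArrayList<>();
        log.add(new Edit("CREATE", "/user/c"));
        log.add(new Edit("DELETE", "/user/a"));
        System.out.println(checkpoint(image, log));
    }
}
```

Once the merged image exists, the replayed portion of the log is no longer needed for recovery, which is how checkpointing keeps the edit log short.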

In an HA setup a hot standby can be run instead; after the active Namenode fails, it takes over as the active Namenode.

Datanode

Datanodes store and retrieve blocks. Read and write requests may come from the Namenode or directly from clients. Datanodes periodically report to the Namenode the blocks they are storing.

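3.3 Block Caching

A DataNode normally reads data straight from disk, but frequently used blocks can be cached in memory. By default a block is cached on only one DataNode, but this can be configured per file.

Job schedulers can exploit the cache to improve performance; MapReduce, for example, can run tasks on the nodes where a block is cached.

Users or applications can send cache directives to the NameNode (which files to cache, and for how long). Cache pools are used to manage permissions and resources for groups of cache directives.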

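3.4 HDFS Federation

As noted above, NameNode memory limits the number of files. HDFS Federation provides a way to scale the NameNode horizontally. In federation mode each NameNode manages a portion of the namespace: one NameNode might manage the files under /user, for example, and another those under /share.

Each NameNode manages a namespace volume, and all the volumes together make up the file system's metadata. Each NameNode also maintains a block pool, which holds information such as the block-to-node mapping. The NameNodes are independent of one another, so the failure of one does not make the files managed by the others unavailable.

Clients use a mount table to map file paths to NameNodes. The mount table is a layer wrapped around the group of NameNodes; this layer is itself a Hadoop file system implementation, accessed through the viewfs: scheme.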
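The mount-table lookup described above amounts to a longest-prefix match from path to NameNode. The sketch below illustrates that idea only; it is not how ViewFs is implemented, and the class name, method names, and NameNode URIs are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative client-side mount table: path prefix -> NameNode URI. */
final class MountTableSketch {
    private final Map<String, String> mounts = new TreeMap<>();

    void addMount(String prefix, String nameNodeUri) {
        mounts.put(prefix, nameNodeUri);
    }

    /** Returns the NameNode for the longest mount prefix that covers the path. */
    String resolve(String path) {
        String best = null;
        for (String prefix : mounts.keySet()) {
            // "/user" covers "/user" and "/user/x", but not "/username"
            boolean covers = path.equals(prefix) || path.startsWith(prefix + "/");
            if (covers && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no mount point for " + path);
        }
        return mounts.get(best);
    }

    public static void main(String[] args) {
        MountTableSketch table = new MountTableSketch();
        table.addMount("/user", "hdfs://namenode1");   // hypothetical addresses
        table.addMount("/share", "hdfs://namenode2");
        System.out.println(table.resolve("/user/alice/data.txt"));
        System.out.println(table.resolve("/share/datasets"));
    }
}
```

Because each prefix maps to an independent NameNode, losing the NameNode behind /share leaves lookups under /user unaffected.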

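3.5 HDFS HA

Within an HDFS cluster the NameNode is still a single point of failure (SPOF). Writing metadata to multiple file systems and having the Secondary NameNode checkpoint periodically help protect against data loss, but they do not improve availability.

This is because the NameNode is the only place responsible for file metadata and the file-to-block mapping. When it goes down, no job, MapReduce included, can read or write.

When the NameNode fails, the conventional approach is to start a new NameNode from a metadata backup. The backup may come from:

the copies made by writing to multiple file systems

the Secondary NameNode's checkpoint files

After the new NameNode starts, clients and DataNodes have to be reconfigured with its details. The restart also tends to be slow: on clusters of any size it often takes tens of minutes or even hours. The main reasons are:

1) Loading the metadata image file into memory takes a long time.

2) The edit log has to be replayed.

3) The NameNode must receive status reports from the DataNodes and meet the required conditions before it can leave safe mode and accept writes.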

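Hadoop's HA solution

An HA HDFS cluster is configured with two NameNodes, one in the Active state and one in Standby. When the active NameNode fails, the standby takes over and continues serving requests without a noticeable interruption to users. Failover typically takes from tens of seconds to a few minutes.

The main pieces of implementation logic that HA involves are:

1) The active and standby NameNodes share edit log storage.

The active NameNode and the standby NameNode share one edit log. On failover, the standby brings its state up to date by replaying the edit log.

There are usually two choices for the shared storage:

NFS: the traditional network file system

QJM: quorum journal manager

QJM was designed specifically for HDFS HA, to provide a highly available edit log. QJM runs a group of journal nodes, and each edit must be written to a majority of them. Three journal nodes are typical, which tolerates the failure of one node, much as in ZooKeeper. Note that QJM does not use ZooKeeper, although HDFS HA does use ZooKeeper to elect the active Namenode. QJM is generally the recommended choice.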
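The majority rule explains why three journal nodes tolerate exactly one failure. A minimal sketch of that arithmetic follows; it is not QJM code, and the class and method names are made up for illustration.

```java
/** Majority-quorum arithmetic for a group of journal nodes. */
final class JournalQuorum {
    /** Number of journal nodes that must accept an edit. */
    static int majority(int journalNodes) {
        return journalNodes / 2 + 1;
    }

    /** Number of journal nodes that may fail while writes can still reach a majority. */
    static int toleratedFailures(int journalNodes) {
        return journalNodes - majority(journalNodes);
    }

    /** An edit counts as written once a majority has acknowledged it. */
    static boolean isCommitted(int acks, int journalNodes) {
        return acks >= majority(journalNodes);
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 5; n++) {
            System.out.println(n + " journal nodes: majority " + majority(n)
                    + ", tolerated failures " + toleratedFailures(n));
        }
    }
}
```

Going from three nodes to four raises the majority without raising the number of tolerated failures, which is why odd group sizes are the usual choice.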

2) DataNodes must send block reports to both the active and the standby NameNode

Block mappings are kept in memory (not on disk). So that the new NameNode can take over quickly after the active NameNode dies, without waiting for block reports from the Datanodes, DataNodes send their block reports to both NameNodes.

3) Clients must be configured for failover (transparent to users)

Namenode failover is invisible to clients; it is handled by the client library. The HDFS URI that clients use in their configuration is a logical name that maps to a pair of Namenode addresses, and the client keeps trying each Namenode address until one succeeds.
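The retry behaviour of the client library can be pictured as a loop over the configured addresses. The sketch below is a simplified stand-in, not the Hadoop client's actual failover proxy: the `Call` interface, the bounded number of rounds, and the address strings are all assumptions made for the example.

```java
import java.io.IOException;
import java.util.List;

/** Illustrative client-side failover: try each NameNode address until a call succeeds. */
final class FailoverClientSketch {
    /** Stand-in for one RPC against a specific NameNode address. */
    interface Call<T> {
        T invoke(String nameNodeAddress) throws IOException;
    }

    /** Tries every address in turn, for at most maxRounds passes over the list. */
    static <T> T callWithFailover(List<String> addresses, int maxRounds, Call<T> call)
            throws IOException {
        IOException last = null;
        for (int round = 0; round < maxRounds; round++) {
            for (String address : addresses) {
                try {
                    return call.invoke(address);
                } catch (IOException e) {
                    last = e; // remember the failure and move on to the next address
                }
            }
        }
        throw new IOException("all NameNode addresses failed", last);
    }

    public static void main(String[] args) throws IOException {
        List<String> pair = List.of("nn1.example:8020", "nn2.example:8020"); // hypothetical
        String answer = callWithFailover(pair, 3, address -> {
            if (address.startsWith("nn1")) {
                throw new IOException(address + " is not active");
            }
            return "served by " + address;
        });
        System.out.println(answer);
    }
}
```

The application code only ever sees the logical name; which physical address answered is hidden inside the loop.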

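4) The standby replaces the Secondary NameNode

Without HA, HDFS runs a separate daemon as the Secondary Namenode, which checkpoints periodically by merging the image file and the edit log. With HA, the standby takes over this role.

If the standby Namenode happens to be down when the active Namenode fails, operators can still start the standby from scratch. This is still less work than without HA, and counts as an improvement, because the whole restart procedure is standardized and built into Hadoop, so operators need not carry out a complex switch-over by hand.

Namenode failover is managed by a failover controller. There are several failover controller implementations; the default one uses ZooKeeper to ensure that only one Namenode is in the active state.

Each Namenode runs a lightweight failover controller process, which uses a simple heartbeat mechanism to monitor whether its Namenode is alive and triggers a failover when the Namenode fails. Failover can also be triggered manually by operators, for example to switch the active Namenode during routine maintenance; this is called a graceful failover. A failover that is not triggered manually is called an ungraceful failover.

In an ungraceful failover there is no way to be sure that the failed node (or rather, the node judged to have failed) has actually stopped running; in other words, the previously active Namenode may still be running after failover is triggered. QJM allows only one Namenode to write to the edit log at a time, but the previously active Namenode could still serve read requests. Hadoop uses fencing to shut out the previous Namenode. Fencing revokes the previous Namenode's access to the shared edit log and disables its network port so that it can no longer serve requests. The STONITH technique can also be used to power off the previously active Namenode.

Finally, as already described, Namenode failover in the HA solution is invisible to clients and is handled mainly by the client library.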

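4. Command-Line Interface

HDFS can be accessed in many ways, for example through the Java API, over HTTP, or from the shell command line. Command-line interaction goes mainly through hadoop fs. For example:

hadoop fs -copyFromLocal # copy a file from the local file system to HDFS
hadoop fs -mkdir # create a directory
hadoop fs -ls # list files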

HadoopÖУ¬ÎļþºÍĿ¼µÄȨÏÞÀàËÆÓÚPOSIXÄ£ÐÍ£¬°üÀ¨¶Á¡¢Ð´¡¢Ö´ÐÐ3ÖÖȨÏÞ£º

¶ÁȨÏÞ£¨r£©£ºÓÃÓÚ¶ÁÈ¡Îļþ»òÕßÁгöĿ¼ÖеÄÄÚÈÝ

дȨÏÞ£¨w£©£º¶ÔÓÚÎļþ£¬¾ÍÊÇÎļþµÄдȨÏÞ¡£Ä¿Â¼µÄдȨÏÞÖ¸ÔÚ¸ÃĿ¼Ï´´½¨»òÕßɾ³ýÎļþ£¨Ä¿Â¼£©µÄȨÏÞ¡£

Ö´ÐÐȨÏÞ£¨x£©£ºÎļþûÓÐËùνµÄÖ´ÐÐȨÏÞ£¬±»ºöÂÔ¡£¶ÔÓÚĿ¼£¬Ö´ÐÐȨÏÞÓÃÓÚ·ÃÎÊÆ÷Ŀ¼ÏµÄÄÚÈÝ¡£

ÿ¸öÎļþ»òĿ¼¶¼ÓÐowner£¬group£¬modeÈý¸öÊôÐÔ£¬ownerÖ¸ÎļþµÄËùÓÐÕߣ¬groupΪȨÏÞ×é¡£mode

ÓÉËùÓÐÕßȨÏÞ¡¢ÎļþËùÊôµÄ×éÖÐ×éÔ±µÄȨÏÞ¡¢·ÇËùÓÐÕß·Ç×éÔ±µÄȨÏÞ×é³É¡£ÏÂͼ±íʾÆäËùÓÐÕßrootÓµÓжÁдȨÏÞ£¬supergroup×éµÄ×éÔ±ÓжÁȨÏÞ£¬ÆäËûÈËÓжÁȨÏÞ¡£

ÎļþȨÏÞÊÇ·ñ¿ªÆôͨ¹ýdfs.permissions.enabledÊôÐÔÀ´¿ØÖÆ£¬Õâ¸öÊôÐÔĬÈÏΪfalse£¬Ã»Óдò¿ª°²È«ÏÞÖÆ£¬Òò´Ë²»»á¶Ô¿Í»§¶Ë×öÊÚȨУÑ飬Èç¹û¿ªÆô°²È«ÏÞÖÆ£¬»á¶Ô²Ù×÷ÎļþµÄÓû§×öȨÏÞУÑé¡£ÌØÊâÓû§superuserÊÇNamenode½ø³ÌµÄ±êʶ£¬²»»áÕë¶Ô¸ÃÓû§×öȨÏÞУÑé¡£

×îºó¿´Ò»ÏÂlsÃüÁîµÄÖ´Ðнá¹û£º

Õâ¸ö·µ»Ø½á¹ûÀàËÆÓÚUnixϵͳϵÄlsÃüÁµÚÒ»À¸ÎªÎļþµÄmode£¬d±íʾĿ¼£¬½ô½Ó×Å3ÖÖȨÏÞ9λ¡£ µÚ¶þÀ¸ÊÇÖ¸ÎļþµÄ¸±±¾Êý£¬Õâ¸öÊýÁ¿Í¨¹ýdfs.replicationÅäÖã¬Ä¿Â¼ÔòʹÓÃ-±íʾûÓи±±¾Ò»Ëµ¡£ÆäËûÖîÈçËùÓÐÕß¡¢×é¡¢¸üÐÂʱ¼ä¡¢Îļþ´óС¸úUnixϵͳÖеÄlsÃüÁîÒ»Ö¡£
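The first column of the listing is just the nine permission bits written out symbolically. As a small sketch of that mapping (plain Java, independent of Hadoop's own FsPermission class; the class and method names are hypothetical):

```java
/** Renders a directory flag and nine permission bits the way ls prints them. */
final class ModeFormatter {
    /** Formats, for example, (false, 0644) as "-rw-r--r--". */
    static String format(boolean directory, int mode) {
        StringBuilder sb = new StringBuilder(directory ? "d" : "-");
        String flags = "rwx";
        // Bits 8..6 belong to the owner, 5..3 to the group, 2..0 to everyone else.
        for (int bit = 8; bit >= 0; bit--) {
            boolean set = (mode & (1 << bit)) != 0;
            sb.append(set ? flags.charAt(2 - bit % 3) : '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(false, 0644)); // owner read/write, group read, others read
        System.out.println(format(true, 0755));  // a directory that everyone may list and enter
    }
}
```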

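To check cluster status or browse the file tree, visit the HTTP server exposed by the Namenode, usually on port 50070 of the Namenode's machine.

5. Hadoop File Systems

Hadoop's notion of a file system, as mentioned earlier, is abstract, and HDFS is just one implementation. The implementations that Hadoop provides are:

[Figure: table of Hadoop file system implementations]

In brief: Local is the abstraction over the local file system; hdfs is the one we see most often; the two web variants (webhdfs and swebhdfs) expose file operations over HTTP. har is Hadoop's archive format: when there are very many files they can be packed into one large file, which effectively reduces the amount of metadata. viewfs is what we met in the HDFS Federation section, used to hide the details of multiple Namenodes from clients. ftp, as the name suggests, is implemented on top of the FTP protocol, translating file operations into FTP commands. s3a is an implementation backed by Amazon's cloud storage service, and azure is the implementation for Microsoft's cloud platform.

We have seen how to interact with HDFS from the command line, but there are many other ways to work with the file system. Java applications can use org.apache.hadoop.fs.FileSystem, for instance, and the other access methods are all wrappers around FileSystem. Here we focus on HTTP access.

The WebHDFS and SWebHDFS protocols expose the file system as HTTP operations. This is slower than the native Java client and is not suited to very large files. There are two ways to access HDFS over HTTP: directly, or through a proxy.

Direct access

[Figure: direct access over HTTP]

The Namenode and Datanodes run embedded web servers by default, that is, dfs.webhdfs.enabled defaults to true, and webhdfs talks to these servers. Metadata operations are handled by the namenode. File reads and writes are sent to the namenode first, which redirects the client to a datanode to read (or write) the actual data stream.

Through an HDFS proxy

[Figure: access through an HDFS proxy]

A proxy makes it possible to apply load balancing, bandwidth limits, or firewall rules. The proxy exposes WebHDFS over HTTP or HTTPS, corresponding to the webhdfs and swebhdfs URL schemes.

The proxy runs as a standalone daemon, independent of the namenode and datanodes. It is started with the httpfs.sh script and listens on port 14000 by default.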
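Both access paths speak the same REST API, whose URLs have the form http://HOST:PORT/webhdfs/v1/PATH?op=OPERATION. The sketch below only builds such a URL for the OPEN (read) operation and sends nothing over the network; the host names and file path are placeholders, and the helper class is hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Builds WebHDFS-style URLs for reading a file; no request is actually sent. */
final class WebHdfsUrls {
    /** Returns the URL of an OPEN request for an absolute HDFS path. */
    static URI openUri(String host, int port, String hdfsPath) {
        if (!hdfsPath.startsWith("/")) {
            throw new IllegalArgumentException("absolute HDFS path required: " + hdfsPath);
        }
        try {
            return new URI("http", null, host, port, "/webhdfs/v1" + hdfsPath, "op=OPEN", null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Direct access: the namenode's embedded web server (port 50070 in the text).
        System.out.println(openUri("namenode", 50070, "/user/hadoop/data.txt"));
        // Proxy access: the HttpFS daemon (port 14000 in the text).
        System.out.println(openUri("httpfs-host", 14000, "/user/hadoop/data.txt"));
    }
}
```

With direct access, a GET on the first URL is answered by a redirect to a datanode, so the client must be able to reach the datanodes as well; with the proxy, only the proxy host needs to be reachable.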

Besides using FileSystem directly, the command line, and HTTP, there are also the C API, NFS, FUSE, and other access methods, which are not covered here.

6. Java Interface

In real applications most HDFS operations still go through FileSystem. This section covers the relevant interfaces, focusing on the HDFS implementation class DistributedFileSystem and related classes.

6.1 Reading Data

Data can be read through a URL, or directly through FileSystem.

Reading data from a Hadoop URL

The java.net.URL class provides a uniform abstraction for locating resources. Anyone can define a URL scheme and supply a handler class to do the actual work; the hdfs scheme is one such implementation.

InputStream in = null;
try {
    // requires the hdfs scheme handler to be registered first
    in = new URL("hdfs://master/user/hadoop").openStream();
    IOUtils.copyBytes(in, System.out, 4096, false); // copy the stream to standard output
} finally {
    IOUtils.closeStream(in);
}

To use a custom scheme, a URLStreamHandlerFactory must be set. This can be done only once per JVM (a second attempt fails), so it is usually done in a static block.

[Figure: screenshot of a usage example]

Reading data with the FileSystem API

1) First obtain a FileSystem instance, usually with one of the static get factory methods:

public static FileSystem get(Configuration conf) throws IOException
public static FileSystem get(URI uri, Configuration conf) throws IOException
public static FileSystem get(URI uri, Configuration conf, String user) throws IOException, InterruptedException

For a local file, getLocal returns the local file system object:

public static LocalFileSystem getLocal(Configuration conf) throws IOException

2) Call FileSystem's open method to get an input stream:

public FSDataInputStream open(Path f) throws IOException
public abstract FSDataInputStream open(Path f, int bufferSize) throws IOException

By default open uses a 4 KB buffer; a different size can be set as needed.

3) Use FSDataInputStream to work with the data

FSDataInputStream is a specialization of java.io.DataInputStream that adds random access and positioned reads:

public class FSDataInputStream extends DataInputStream
    implements Seekable, PositionedReadable,
    ByteBufferReadable, HasFileDescriptor, CanSetDropBehind, CanSetReadahead,
    HasEnhancedByteBufferAccess

Random access is defined by the Seekable interface:

public interface Seekable {
    void seek(long pos) throws IOException;
    long getPos() throws IOException;
}

seek is an expensive operation and should be used sparingly.

Positioned reads are defined by the PositionedReadable interface:

public interface PositionedReadable {
    public int read(long position, byte[] buffer, int offset, int length) throws IOException;
    public void readFully(long position, byte[] buffer, int offset, int length) throws IOException;
    public void readFully(long position, byte[] buffer) throws IOException;
}

6.2 Writing Data

In HDFS, files are created with the create method of the FileSystem class and its overloads. create returns an FSDataOutputStream; its getPos method reports the current position in the file, but seeking is not possible, because HDFS supports only appending.

A Progressable callback can be passed to create to receive progress notifications.

The append(Path f) method appends to an existing file, but not every implementation provides it; the file system implementations for Amazon S3, for example, do not support append.

An example:

String localSrc = args[0];
String dst = args[1];
InputStream in = new BufferedInputStream(new FileInputStream(localSrc));
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(URI.create(dst), conf);
OutputStream out = fs.create(new Path(dst), new Progressable() {
    public void progress() {
        System.out.print("."); // called periodically as data is written
    }
});
IOUtils.copyBytes(in, out, 4096, true); // true: close both streams when done

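6.3 Directory Operations

The mkdirs() method creates a directory along with any missing parent directories.

File metadata in HDFS is encapsulated in the FileStatus class, which includes the length, block size, replication, modification time, owner, and permissions. FileSystem's getFileStatus method returns a FileStatus, and the exists() method tests whether a file or directory exists.

To list files, use the listStatus method, which returns information about a file or about the entries of a directory:

public abstract FileStatus[] listStatus(Path f) throws FileNotFoundException,
    IOException;

When the Path is a file, the result is an array of length 1. The stat2Paths method of FileUtil converts an array of FileStatus objects into an array of Path objects.

globStatus matches file paths against a wildcard pattern:

public FileStatus[] globStatus(Path pathPattern) throws IOException

PathFilter allows custom filtering on file names; it cannot filter on file attributes, and is similar to java.io.FileFilter. A typical use is to exclude the paths that match a given regular expression. The interface is:

public interface PathFilter {
    boolean accept(Path path);
}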
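A regex-excluding filter of the kind mentioned above could look roughly like the following. To keep the example runnable without Hadoop on the classpath, it uses a local stand-in interface over String paths instead of org.apache.hadoop.fs.PathFilter and Path; the stand-in, the class name, and the sample paths are all assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Sketch of a filter that rejects paths matching a regular expression. */
final class RegexExcludeSketch {
    /** Local stand-in for PathFilter, operating on path strings. */
    interface NameFilter {
        boolean accept(String path);
    }

    /** Accepts every path except those that fully match the given regex. */
    static NameFilter excluding(String regex) {
        Pattern pattern = Pattern.compile(regex);
        return path -> !pattern.matcher(path).matches();
    }

    /** Returns the paths accepted by the filter, in their original order. */
    static List<String> filter(List<String> paths, NameFilter filter) {
        List<String> kept = new ArrayList<>();
        for (String path : paths) {
            if (filter.accept(path)) {
                kept.add(path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> paths = List.of("/2007/12/30", "/2007/12/31", "/2008/01/01");
        System.out.println(filter(paths, excluding("^.*/2007/12/31$")));
    }
}
```

With the real API, the same `accept` logic would sit in a class implementing PathFilter and be passed as the second argument to globStatus or listStatus.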

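6.4 Deleting Data

Use FileSystem's delete() method:

public boolean delete(Path f, boolean recursive) throws IOException;

The recursive parameter is ignored when f is a file. If f is a directory and recursive is true, the whole directory is deleted; if f is a non-empty directory and recursive is false, an exception is thrown.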

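7. Data Flow (Read and Write Paths)

This section describes in detail how HDFS reads and writes data, along with some concepts related to the consistency model.

7.1 Reading a File

Roughly, a file read proceeds as follows:

1) The client passes a file Path to FileSystem's open method.

2) DistributedFileSystem (DFS) makes an RPC to the Namenode to get the datanode addresses for the first few blocks of the file. The Namenode decides which nodes to return based on the network topology (only nodes that hold a replica of the block qualify). If the client is itself a Datanode and holds a replica of the block, it reads from the local node.

3) The client reads data through the FSDataInputStream returned by open (by calling read).

4) DFSInputStream (which the FSDataInputStream wraps) connects to the closest node holding the first block, and data is read by calling read repeatedly.

5) When the first block has been read, the best datanode for the next block is located and reading continues. If necessary, DFSInputStream contacts the Namenode for the locations of the next batch of blocks (kept in memory, not persisted). All this addressing is invisible to the client.

6) When reading is finished, the client calls close on the stream.

If communication with a Datanode fails during a read, DFSInputStream tries the next-best node for that block and remembers the failed node, so that it is not contacted again for later blocks.

After reading a block, DFSInputStream verifies its checksum. If the block is corrupt, it tries to read from another node and reports the corrupt block to the Namenode.

Which datanode a client contacts is guided by the namenode. This lets HDFS support a large number of concurrent clients, and the namenode spreads the traffic across the cluster as evenly as it can.

Block locations are held in the namenode's memory, so location requests are served very efficiently and do not become a bottleneck.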

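7.2 Writing a File

Step by step:

1) The client calls DistributedFileSystem's create method.

2) DistributedFileSystem makes an RPC to the Namenode to create a new file in the file system namespace; at this point the file has no blocks associated with it. The Namenode performs a number of checks along the way, such as whether a file of the same name already exists and whether the client has permission. If the checks pass, an FSDataOutputStream is returned; if not, an exception is thrown to the client.

3) As the client writes data, DFSOutputStream splits it into packets and places them on a data queue, which is consumed by the DataStreamer.

4) The DataStreamer asks the Namenode to allocate datanodes for storing the new block. These nodes hold the replicas of the same block and form a pipeline. The DataStreamer writes each packet to the first node in the pipeline; the first node stores the packet and forwards it to the next node, which stores it and passes it on in turn.

5) DFSOutputStream also maintains an ack queue of packets waiting to be acknowledged by the datanodes. A packet is removed from the ack queue once every datanode in the pipeline has acknowledged it.

6) When the data has been written, the client closes the output stream. This flushes all remaining packets into the pipeline and waits for the datanodes' acknowledgments. Once everything has been acknowledged, the Namenode is told that the file is complete. The Namenode already knows all of the file's blocks (because the DataStreamer asked it to allocate them), so it only has to wait for the minimum replication requirement to be met before returning success to the client.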
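The interplay of the data queue and the ack queue in steps 3 to 5 can be modelled with two in-memory queues. This is a single-threaded toy, not the DFSOutputStream implementation: packets are plain integers, acknowledgments are counted for the oldest in-flight packet only, and every name is invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of the client-side data queue and ack queue of a write pipeline. */
final class WritePipelineSketch {
    private final Deque<Integer> dataQueue = new ArrayDeque<>(); // packets waiting to be sent
    private final Deque<Integer> ackQueue = new ArrayDeque<>();  // packets sent but not fully acknowledged
    private final int pipelineSize;                              // number of datanodes in the pipeline
    private int acksForHead = 0;                                 // acks received for the oldest in-flight packet

    WritePipelineSketch(int pipelineSize) {
        this.pipelineSize = pipelineSize;
    }

    /** Client side: enqueue a packet for sending. */
    void write(int packetId) {
        dataQueue.addLast(packetId);
    }

    /** Streamer side: send the next packet and park it in the ack queue. */
    boolean sendNext() {
        Integer packet = dataQueue.pollFirst();
        if (packet == null) {
            return false;
        }
        ackQueue.addLast(packet);
        return true;
    }

    /** One datanode acknowledged the oldest in-flight packet. */
    void ackFromDatanode() {
        if (ackQueue.isEmpty()) {
            throw new IllegalStateException("no packet in flight");
        }
        if (++acksForHead == pipelineSize) { // every datanode has confirmed
            ackQueue.removeFirst();
            acksForHead = 0;
        }
    }

    int pending() {
        return dataQueue.size();
    }

    int inFlight() {
        return ackQueue.size();
    }
}
```

Keeping a sent packet in the ack queue until all datanodes confirm it is what would allow it to be resent if a datanode in the pipeline failed midway.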

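How does the Namenode decide which Datanodes store the replicas?

HDFS's replica placement strategy is a trade-off between reliability, write bandwidth, and read bandwidth. The default strategy is:

The first replica is placed on the same machine as the client. If the client is outside the cluster, a node is picked at random (while trying to avoid nodes that are too full or too busy).

The second replica is placed on a randomly chosen rack different from that of the first replica.

The third replica is placed on the same rack as the second, but on a different node chosen at random from those that qualify.

Further replicas are placed on random nodes across the cluster, while trying to avoid putting too many replicas on the same rack.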
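The first three rules of the default strategy can be sketched as follows. This is a simplification for illustration, not Hadoop's block placement code: the topology is a plain map from rack name to node names, the writer is assumed to run on a cluster node, load and free space are ignored, and the sketch assumes at least two racks and at least two nodes on the chosen remote rack.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Simplified placement of three replicas following the rules described above. */
final class ReplicaPlacementSketch {
    /** Returns three "rack/node" targets for a block written from writerRack/writerNode. */
    static List<String> chooseTargets(Map<String, List<String>> topology,
                                      String writerRack, String writerNode, Random random) {
        List<String> targets = new ArrayList<>();
        targets.add(writerRack + "/" + writerNode);            // 1st replica: the writer's own node

        List<String> otherRacks = new ArrayList<>(topology.keySet());
        otherRacks.remove(writerRack);
        String remoteRack = otherRacks.get(random.nextInt(otherRacks.size()));

        List<String> remoteNodes = new ArrayList<>(topology.get(remoteRack));
        String second = remoteNodes.remove(random.nextInt(remoteNodes.size()));
        targets.add(remoteRack + "/" + second);                // 2nd replica: random node on another rack

        String third = remoteNodes.get(random.nextInt(remoteNodes.size()));
        targets.add(remoteRack + "/" + third);                 // 3rd replica: same rack as 2nd, different node
        return targets;
    }

    public static void main(String[] args) {
        Map<String, List<String>> topology = Map.of(
                "rack1", List.of("n1", "n2"),
                "rack2", List.of("n3", "n4"),
                "rack3", List.of("n5", "n6"));
        System.out.println(chooseTargets(topology, "rack1", "n1", new Random()));
    }
}
```

Any result places the block on exactly two racks, so losing a whole rack never loses all three replicas.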

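Once the replica locations are chosen, the write pipeline is built with the network topology taken into account.

[Figure: a possible replica placement]

This choice strikes a good balance between reliability and read/write performance:

Reliability: blocks are spread over two racks.

Write bandwidth: the write pipeline crosses only one switch.

Read bandwidth: reads can be served from either of two racks.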

7.3 Consistency Model

The consistency model describes the visibility of reads and writes in a file system. In HDFS, once a file is created it is visible in the file system namespace:

Path p = new Path("p");
fs.create(p);
assertThat(fs.exists(p), is(true));

However, content written to the file is not guaranteed to be visible, even if the stream has been flushed:

```java
Path p = new Path("p");
OutputStream out = fs.create(p);
out.write("content".getBytes("UTF-8"));
out.flush();
assertThat(fs.getFileStatus(p).getLen(), is(0L)); // still 0, even though flush was called
```

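To force data out to the Datanodes, use FSDataOutputStream's hflush method, which pushes the buffered data to the datanodes.

After hflush returns, HDFS guarantees that all data written to the file up to that point has reached every datanode in the write pipeline.

```java
Path p = new Path("p");
FSDataOutputStream out = fs.create(p);
out.write("content".getBytes("UTF-8"));
out.hflush(); // hflush, not flush, makes the data visible to new readers
assertThat(fs.getFileStatus(p).getLen(), is(((long) "content".length())));
```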

Closing the stream calls hflush internally. hflush does not guarantee that the datanodes have written the data to disk, only that it is in the datanodes' memory, so data can still be lost in a power failure. To guarantee that data reaches disk, use hsync. hsync is similar to the fsync() system call, which commits the buffered data for a file descriptor:

FileOutputStream out = new FileOutputStream(localFile);
out.write("content".getBytes("UTF-8"));
out.flush(); // flush to the operating system
out.getFD().sync(); // sync to disk
assertThat(localFile.length(), is(((long) "content".length())));

Calling hflush or hsync reduces throughput, so application design involves a trade-off between throughput and data robustness.

Also, while a file is being written, the block currently being written is not visible to other readers.

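7.4 Hadoop Node Distance

During reads and writes the namenode takes the distance between nodes into account when choosing Datanodes. HDFS does not measure distance by bandwidth, because in practice the bandwidth between two machines is hard to measure accurately.

Hadoop arranges the machines' topology as a tree and takes the distance between two nodes to be the sum of the hops from each to their closest common ancestor. This is in fact an example of a distance metric.

[Figure: example of computing the distance between nodes]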
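The tree distance described above is easy to compute from location strings. The sketch below assumes locations are written as /datacenter/rack/node, a notation chosen for this example; the class is hypothetical and is not Hadoop's NetworkTopology.

```java
/** Tree distance between two locations written as "/datacenter/rack/node". */
final class NetworkDistance {
    /** Sum of the hops from each location up to their closest common ancestor. */
    static int distance(String a, String b) {
        String[] pa = a.split("/");
        String[] pb = b.split("/");
        int common = 0; // number of leading path components the two locations share
        while (common < pa.length && common < pb.length && pa[common].equals(pb[common])) {
            common++;
        }
        return (pa.length - common) + (pb.length - common);
    }

    public static void main(String[] args) {
        System.out.println(distance("/d1/r1/n1", "/d1/r1/n1")); // 0: same node
        System.out.println(distance("/d1/r1/n1", "/d1/r1/n2")); // 2: same rack
        System.out.println(distance("/d1/r1/n1", "/d1/r2/n3")); // 4: same data center, different rack
        System.out.println(distance("/d1/r1/n1", "/d2/r3/n4")); // 6: different data centers
    }
}
```

A smaller distance is taken to mean more available bandwidth, so the namenode prefers the replica with the smallest distance to the reader.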

The topology of a Hadoop cluster must be configured by hand. If it is not configured, Hadoop assumes by default that all nodes are on the same rack in the same data center.

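8 Operational Tools

8.1 Parallel Copying with distcp

So far the focus has been on single-threaded access; to process files in parallel you would have to write a program yourself. Hadoop's distcp tool copies data into or out of Hadoop in parallel. Some examples:

hadoop distcp file1 file2 # can serve as an efficient replacement for fs -cp
hadoop distcp dir1 dir2
hadoop distcp -update dir1 dir2 # -update syncs only the files that have changed and leaves the rest untouched

distcp is implemented as a MapReduce job with map tasks only and no reducers; the files are copied in parallel by the maps. distcp tries to distribute the files evenly among the maps, and the number of maps can be set with the -m option. A typical invocation:

hadoop distcp -update -delete -p hdfs://master1:9000/foo hdfs://master2/foo

This kind of operation is commonly used to copy data between two clusters. -update syncs only the data that has changed, -delete removes files that exist in the destination directory but not in the source, and -p preserves file attributes such as permissions, block size, and replication.

If the two clusters run incompatible versions of Hadoop, the webhdfs protocol can be used:

hadoop distcp webhdfs://namenode1:50070/foo webhdfs://namenode2:50070/foo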

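8.2 Balancing an HDFS Cluster

If distcp is run with the number of maps set to 1, not only is the copy slow, but the first replica of every block lands on the node running that single map, until its disks fill up. It is therefore best to run distcp with the default number of maps, which is 20.

HDFS works best when blocks are spread evenly across the nodes. If a job cannot keep the cluster balanced, for example because the number of maps was limited (so that other nodes remain available to other jobs), the balancer tool can be used to even out the block distribution.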

 

   