Editor's note:
This article covers HDFS, YARN, MapReduce, and setting up Hadoop.
It originally appeared on Cnblogs and was edited and recommended by Anna of Huolongguo Software.

Hadoop
Hadoop is an open-source, general-purpose big data processing platform. It provides three components: HDFS (distributed file system), YARN (distributed resource scheduling), and MapReduce (distributed offline computation).
MapReduce suits large-scale data processing where real-time response is not required; it is a poor fit for large numbers of small files or files that are modified frequently.
Characteristics of Hadoop
1. Horizontal scaling: a Hadoop cluster can reach thousands of nodes, nodes can be added and removed dynamically, and the cluster can store and process petabytes of data.
2. Low cost: it does not depend on high-end hardware; ordinary PCs are enough to run it.
In practice, HDFS is typically used for file storage and YARN for resource management.
1. HDFS
HDFS is a distributed file system that can store massive numbers of files.
HDFS consists of NameNode, DataNode, and SecondaryNameNode nodes.

1.1 About Blocks
A Block is the smallest storage unit in HDFS; the default Block size is 128 MB.
A large file is split into multiple Blocks for storage. If a file is smaller than the Block size, the Block only occupies as much space as the file itself.
Each Block is replicated across different DataNode nodes.
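The splitting rule above can be sketched in plain Java (an illustration only, not the HDFS API; the class and method names are invented for this example):

```java
// Illustration of HDFS Block splitting: a file is cut into 128 MB Blocks,
// and the last Block only occupies the remaining bytes.
public class BlockSplit {
    static final long BLOCK_SIZE = 128L * 1024 * 1024; // default Block size

    /** Returns the size each Block of a file occupies on disk. */
    static long[] split(long fileLength) {
        int full = (int) (fileLength / BLOCK_SIZE);
        long rest = fileLength % BLOCK_SIZE;
        long[] blocks = new long[full + (rest > 0 ? 1 : 0)];
        for (int i = 0; i < full; i++) blocks[i] = BLOCK_SIZE;
        if (rest > 0) blocks[blocks.length - 1] = rest; // smaller than BLOCK_SIZE
        return blocks;
    }

    public static void main(String[] args) {
        // A 300 MB file: two full 128 MB Blocks plus one 44 MB Block.
        for (long b : split(300L * 1024 * 1024)) System.out.println(b);
    }
}
```

For example, a 300 MB file occupies two full Blocks and a third Block of only 44 MB.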
1.2 DataNode
DataNode nodes store Blocks and handle data reads, writes, and replication.
When a DataNode starts, it reports the Blocks it currently stores to the NameNode.
1.3 NameNode
The NameNode stores file metadata and the mapping between files, Blocks, and DataNodes.
All of the NameNode's runtime data is kept in memory, so the number of files the whole HDFS cluster can store is limited by the NameNode's memory size.
The NameNode periodically persists its data to disk files (file metadata only); the file-to-Block and Block-to-DataNode mappings are not persisted, but are rebuilt from DataNode reports at startup and maintained at runtime.
DataNodes send periodic heartbeats to the NameNode. If the NameNode receives no heartbeat from a DataNode within a certain time, it considers that node dead and stops assigning it any I/O requests.
Each Block corresponds to one record in the NameNode, so large numbers of small files consume large amounts of memory; HDFS is therefore suited to storing large files.
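To see why small files are a problem, here is a back-of-the-envelope sketch; the roughly 150 bytes per record figure is a commonly cited approximation, not an exact Hadoop constant, and the class is invented for the example:

```java
// Rough estimate of NameNode memory: one record per file plus one per Block,
// at an assumed ~150 bytes per record.
public class NameNodeMemory {
    static final long BYTES_PER_RECORD = 150; // assumed rough cost per record

    /** Estimated NameNode bytes for the given file count and Blocks per file. */
    static long estimateBytes(long files, long blocksPerFile) {
        return (files + files * blocksPerFile) * BYTES_PER_RECORD;
    }

    public static void main(String[] args) {
        // 1 GB as one file of eight 128 MB Blocks: 9 records, ~1.35 KB.
        System.out.println(estimateBytes(1, 8));
        // 1 GB as 1,000,000 tiny files: 2,000,000 records, ~300 MB.
        System.out.println(estimateBytes(1_000_000, 1));
    }
}
```

The same gigabyte of data costs the NameNode orders of magnitude more memory when stored as many small files.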
1.4 SecondaryNameNode
The SecondaryNameNode periodically synchronizes with the NameNode, merging its edit log into the fsimage (checkpointing). Despite the name, it is a checkpoint helper, not a hot-standby NameNode for HA.
Writing a file to HDFS

1. The HDFS Client asks the NameNode to write a file.
2. Based on the file size, the NameNode returns the BlockIds to write and a list of DataNode nodes, and stores the file's metadata along with the mapping between the file, its Blocks, and the DataNodes.
3. After receiving the NameNode's reply, the HDFS Client writes the data to the designated DataNodes in turn. Each DataNode writes the data it receives to a disk file and then forwards it to the other DataNodes for replication (replication factor minus 1 DataNode nodes).
4. During replication, each DataNode acknowledges back to the previous DataNode once it has received the data; finally, the first DataNode returns success to the HDFS Client.
5. When the HDFS Client receives the DataNodes' acknowledgments, it sends a final confirmation request to the NameNode; only then does the NameNode commit the file.
If a DataNode fails during replication, the NameNode finds another DataNode to continue the copy, guaranteeing data reliability.
The file becomes visible only after the final confirmation request is sent to the NameNode; if the NameNode goes down before the final confirmation request is sent, the file is lost.
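Steps 3 and 4 form a replication pipeline. The toy model below (invented names, not Hadoop code) shows how data flows forward through the DataNodes while acknowledgments flow back upstream:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the HDFS write pipeline: the client hands data to the first
// DataNode, each node forwards to the next, and acks travel back in reverse.
public class WritePipeline {
    static List<String> log = new ArrayList<>();

    static void write(List<String> pipeline, int i, String data) {
        log.add(pipeline.get(i) + " stores " + data);
        if (i + 1 < pipeline.size()) write(pipeline, i + 1, data); // forward downstream
        log.add(pipeline.get(i) + " acks");                        // ack flows back upstream
    }

    public static void main(String[] args) {
        write(Arrays.asList("dn1", "dn2", "dn3"), 0, "block_0");
        log.forEach(System.out::println);
    }
}
```

Every node stores before the acks unwind, so the first DataNode acknowledges last, matching step 4 above.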
Reading a file from HDFS

1. The HDFS Client asks the NameNode to read a file.
2. The NameNode returns all of the file's BlockIds and, for each BlockId, the list of DataNodes holding it (including replica nodes).
3. The HDFS Client prefers reading Blocks from a local DataNode; otherwise it reads from a replica node over the network.
Rack Awareness

A distributed cluster usually contains a great many machines. Limited by rack slots and switch ports, large clusters typically span several racks, with machines on multiple racks together forming one cluster.
Network speed between machines within a rack is usually higher than between machines on different racks, and cross-rack traffic is usually constrained by the bandwidth of the upstream switches.
Hadoop does not enable rack awareness by default, so each Block is assigned to DataNodes at random. With rack awareness enabled, the NameNode loads the machine-to-rack mapping into memory at startup, and when an HDFS Client asks to write a file, DataNodes can be assigned sensibly according to the predefined rack topology.
Default placement policy for a Block's 3 replicas under rack awareness
The 1st replica is placed on the DataNode on the same node as the HDFS Client (chosen at random if the client is outside the cluster).
The 2nd replica is placed on a node in a different rack from the first node.
The 3rd replica is placed on another node in the same rack as the 2nd replica; any further replicas are placed on random nodes in the cluster.
This policy ensures that file access can first be satisfied within the local rack, while a replica of the Block can still be found on another rack even if an entire rack fails.
2. YARN
YARN is a distributed resource scheduling framework composed of the ResourceManager, NodeManagers, and ApplicationMasters.
2.1 ResourceManager
The ResourceManager is the cluster's resource manager: it allocates and schedules cluster resources, manages the NodeManagers, and handles clients' job requests.
2.2 NodeManager
A NodeManager manages a single node and handles requests from the ResourceManager and ApplicationMasters.
2.3 ApplicationMaster
The ApplicationMaster computes the resources a job needs.
2.4 How a job runs on YARN

1. The client submits a job request to the ResourceManager.
2. The ResourceManager spawns an ApplicationManager process to manage the job.
3. The ApplicationManager creates a Container to hold the resources the job needs.
4. The ApplicationManager picks a NodeManager and starts an ApplicationMaster on it to manage and monitor the job.
5. The ApplicationMaster registers with the ResourceManager, computes the resources the job needs, and reports them to the ResourceManager (CPU and memory).
6. The ResourceManager allocates resources for the job, packaged in Containers.
7. The ApplicationMaster tells the relevant NodeManagers in the cluster to execute the job.
8. Each NodeManager takes its resources from a Container and runs the Map and Reduce tasks.
3. MapReduce
MapReduce is a distributed offline computation framework. Its principle is to split the data into multiple pieces and process them in parallel on multiple nodes.
MapReduce execution flow
MapReduce consists of two parts: Map tasks and Reduce tasks.
3.1 Map tasks
1. Read the file's contents and parse them into Key/Value pairs (the Key is the byte offset, the Value is the line's data).
2. Override the map method to produce new Keys and Values.
3. Partition the output Keys and Values.
4. Group the data by Key, collecting values with the same key into one collection (data aggregation).
The input files must reside in HDFS.
3.2 Reduce tasks
1. The outputs of multiple Map tasks are copied over the network to different reduce nodes according to their partitions.
2. The outputs of the multiple Map tasks are merged and sorted.
3. The reduce output is saved to files stored in HDFS.
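The Map and Reduce phases above can be imitated in memory with plain Java collections (a sketch of the data flow only, not the Hadoop API):

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory sketch of map -> group-by-key -> reduce, using word counting.
public class WordCount {
    /** Map phase: each line becomes (word, 1) pairs. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : line.split("\\s+"))
            if (!w.isEmpty()) out.add(new AbstractMap.SimpleEntry<>(w, 1));
        return out;
    }

    /** Shuffle + Reduce phase: group values by key and sum them. */
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by key, like Reduce input
        for (Map.Entry<String, Integer> p : pairs)
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : new String[]{"hello hadoop", "hello hdfs"})
            pairs.addAll(map(line));
        System.out.println(reduce(pairs)); // {hadoop=1, hdfs=1, hello=2}
    }
}
```

In real MapReduce, the map output is partitioned and shipped across the network to multiple reduce nodes; here everything runs in a single JVM to show just the key/value flow.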
4. Setting Up Hadoop
4.1 Installation
1. Hadoop is written in Java, so a JDK must be installed.
2. Download Hadoop 2.X from CDH and unpack it. CDH is Cloudera's integration and optimization of various open-source frameworks (relatively stable).

4.2 Configuration
1. Environment configuration
Edit etc/hadoop/hadoop-env.sh and set JAVA_HOME (this file holds the environment variables Hadoop loads at startup).
Edit /etc/hosts and add the hostname-to-IP mappings.

2. Configure Hadoop's common properties (core-site.xml)
<configuration>
  <!-- Hadoop working directory, used for temporary data generated at runtime -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/hadoop-2.9.0/data</value>
  </property>
  <!-- NameNode communication address; 1.x defaults to port 9000, 2.x can use 8020 -->
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.80:8020</value>
  </property>
</configuration>
3. Configure HDFS (hdfs-site.xml)
<configuration>
  <!-- Replication factor per Block (each Block is copied to replication factor - 1 additional DataNodes) -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <!-- Disable HDFS permission checking -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
4. Configure YARN (yarn-site.xml)
<configuration>
  <!-- Enable the shuffle auxiliary service so Reduce tasks can fetch Map output -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
5. Configure MapReduce (mapred-site.xml)
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
6. Configure SSH
Since starting HDFS and YARN requires authenticating the user, it is convenient to set up passwordless SSH login.
# Generate a key pair
ssh-keygen -t rsa
# Copy the public key to this machine
ssh-copy-id 192.168.1.80
4.3 Starting HDFS
1. Format the NameNode
bin/hdfs namenode -format

2. Start HDFS

Starting HDFS launches three processes: NameNode, DataNode, and SecondaryNameNode.

If errors occur at startup, check the corresponding log files in the logs directory.
3. Access the HDFS web UI
Once HDFS has started, visit http://localhost:50070 for the HDFS web management UI, where you can monitor the whole HDFS cluster and upload and download files.

When downloading a file, the request is redirected, and the host of the redirect URL is the NameNode's hostname, so the client's local hosts file must map the NameNode hostname to its IP.
4.4 Starting YARN

Starting YARN launches the ResourceManager and NodeManager processes.

Visit http://localhost:8088 for the YARN web management UI, where you can check job execution and resource allocation.

4.5 Operating HDFS from the Shell
Like Linux, HDFS has a / root directory.
# Print a file's contents
bin/hadoop fs -cat <src>
# Upload a local file to HDFS
bin/hadoop fs -copyFromLocal <localsrc> <dst>
# Upload a local file to HDFS
bin/hadoop fs -put <localsrc> <dst>
# Download a file from HDFS to the local machine
bin/hadoop fs -copyToLocal <src> <localdst>
# Download a file from HDFS to the local machine
bin/hadoop fs -get <src> <localdst>
# Move a local file into HDFS
bin/hadoop fs -moveFromLocal <localsrc> <dst>
# Move a file from HDFS to the local machine
bin/hadoop fs -moveToLocal <src> <localdst>
# Move a file within HDFS
bin/hadoop fs -mv <src> <dst>
# Copy a file within HDFS
bin/hadoop fs -cp <src> <dst>
# Delete a file in HDFS
bin/hadoop fs -rm <src>
# Create a directory
bin/hadoop fs -mkdir <path>
# Count the files under a given path
bin/hadoop fs -count <path>
# List the contents of a directory
bin/hadoop fs -ls <path>
4.6 Operating HDFS from Java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * @Author: ZHUANGHAOTANG
 * @Date: 2018/11/6 11:49
 * @Description: Utility methods for operating HDFS.
 */
public class HDFSUtils {

    private static Logger logger = LoggerFactory.getLogger(HDFSUtils.class);

    /**
     * NameNode URL
     */
    private static final String NAMENODE_URL = "hdfs://192.168.1.80:8020";

    /**
     * HDFS file system connection object
     */
    private static FileSystem fs = null;

    static {
        Configuration conf = new Configuration();
        try {
            fs = FileSystem.get(URI.create(NAMENODE_URL), conf);
        } catch (IOException e) {
            logger.error("Failed to initialize the HDFS connection", e);
        }
    }

    /**
     * Create a directory
     */
    public static void mkdir(String dir) throws Exception {
        dir = NAMENODE_URL + dir;
        if (!fs.exists(new Path(dir))) {
            fs.mkdirs(new Path(dir));
        }
    }

    /**
     * Delete a directory or file
     */
    public static void delete(String dir) throws Exception {
        dir = NAMENODE_URL + dir;
        fs.delete(new Path(dir), true);
    }

    /**
     * List the directories and files under a given path
     */
    public static List<String> listAll(String dir) throws Exception {
        List<String> names = new ArrayList<>();
        dir = NAMENODE_URL + dir;
        FileStatus[] files = fs.listStatus(new Path(dir));
        for (FileStatus file : files) {
            if (file.isFile()) { // regular file
                names.add(file.getPath().toString());
            } else if (file.isDirectory()) { // directory
                names.add(file.getPath().toString());
            } else if (file.isSymlink()) { // symbolic link
                names.add(file.getPath().toString());
            }
        }
        return names;
    }

    /**
     * Upload a file from this server to HDFS
     */
    public static void uploadLocalFileToHDFS(String localFile, String hdfsFile) throws Exception {
        hdfsFile = NAMENODE_URL + hdfsFile;
        Path src = new Path(localFile);
        Path dst = new Path(hdfsFile);
        fs.copyFromLocalFile(src, dst);
    }

    /**
     * Upload a file via a stream
     */
    public static void uploadFile(String hdfsPath, InputStream inputStream) throws Exception {
        hdfsPath = NAMENODE_URL + hdfsPath;
        FSDataOutputStream os = fs.create(new Path(hdfsPath));
        BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);
        byte[] data = new byte[1024];
        int len;
        while ((len = bufferedInputStream.read(data)) != -1) {
            os.write(data, 0, len); // write only the bytes actually read
        }
        inputStream.close();
        bufferedInputStream.close();
        os.close();
    }

    /**
     * Download a file from HDFS
     */
    public static byte[] readFile(String hdfsFile) throws Exception {
        hdfsFile = NAMENODE_URL + hdfsFile;
        Path path = new Path(hdfsFile);
        if (fs.exists(path)) {
            FSDataInputStream is = fs.open(path);
            FileStatus stat = fs.getFileStatus(path);
            byte[] data = new byte[(int) stat.getLen()];
            is.readFully(0, data);
            is.close();
            return data;
        } else {
            throw new Exception("File Not Found In HDFS");
        }
    }
}
4.7 Running a MapReduce job
Hadoop ships with hadoop-mapreduce-examples-2.9.0.jar, which packages many ready-made computation jobs that can be invoked directly.
# Use the hadoop jar command to run a JAR
hadoop jar
1. Create a file and upload it to HDFS.

2. Run the wordcount word-frequency job from the bundled hadoop-mapreduce-examples-2.9.0.jar:
bin/hadoop jar /usr/hadoop/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar wordcount /words /result

3. Job progress can be viewed in the YARN web management UI.

4. When the job finishes, you can view its results.

The job's results are saved to files in HDFS.