Editor's note:
This article comes from CSDN. It mainly covers how HDFS works, the HDFS read and write data flows, and the working mechanisms of the NameNode and DataNode. We hope you find it helpful.
Course outline (HDFS in detail)
Introduction to Hadoop HDFS, the distributed file system (DFS)
Introduction to the HDFS system composition
The components of HDFS in detail
Replica placement strategy and routing rules
The command-line interface
The Java interface
The data flow between clients and HDFS
Learning goals:
Master HDFS shell operations
Master the HDFS Java API operations
Understand how HDFS works
HDFS Basic Concepts
1.1 HDFS preface
Design philosophy
Divide and conquer: large files and large batches of files are stored, distributed, across a large number of servers, so that massive data sets can be analyzed in a divide-and-conquer fashion.
Role in a big data system:
provides data storage services for the various distributed computing frameworks (e.g. mapreduce, spark, tez, ...)
Key concepts: file chunking, replica placement, metadata
Supplement:
HDFS is a distributed file system layered on top of local file systems. It is just software: a single code base that turns the disks of all the underlying machines into directories under one software layer. In that respect it is no different from mysql; the idea is the same.
mysql is essentially a parser: it turns sql into I/O that reads files, then converts the data for the user. The files themselves are stored on the underlying linux or windows file system, where the file name is the table name and the directory name is the database name.
1.2 HDFS concepts and characteristics
First, it is a file system, used to store files; files are located through a unified namespace, the directory tree.
Second, it is distributed: many servers cooperate to implement its functions, and the servers in the cluster each play their own role.
Its important characteristics are:
(1) Files in HDFS are physically stored in blocks. The block size is set by the configuration parameter dfs.blocksize; the default is 128 MB in hadoop 2.x and 64 MB in older versions
(2) HDFS presents clients with a unified abstract directory tree; clients access files by path, of the form: hdfs://namenode:port/dir-a/dir-b/dir-c/file.data
(3) The directory structure and file-block information (the metadata) are managed by the namenode
——the namenode is the master node of an HDFS cluster. It maintains the directory tree of the whole hdfs file system, plus the block information for every path (file): each block's id and the datanode servers it lives on
(4) Storage of a file's individual blocks is handled by the datanode nodes
---- datanodes are the slave nodes of an HDFS cluster. Each block can be stored as multiple replicas on multiple datanodes (the replica count is set by the parameter dfs.replication)
Supplement: the same block is never stored more than once on the same datanode, because that would be pointless.
(5) HDFS is designed for write-once, read-many scenarios, and does not support modifying files
(Note: it is well suited to data analysis, but not to network-drive applications, because modification is inconvenient, latency is high, network overhead is large, and the cost would be too high)
HDFS Basic Operations
2.1 HDFS shell (command-line client) operations
2.1.1 Using the HDFS command-line client
HDFS provides a shell command-line client, used as follows:
hadoop fs <args>
2.2 Command parameters supported by the command-line client
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] <localsrc> ... <dst>]
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] <path> ...]
[-cp [-f] [-p] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-usage [cmd ...]]
2.3 Common command parameters
-help
Function: print the manual for a command parameter
-ls
Function: list directory information
Example: hadoop fs -ls hdfs://hadoop-server01:9000/
Note: in all of these commands, every hdfs path can be abbreviated
–> hadoop fs -ls / has the same effect as the command above
-mkdir
Function: create a directory on hdfs
Example: hadoop fs -mkdir -p /aaa/bbb/cc/dd
-moveFromLocal
Function: cut and paste from the local file system to hdfs
Example: hadoop fs -moveFromLocal /home/hadoop/a.txt /aaa/bbb/cc/dd
-moveToLocal
Function: cut and paste from hdfs to the local file system
Example: hadoop fs -moveToLocal /aaa/bbb/cc/dd /home/hadoop/a.txt
-appendToFile
Function: append a file to the end of an existing file
Example: hadoop fs -appendToFile ./hello.txt hdfs://hadoop-server01:9000/hello.txt
which can be abbreviated as:
hadoop fs -appendToFile ./hello.txt /hello.txt
-cat
Function: display the contents of a file
Example: hadoop fs -cat /hello.txt
-tail
Function: display the tail of a file
Example: hadoop fs -tail /weblog/access_log.1
-text
Function: print the contents of a file as text
Example: hadoop fs -text /weblog/access_log.1
-chgrp
-chmod
-chown
Function: these three commands work the same as in a linux file system, changing a file's group, permissions, and owner
Examples:
hadoop fs -chmod 666 /hello.txt
hadoop fs -chown someuser:somegrp /hello.txt
-copyFromLocal
Function: copy a file from the local file system to an hdfs path
Example: hadoop fs -copyFromLocal ./jdk.tar.gz /aaa/
-copyToLocal
Function: copy from hdfs to the local file system
Example: hadoop fs -copyToLocal /aaa/jdk.tar.gz
-cp
Function: copy from one hdfs path to another hdfs path
Example: hadoop fs -cp /aaa/jdk.tar.gz /bbb/jdk.tar.gz.2
-mv
Function: move files within hdfs
Example: hadoop fs -mv /aaa/jdk.tar.gz /
-get
Function: same as copyToLocal, i.e. download a file from hdfs to the local file system
Example: hadoop fs -get /aaa/jdk.tar.gz
-getmerge
Function: merge and download multiple files
Example: say the hdfs directory /aaa/ contains multiple files: log.1, log.2, log.3, ...
hadoop fs -getmerge /aaa/log.* ./log.sum
-put
Function: same as copyFromLocal
Example: hadoop fs -put /aaa/jdk.tar.gz /bbb/jdk.tar.gz.2
-rm
Function: delete a file or directory
Example: hadoop fs -rm -r /aaa/bbb/
-rmdir
Function: delete an empty directory
Example: hadoop fs -rmdir /aaa/bbb/ccc
-df
Function: report the file system's free-space information
Example: hadoop fs -df -h /
-du
Function: report directory size information
Example:
hadoop fs -du -s -h /aaa/*
-count
Function: count the file nodes under a given directory
Example: hadoop fs -count /aaa/
-setrep
Function: set the replica count of a file in hdfs
Example: hadoop fs -setrep 3 /aaa/jdk.tar.gz
Supplement: hadoop dfsadmin -report quickly shows which nodes are down, the total and used HDFS capacity, and the disk usage of every node.
HDFS Internals
How HDFS works
(Studying the working mechanism mainly serves to deepen your understanding of distributed systems, strengthen your ability to analyze and solve the various problems you will run into, and build up a degree of cluster-operations capability)
Note: people who do not truly understand the hadoop technology stack often assume HDFS can be used for network-drive applications, but in fact it cannot. To apply a technology accurately in the right place, you must understand it deeply
3.1 Overview
An HDFS cluster has two main roles: NameNode and DataNode (plus the Secondary Namenode)
The NameNode manages the metadata of the entire file system (the directory tree of the whole hdfs file system and the block information of every file)
The DataNodes manage the users' file data blocks
A file is cut into blocks of a fixed size (blocksize) and then stored, distributed, across a number of datanodes
Each file block can have multiple replicas, stored on different datanodes
Datanodes periodically report the file blocks they hold to the Namenode, while the namenode is responsible for maintaining each file's replica count
The internal workings of HDFS are transparent to clients; client requests to access HDFS all go through the namenode
3.2 HDFS write data flow
3.2.1 Overview
To write data to HDFS, the client first talks to the namenode to confirm that it may write the file and to obtain the datanodes that will receive the file's blocks. The client then transfers the file to the corresponding datanodes block by block, in order, and the datanode that receives a block is responsible for replicating it to the other datanodes
3.2.2 Detailed step diagram
3.2.3 Detailed steps
1. The client contacts the namenode to request a file upload; the namenode checks whether the target file already exists and whether the parent directory exists
2. The namenode replies whether the upload may proceed
3. The client asks which datanode servers the first block should be transferred to
4. The namenode returns 3 datanode servers: A, B and C
5. The client asks one of the 3 datanodes, A, to receive the data (essentially an RPC call that establishes a pipeline). On receiving the request, A in turn calls B, then B calls C, completing the whole pipeline, and the acknowledgement is returned to the client level by level
6. The client starts uploading the first block to A (reading the data from disk into a local in-memory buffer first), packet by packet. When A receives a packet it passes it to B, and B passes it to C; for each packet it sends, A places it in an acknowledgement queue to await the ack
7. When one block has been transferred, the client asks the namenode again for the servers for the second block.
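To get a feel for the granularity of step 6, here is a rough calculation of how many packets make up one block. It assumes the default client packet size of 64 KB (the dfs.client-write-packet-size parameter, which is not mentioned in the article); real packets also carry headers and checksums, so this is only an approximation:

```java
public class PacketMath {
    // ceiling division: packets needed to carry one block's payload
    static long packetsPerBlock(long blockSize, long packetSize) {
        return (blockSize + packetSize - 1) / packetSize;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;   // default block size in hadoop 2.x
        long packetSize = 64L * 1024;          // assumed dfs.client-write-packet-size (64 KB)
        // each of these packets is sent to A, relayed down the pipeline, and acked individually
        System.out.println(packetsPerBlock(blockSize, packetSize));   // prints 2048
    }
}
```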
3.3 HDFS read data flow
3.3.1 Overview
The client sends the path of the file it wants to read to the namenode. The namenode looks up the file's metadata (mainly the locations of its blocks) and returns it to the client. Using that information, the client finds the corresponding datanodes, fetches the file's blocks one by one, and appends and merges them locally to reconstruct the whole file
3.3.2 Detailed step diagram
3.3.3 Detailed steps
1. The client contacts the namenode to query the metadata; the namenode finds the datanode servers holding the file's blocks
2. The client picks a datanode server (nearest first, then random) and requests a socket stream
3. The datanode starts sending the data (reading it from disk into the stream, checksummed packet by packet)
4. The client receives the data packet by packet, buffers it locally first, then writes it to the target file
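Step 3 says the data is checksummed packet by packet. Internally, HDFS checksums every 512-byte chunk of data (the dfs.bytes-per-checksum default, not mentioned in the article) with a 4-byte CRC; a hedged sketch of the resulting overhead for a 64 KB packet:

```java
public class ChecksumMath {
    // bytes of CRC data accompanying dataBytes of payload,
    // assuming 512-byte chunks and 4-byte CRC checksums (the HDFS defaults)
    static long checksumBytes(long dataBytes, int bytesPerChecksum, int crcSize) {
        long chunks = (dataBytes + bytesPerChecksum - 1) / bytesPerChecksum;
        return chunks * crcSize;
    }

    public static void main(String[] args) {
        // a 64 KB packet carries 128 chunks, hence 512 bytes of checksums (~0.8% overhead)
        System.out.println(checksumBytes(64 * 1024, 512, 4));   // prints 512
    }
}
```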
4 How the NameNode works
Learning goal: understand how the namenode works, especially its metadata management mechanism, to strengthen your understanding of how HDFS works and to build the ability to analyze and solve "performance tuning" and namenode-failure problems when operating a hadoop cluster
Problem scenarios:
After the cluster starts, files can be viewed, but uploading a file fails; the web page shows the namenode is in safemode. What do you do?
A disk failure on the namenode server has crashed the namenode. How do you rescue the cluster and its data?
Can there be more than one namenode? How much memory should the namenode be given? Is the namenode related to the cluster's data-storage capacity?
Is it better to make a file's blocksize larger or smaller?
Answering questions like these must be based on a deep understanding of how the namenode itself works
4.1 NameNode responsibilities
The NameNode is responsible for:
Responding to client requests
Managing the metadata (queries and updates)
4.2 Metadata management
The namenode manages the metadata in three storage forms:
In-memory metadata (NameSystem)
An on-disk metadata image file (fsimage)
A data-operation log file (edits; the metadata can be recomputed from the log)
4.2.1 Metadata storage mechanism (the metadata is an object with a specific data structure; you can think of it as a hashmap)
A. Memory holds one complete copy of the metadata (the in-memory meta data)
B. Disk holds a "nearly complete" metadata image file (fsimage), in the namenode's working directory
C. An operation log (the edits file) bridges the in-memory metadata and the persisted fsimage image. Note: when a client creates or modifies a file in hdfs, the operation is first recorded in the edits log file; once the client's operation succeeds, the corresponding metadata is updated in the in-memory meta.data
Supplement:
1. The fsimage file is a linear structure, all 0s and 1s; it is hard to look up or modify an individual record in it, which is why checkpoints are taken periodically.
2. edits records the operation steps, similar to mysql's binlog
3. fsimage records how many replicas each file has and what each replica is called
4. It is recommended not to start the secondary namenode on the same node as the namenode, because it copies the metadata and loads it into memory to produce the fsimage, which would take memory away from the namenode. (short version)
5. Under hadoop's high-availability + Federation mechanisms there is no SecondaryNamenode. You can verify this by trying to start one: it reports an error saying its function has been taken over by the StandbyNamenode (see SecondaryNamenode.log in the logs folder of the machine where you started it). (full version)
4.2.2 Inspecting the metadata manually
The information in edits and fsimage can be inspected with hdfs's offline viewer tools:
bin/hdfs oev -i edits -o edits.xml
bin/hdfs oiv -i fsimage_0000000000000000087 -p XML -o fsimage.xml
4.2.3 Metadata checkpoints
At intervals, the secondary namenode downloads the newest edits from the namenode (the namenode deletes edits that have been downloaded) together with the fsimage (downloaded only the first time, never again), loads them into memory, and merges them. This process is called a checkpoint.
The detailed checkpoint process
Configuration parameters that trigger a checkpoint:
dfs.namenode.checkpoint.check.period=60
#how often to check whether the trigger conditions are met, in seconds
dfs.namenode.checkpoint.dir=file://${hadoop.tmp.dir}/dfs/namesecondary
#the secondary namenode's local working directory for the two parameters above during a checkpoint
dfs.namenode.checkpoint.edits.dir=${dfs.namenode.checkpoint.dir}
dfs.namenode.checkpoint.max-retries=3
#maximum number of retries
dfs.namenode.checkpoint.period=3600  #interval between two checkpoints, 3600 seconds
dfs.namenode.checkpoint.txns=1000000  #maximum number of operation records between two checkpoints
Side benefit of checkpoints
The namenode and the secondary namenode have exactly the same working-directory storage structure. So when the namenode fails and must be rebuilt, the fsimage can be copied from the secondary namenode's working directory into the namenode's working directory to recover the namenode's metadata.
4.2.4 The metadata directory explained
When a Hadoop cluster is deployed for the first time, the disk must be formatted on the NameNode (NN) node:
$HADOOP_HOME/bin/hdfs namenode -format
After formatting, the following file structure appears under the $dfs.namenode.name.dir/current directory:
current/
|-- VERSION
|-- edits_*
|-- fsimage_0000000000008547077
|-- fsimage_0000000000008547077.md5
`-- seen_txid
ÆäÖеÄdfs.name.dirÊÇÔÚhdfs-site.xmlÎļþÖÐÅäÖõģ¬Ä¬ÈÏÖµÈçÏ£º
<property>
<name>dfs.name.dir</name> <value>file://${hadoop.tmp.dir}/dfs/name</value>
</property> |
hadoop.tmp.dirÊÇÔÚcore-site.xmlÖÐÅäÖõģ¬Ä¬ÈÏÖµÈçÏÂ
<property>
<name>hadoop.tmp.dir</name> <value>/tmp/hadoop-${user.name}</value>
<description>A base for other temporary
directories.</description>
</property> |
dfs. namenode.name.dirÊôÐÔ¿ÉÒÔÅäÖöà¸öĿ¼£¬
Èç/data1/dfs/name,/data2/dfs/name,/data3/dfs/name,¡¡£¸÷¸öĿ¼´æ´¢µÄÎļþ½á¹¹ºÍÄÚÈݶ¼ÍêȫһÑù£¬Ï൱ÓÚ±¸·Ý£¬ÕâÑù×öµÄºÃ´¦Êǵ±ÆäÖÐÒ»¸öĿ¼Ëð»µÁË£¬Ò²²»»áÓ°Ïìµ½HadoopµÄÔªÊý¾Ý£¬ÌرðÊǵ±ÆäÖÐÒ»¸öĿ¼ÊÇNFS£¨ÍøÂçÎļþϵͳNetwork
File System£¬NFS£©Ö®ÉÏ£¬¼´Ê¹ÄãÕą̂»úÆ÷Ëð»µÁË£¬ÔªÊý¾ÝÒ²µÃµ½±£´æ¡£
Below are the files under the $dfs.namenode.name.dir/current/ directory.
VERSION is a Java properties file, with contents roughly like:
#Fri Nov 15 19:47:46 CST 2013
namespaceID=934548976
clusterID=CID-cdff7d73-93cd-4783-9399-0a22e6dce196
cTime=0
storageType=NAME_NODE
blockpoolID=BP-893790215-192.168.24.72-1383809616115
layoutVersion=-47
Where:
(1) namespaceID is the unique identifier of the file system, generated when the file system is first formatted;
(2) storageType says which process's data structures this directory stores (for a DataNode, storageType=DATA_NODE);
(3) cTime is the creation time of the NameNode's storage. Since my NameNode has never been upgraded, the recorded value here is 0; after a NameNode upgrade, cTime records the upgrade timestamp;
(4) layoutVersion is the version of HDFS's persistent data structures. Whenever the data structures change, the version number is decremented and HDFS must be upgraded; otherwise the disk still uses the old-version data structures, and the new-version NameNode cannot use it;
(5) clusterID is a system-generated or manually assigned cluster ID, usable with the -clusterid option, as follows:
a. Format a Namenode with:
$HADOOP_HOME/bin/hdfs namenode -format [-clusterId <cluster_id>]
Choose a unique cluster_id that does not conflict with any other cluster in the environment. If no cluster_id is provided, a unique ClusterID is generated automatically.
b. Format the other Namenodes with:
$HADOOP_HOME/bin/hdfs namenode -format -clusterId <cluster_id>
c. Upgrade the cluster to the latest version. A ClusterID must be provided during the upgrade, for example:
$HADOOP_PREFIX_HOME/bin/hdfs start namenode --config $HADOOP_CONF_DIR -upgrade -clusterId <cluster_ID>
If no ClusterID is provided, one is generated automatically.
(6) blockpoolID is the ID of the block pool belonging to each Namespace. The BP-893790215-192.168.24.72-1383809616115 above is the ID of the storage block pool under my ns1 namespace; the ID embeds the IP address of the corresponding NameNode node.
2. $dfs.namenode.name.dir/current/seen_txid is very important. It is the file that stores the transactionId; after a format it is 0. It represents the trailing number of the namenode's edits_* files. When the namenode restarts, it replays the edits in order, from edits_0000001 up to the number in seen_txid. So when your hdfs restarts after a failure, be sure to check that the number in seen_txid matches the trailing number of your last edits file; otherwise the namenode will be rebuilt with incomplete metaData, which can cause it to wrongly delete block information that it then considers surplus on the Datanodes.
The fsimage and edits files, together with their md5 checksum files, are also generated under $dfs.namenode.name.dir/current at format time.
Supplement: seen_txid records the roll sequence number of the edits; on every namenode restart it tells the namenode which edits files to load
5 How the DataNode works
Problem scenarios:
1. The cluster is running out of capacity. How do you expand it?
2. What do you do when some datanodes go down?
3. A datanode has clearly been started, but it never appears in the cluster's list of available datanodes. What do you do?
Answering questions like these depends on a deep understanding of how the datanode works
5.1 Overview
1. Datanode responsibilities:
Store and manage the users' file block data
Periodically report the blocks it holds to the namenode (via heartbeat messages)
(This matters: it is how the cluster restores a block's original replica count when some of its replicas are lost)
<property>
  <name>dfs.blockreport.intervalMsec</name>
  <value>3600000</value>
  <description>Determines block reporting interval in milliseconds.</description>
</property>
2. Parameters governing when a datanode is declared dead
When the datanode process dies, or a network fault prevents a datanode from communicating with the namenode, the namenode does not immediately declare the node dead; it waits for a period we will call the timeout. HDFS's default timeout is 10 minutes + 30 seconds. Calling this timeout, it is computed as:
timeout = 2 * heartbeat.recheck.interval + 10 * dfs.heartbeat.interval
The default heartbeat.recheck.interval is 5 minutes, and the default dfs.heartbeat.interval is 3 seconds.
Note that in the hdfs-site.xml configuration file, heartbeat.recheck.interval is in milliseconds while dfs.heartbeat.interval is in seconds. So, for example, with heartbeat.recheck.interval set to 5000 (ms) and dfs.heartbeat.interval set to 3 (s, the default), the total timeout is 40 seconds.
<property>
  <name>heartbeat.recheck.interval</name>
  <value>2000</value>
</property>
<property>
  <name>dfs.heartbeat.interval</name>
  <value>1</value>
</property>
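The timeout formula above is easy to verify numerically (all values converted to milliseconds):

```java
public class DatanodeTimeout {
    // timeout = 2 * heartbeat.recheck.interval + 10 * dfs.heartbeat.interval
    static long timeoutMs(long recheckIntervalMs, long heartbeatIntervalSec) {
        return 2 * recheckIntervalMs + 10 * heartbeatIntervalSec * 1000;
    }

    public static void main(String[] args) {
        // defaults: 5 min recheck, 3 s heartbeat -> 630000 ms = 10 min 30 s
        System.out.println(timeoutMs(5 * 60 * 1000, 3));   // prints 630000
        // the example from the text: 5000 ms recheck, 3 s heartbeat -> 40000 ms = 40 s
        System.out.println(timeoutMs(5000, 3));            // prints 40000
    }
}
```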
5.2 Observing and verifying datanode behaviour
Upload a file, then observe where the file's blocks physically end up:
On each datanode machine, the file's chunks can be found under this directory:
/home/hadoop/app/hadoop-2.4.1/tmp/dfs/data/current/BP-193442119-192.168.2.120-1432457733977/current/finalized
5.3 Data directory (added by the author; verified in practice)
dfs.data.dir is configured in hdfs-site.xml; its default value is:
<property>
  <name>dfs.data.dir</name>
  <value>file://${hadoop.tmp.dir}/dfs/data</value>
</property>
The dfs.datanode.data.dir property can be configured with multiple directories, e.g. /data1/dfs/data,/data2/dfs/data,/data3/dfs/data,... After a datanode is configured with multiple disks, it treats them as one unified space. This helps under concurrency: data can be written to different disks, so the disks work in parallel. It is effectively an expansion.
Supplement: blocks default to 128 MB; the minimum configurable size is 1 MB
HDFS Application Development
6. Operating HDFS from Java
In production, hdfs work is mostly client development. The core steps are to construct an HDFS access-client object from the api hdfs provides, then use that client object to operate on (create, delete, update, query) the files in HDFS
6.1 Setting up the development environment
Add the dependency:
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.6.1</version>
</dependency>
Note: if you add jars manually, the hdfs jars are under the share directory of the hadoop installation
Notes on developing on Windows
It is recommended to develop hadoop applications on linux, where there are no compatibility problems. To develop client applications on windows, set up the following environment:
A. Unpack a hadoop installation package into some directory on windows
B. Replace the lib and bin directories of the package with native libraries compiled for your windows platform
C. Set HADOOP_HOME on windows to point to the unpacked package
D. Add hadoop's bin directory to the windows path variable
6.2 Obtaining a client object from the API
To operate on hdfs from Java, first obtain a client instance:
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Since our target is HDFS, the fs object obtained should be an instance of DistributedFileSystem.
How does the get method decide which concrete client class to instantiate?
——from the configured value of the fs.defaultFS parameter in conf;
If our code does not set fs.defaultFS, and no corresponding configuration is given on the project classpath either, the default in conf comes from core-default.xml inside the hadoop jars, whose default value is file:///. In that case the object obtained is not a DistributedFileSystem instance but a client for the local file system
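For completeness, the classpath configuration would be a core-site.xml fragment like the following; the host name hdp-node01 is the example address used by the code samples below, so adjust it to your own cluster:

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hdp-node01:9000</value>
</property>
```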
6.3 Methods available on a DistributedFileSystem instance
6.4 HDFS client code examples
6.4.1 Creating, deleting, reading and updating files
public class HdfsClient {

	FileSystem fs = null;

	@Before
	public void init() throws Exception {
		// Build a configuration object and set one parameter: the URI of the hdfs we want to access.
		// FileSystem.get() then knows it should construct a client for an hdfs file system,
		// and which hdfs address to use.
		// new Configuration() loads hdfs-default.xml from the jars,
		// then hdfs-site.xml from the classpath
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", "hdfs://hdp-node01:9000");
		/**
		 * Parameter precedence: 1. values set in client code 2. user-defined configuration files on the classpath 3. the server-side defaults
		 */
		conf.set("dfs.replication", "3");
		// Obtain an hdfs client; given these parameters the instance should be a DistributedFileSystem
		// fs = FileSystem.get(conf);
		// If we obtain it this way instead, conf no longer needs the "fs.defaultFS" parameter,
		// and the client's identity is already the hadoop user
		fs = FileSystem.get(new URI("hdfs://hdp-node01:9000"), conf, "hadoop");
	}

	/**
	 * Upload a file to hdfs
	 *
	 * @throws Exception
	 */
	@Test
	public void testAddFileToHdfs() throws Exception {
		// local path of the file to upload
		Path src = new Path("g:/redis-recommend.zip");
		// target path on hdfs
		Path dst = new Path("/aaa");
		fs.copyFromLocalFile(src, dst);
		fs.close();
	}

	/**
	 * Copy a file from hdfs to the local file system
	 *
	 * @throws IOException
	 * @throws IllegalArgumentException
	 */
	@Test
	public void testDownloadFileToLocal() throws IllegalArgumentException, IOException {
		fs.copyToLocalFile(new Path("/jdk-7u65-linux-i586.tar.gz"), new Path("d:/"));
		fs.close();
	}

	@Test
	public void testMkdirAndDeleteAndRename() throws IllegalArgumentException, IOException {
		// create a directory
		fs.mkdirs(new Path("/a1/b1/c1"));
		// delete a directory; for a non-empty directory, parameter 2 must be true
		fs.delete(new Path("/aaa"), true);
		// rename a file or directory
		fs.rename(new Path("/a1"), new Path("/a2"));
	}

	/**
	 * List directory information, showing files only
	 *
	 * @throws IOException
	 * @throws IllegalArgumentException
	 * @throws FileNotFoundException
	 */
	@Test
	public void testListFiles() throws FileNotFoundException, IllegalArgumentException, IOException {
		// Question to ponder: why return an iterator rather than a container such as a List?
		RemoteIterator<LocatedFileStatus> listFiles = fs.listFiles(new Path("/"), true);
		while (listFiles.hasNext()) {
			LocatedFileStatus fileStatus = listFiles.next();
			System.out.println(fileStatus.getPath().getName());
			System.out.println(fileStatus.getBlockSize());
			System.out.println(fileStatus.getPermission());
			System.out.println(fileStatus.getLen());
			BlockLocation[] blockLocations = fileStatus.getBlockLocations();
			for (BlockLocation bl : blockLocations) {
				System.out.println("block-length:" + bl.getLength() + "--" + "block-offset:" + bl.getOffset());
				String[] hosts = bl.getHosts();
				for (String host : hosts) {
					System.out.println(host);
				}
			}
			System.out.println("--------------separator printed for angelababy--------------");
		}
	}

	/**
	 * List information on both files and directories
	 *
	 * @throws IOException
	 * @throws IllegalArgumentException
	 * @throws FileNotFoundException
	 */
	@Test
	public void testListAll() throws FileNotFoundException, IllegalArgumentException, IOException {
		FileStatus[] listStatus = fs.listStatus(new Path("/"));
		for (FileStatus fstatus : listStatus) {
			// recompute the flag for every entry so directories after a file are not mislabeled
			String flag = fstatus.isFile() ? "f-- " : "d-- ";
			System.out.println(flag + fstatus.getPath().getName());
		}
	}
}
6.4.2 Accessing hdfs through streams
/**
 * Lower-level operations than the packaged convenience methods above.
 * When the higher-level compute frameworks such as mapreduce and spark fetch data from hdfs,
 * this is the underlying api they call.
 *
 * @author
 */
public class StreamAccess {

	FileSystem fs = null;

	@Before
	public void init() throws Exception {
		Configuration conf = new Configuration();
		fs = FileSystem.get(new URI("hdfs://hdp-node01:9000"), conf, "hadoop");
	}

	/**
	 * Upload a file to hdfs through a stream
	 * @throws Exception
	 */
	@Test
	public void testUpload() throws Exception {
		FSDataOutputStream outputStream = fs.create(new Path("/angelababy.love"), true);
		FileInputStream inputStream = new FileInputStream("c:/angelababy.love");
		IOUtils.copy(inputStream, outputStream);
	}

	@Test
	public void testDownLoadFileToLocal() throws IllegalArgumentException, IOException {
		// first obtain an input stream for the file on hdfs
		FSDataInputStream in = fs.open(new Path("/jdk-7u65-linux-i586.tar.gz"));
		// then construct an output stream for the local file
		FileOutputStream out = new FileOutputStream(new File("c:/jdk.tar.gz"));
		// then transfer the data from the input stream to the output stream
		IOUtils.copyBytes(in, out, 4096);
	}

	/**
	 * hdfs supports random-access reads and makes it easy to read a specified length,
	 * which is what the distributed compute frameworks above it use to process data concurrently
	 * @throws IllegalArgumentException
	 * @throws IOException
	 */
	@Test
	public void testRandomAccess() throws IllegalArgumentException, IOException {
		// first obtain an input stream for the file on hdfs
		FSDataInputStream in = fs.open(new Path("/iloveyou.txt"));
		// the stream's starting offset can be customized
		in.seek(22);
		// then construct an output stream for the local file
		FileOutputStream out = new FileOutputStream(new File("c:/iloveyou.line.2.txt"));
		IOUtils.copyBytes(in, out, 19L, true);
	}

	/**
	 * Print the contents of a file on hdfs
	 * @throws IOException
	 * @throws IllegalArgumentException
	 */
	@Test
	public void testCat() throws IllegalArgumentException, IOException {
		FSDataInputStream in = fs.open(new Path("/iloveyou.txt"));
		IOUtils.copyBytes(in, System.out, 1024);
	}
}
6.4.3 Scenario programming
A core idea in compute frameworks such as mapreduce and spark is to move the computation to the data, i.e. to make the computation as local as possible during concurrent processing. That requires obtaining the location information of the data and reading the corresponding ranges
The following simulates this: obtain all the block locations of a file, then read the contents of a specified block
@Test
public void testCat() throws IllegalArgumentException, IOException {
	FSDataInputStream in = fs.open(new Path("/weblog/input/access.log.10"));
	// get the file's status information
	FileStatus[] listStatus = fs.listStatus(new Path("/weblog/input/access.log.10"));
	// get the information of all the file's blocks
	BlockLocation[] fileBlockLocations = fs.getFileBlockLocations(listStatus[0], 0L, listStatus[0].getLen());
	// length of the first block
	long length = fileBlockLocations[0].getLength();
	// starting offset of the first block
	long offset = fileBlockLocations[0].getOffset();
	System.out.println(length);
	System.out.println(offset);
	// write the first block to the output stream
	// IOUtils.copyBytes(in, System.out, (int) length);
	byte[] b = new byte[4096];
	FileOutputStream os = new FileOutputStream(new File("d:/block0"));
	while (in.read(offset, b, 0, 4096) != -1) {
		os.write(b);
		offset += 4096;
		// stop at the end of the first block; break (not return) so the streams below get closed
		if (offset >= length) break;
	}
	os.flush();
	os.close();
	in.close();
}
7. Case study 1: a shell collection script
7.1 Requirements
Clickstream logs amount to 10 TB per day on the business application servers, and they need to be uploaded to the data warehouse (Hadoop HDFS) in near real time
7.2 Requirements analysis
Uploads are usually done around midnight, because many kinds of business data are transferred at night; this lightens the load on the servers and avoids the peak period.
If pseudo-real-time upload is needed, use scheduled uploads
7.3 Technical analysis
HDFS shell: hadoop fs -put xxxx.tar /data; the Java Api can also be used
This covers uploading a single file, but not scheduled, periodic uploads.
Scheduler:
Linux crontab
crontab -e
*/5 * * * * $home/bin/command.sh   //runs every five minutes
The system runs the script automatically every 5 minutes; on each run it checks whether any files match the upload rules and uploads those that do
7.4ʵÏÖÁ÷³Ì
7.4.1ÈÕÖ¾²úÉú³ÌÐò
ÈÕÖ¾²úÉú³ÌÐò½«ÈÕÖ¾Éú³Éºó£¬²úÉúÒ»¸öÒ»¸öµÄÎļþ£¬Ê¹Óùö¶¯Ä£Ê½´´½¨ÎļþÃû¡£

ÈÕÖ¾Éú³ÉµÄÂß¼ÓÉÒµÎñϵͳ¾ö¶¨£¬±ÈÈçÔÚlog4jÅäÖÃÎļþÖÐÅäÖÃÉú³É¹æÔò£¬È磺µ±xxxx.log
µÈÓÚ10Gʱ£¬¹ö¶¯Éú³ÉÐÂÈÕÖ¾
log4j.logger.msg=info,msg
log4j.appender.msg=cn.maoxiangyi.MyRollingFileAppender
log4j.appender.msg.layout=org.apache.log4j.PatternLayout
log4j.appender.msg.layout.ConversionPattern=%m%n
log4j.appender.msg.datePattern='.'yyyy-MM-dd
log4j.appender.msg.Threshold=info
log4j.appender.msg.append=true
log4j.appender.msg.encoding=UTF-8
log4j.appender.msg.MaxBackupIndex=100
log4j.appender.msg.MaxFileSize=10GB
log4j.appender.msg.File=/home/hadoop/logs/log/access.log |
ϸ½Ú£º
1¡¢Èç¹ûÈÕÖ¾Îļþºó׺ÊÇ1\2\3µÈÊý×Ö£¬¸ÃÎļþÂú×ãÐèÇó¿ÉÒÔÉÏ´«µÄ»°¡£°Ñ¸ÃÎļþÒÆ¶¯µ½×¼±¸ÉÏ´«µÄ¹¤×÷Çø¼ä¡£
2¡¢¹¤×÷Çø¼äÓÐÎļþÖ®ºó£¬¿ÉÒÔʹÓÃhadoop putÃüÁÎļþÉÏ´«¡£
½×¶ÎÎÊÌ⣺
1¡¢´ýÉÏ´«ÎļþµÄ¹¤×÷Çø¼äµÄÎļþ£¬ÔÚÉÏ´«Íê³ÉÖ®ºó£¬ÊÇ·ñÐèҪɾ³ýµô¡£
7.4.2 Pseudocode
Use the ls command to read all the file names under the given path (made syntactically runnable here; the ./work_area staging path is a placeholder for wherever you put the staging area):
ls | while read line
do
    # check whether the file name matches the rule
    case "$line" in
        access.log.*)
            # move the file into the staging area awaiting upload
            mv "$line" ./work_area/ ;;
    esac
done
# upload the staged files in bulk
hadoop fs -put ./work_area/* /data
After the script is written, configure a Linux cron job to run it every 5 minutes.
7.5 Code implementation
The first version of the code implements the basic upload functionality and the scheduled-dispatch functionality
The second version, enhanced V2, is basically usable but still not robust enough
7.6 Results and operating steps
1. The log-collection program collects the data and saves it
2. The upload program is scheduled periodically through crontab
3. Temporary files produced while the program runs
4. The result on Hadoop hdfs
8. Case study 2: a Java collection program
8.1 Requirements
We purchase data from an external provider, who pushes the data in real time to 6 FTP servers. We deploy 6 interface collection machines to pull the data and upload it to HDFS
The provider creates data on the FTP servers in per-hour folders (2016-03-11-10), with one file generated per minute (00.dat, 01.dat, 02.dat, ...)
The provider keeps no backups: data lost after being pushed to the FTP servers will not be re-sent, and the FTP servers' disk space is limited, storing at most the last 10 hours of data
Because each file is fairly small, only about 150 MB, we need to merge each 15-minute window of data into a single file before uploading it to HDFS
To apportion responsibility for data loss, we should verify the data when downloading it
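A quick check of the merge arithmetic in the requirements above (150 MB per minute is the stated approximate file size; the block count assumes the 128 MB default blocksize):

```java
public class MergePlan {
    // ceiling division: HDFS blocks needed for a file of the given size
    static long blocksFor(long fileSize, long blockSize) {
        return (fileSize + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long perMinute = 150L * 1024 * 1024;   // ~150 MB generated per minute
        long merged = 15 * perMinute;          // one merged file per 15-minute window
        long blockSize = 128L * 1024 * 1024;
        // 15 x 150 MB = 2250 MB merged, i.e. ~18 HDFS blocks
        // instead of 15 separate files each barely over one block
        System.out.println(merged / (1024 * 1024));         // prints 2250
        System.out.println(blocksFor(merged, blockSize));   // prints 18
    }
}
```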
8.2 Design analysis