Analysis of Hadoop's Compression Implementation

Authors: Wang Tengteng (王腾腾), Shao Bing (邵兵)  Source: IBM  Published: 2016-1-22
 

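As a general-purpose platform for processing massive data sets, Hadoop has to handle large volumes of data in every computation. Compressing data inside Hadoop reduces disk usage and speeds up the transfer of data across disk and network, which makes the system process data more efficiently. When choosing a compression method, the main considerations are compression speed and whether the compressed file can be split. In short, compression saves the disk space that data occupies and speeds up data transfer across disk and network, which raises the system's processing speed.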

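Introduction

With the arrival of the cloud era, big data has attracted more and more attention. According to the analyst team at 著云台, "big data" usually describes the large volume of unstructured and semi-structured data that a company generates, data that would take too much time and money to load into a relational database for analysis. Big data analysis is often linked with cloud computing, because analyzing large data sets in real time requires a framework such as MapReduce to distribute the work across tens, hundreds, or even thousands of computers.

In the Internet industry, "big data" refers to the user behavior data that Internet companies generate and accumulate in day-to-day operation. The scale of this data is so large that it can no longer be measured in gigabytes or terabytes, which raises the question of how to process and analyze it efficiently. There are many ways to optimize big data processing; this article focuses on compressing data on the Hadoop platform to improve processing efficiency.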

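Compression overview

Hadoop is a general-purpose platform for processing massive data sets, and every computation has to handle a large amount of data. We compress data inside the Hadoop system to make better use of disk space and to speed up the transfer of data across disk and network, so that the system processes data more efficiently. When choosing a compression method, the main considerations are compression speed and whether the compressed file can be split. In summary, compression has the following advantages:

1. It saves the disk space occupied by the data;

2. It speeds up the transfer of data across disk and network, which raises the system's processing speed.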

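Compression formats

Hadoop recognizes compression formats automatically. If a compressed file carries the extension of its format (for example lzo, gz, or bz2), Hadoop selects the matching decoder from that extension and decompresses the data. The whole process is handled by Hadoop; we only need to make sure that the compressed input files have the right extension.

Hadoop's support for each compression format is detailed in the table below:

Table 1. Compression formats

If a compressed file has no extension, the input format must be specified when the MapReduce job is run.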

hadoop jar /usr/home/hadoop/hadoop-0.20.2/contrib/streaming/hadoop-streaming-0.20.2-CDH3B4.jar \
  -file /usr/home/hadoop/hello/mapper.py -mapper /usr/home/hadoop/hello/mapper.py \
  -file /usr/home/hadoop/hello/reducer.py -reducer /usr/home/hadoop/hello/reducer.py \
  -input lzotest -output result4 \
  -jobconf mapred.reduce.tasks=1 \
  -inputformat org.apache.hadoop.mapred.LzoTextInputFormat

ÐÔÄܶԱÈ

Hadoop ϸ÷ÖÖѹËõËã·¨µÄѹËõ±È£¬Ñ¹Ëõʱ¼ä£¬½âѹʱ¼ä¼ûϱí:

±í 2. ÐÔÄܶԱÈ

Òò´ËÎÒÃÇ¿ÉÒԵóö£º

1) Bzip2 ѹËõЧ¹ûÃ÷ÏÔÊÇ×îºÃµÄ£¬µ«ÊÇ bzip2 ѹËõËÙ¶ÈÂý£¬¿É·Ö¸î¡£

2) Gzip ѹËõЧ¹û²»Èç Bzip2£¬µ«ÊÇѹËõ½âѹËٶȿ죬²»Ö§³Ö·Ö¸î¡£

3) LZO ѹËõЧ¹û²»Èç Bzip2 ºÍ Gzip£¬µ«ÊÇѹËõ½âѹËÙ¶È×î¿ì£¡²¢ÇÒÖ§³Ö·Ö¸î£¡

ÕâÀïÌáһϣ¬ÎļþµÄ¿É·Ö¸îÐÔÔÚ Hadoop ÖÐÊǺܷdz£ÖØÒªµÄ£¬Ëü»áÓ°Ïìµ½ÔÚÖ´ÐÐ×÷ҵʱ Map Æô¶¯µÄ¸öÊý£¬´Ó¶ø»áÓ°Ïìµ½×÷ÒµµÄÖ´ÐÐЧÂÊ£¡
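The effect of splittability on the number of Map tasks can be illustrated with a little arithmetic. The sketch below is a simplified model, not Hadoop's actual split calculation: it assumes one split per full or partial chunk of a hypothetical split size, and a single split for any file that cannot be split. The class and method names are made up for this illustration.

```java
// Simplified model of how splittability bounds the number of map tasks.
// Hadoop's real FileInputFormat applies additional rules, so actual counts may differ.
class SplitCountSketch {

    /** Number of input splits for one file under a plain ceil(size / splitSize) rule. */
    static long splits(long fileBytes, long splitBytes, boolean splittable) {
        if (fileBytes == 0) {
            return 0;
        }
        if (!splittable) {
            return 1; // the whole file has to be read by a single mapper
        }
        return (fileBytes + splitBytes - 1) / splitBytes; // ceiling division
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        long splitSize = 128L * 1024 * 1024; // assumed split size for the example

        System.out.println("splittable file:     " + splits(oneGiB, splitSize, true) + " splits");
        System.out.println("non-splittable file: " + splits(oneGiB, splitSize, false) + " split");
    }
}
```

Under these assumptions a 1 GiB splittable file can be processed by eight mappers in parallel, while a non-splittable file of the same size is handled by one.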

ËùÓеÄѹËõËã·¨¶¼ÏÔʾ³öÒ»ÖÖʱ¼ä¿Õ¼äµÄȨºâ£¬¸ü¿ìµÄѹËõºÍ½âѹËÙ¶Èͨ³£»áºÄ·Ñ¸ü¶àµÄ¿Õ¼ä¡£ÔÚÑ¡ÔñʹÓÃÄÄÖÖѹËõ¸ñʽʱ£¬ÎÒÃÇÓ¦¸Ã¸ù¾Ý×ÔÉíµÄÒµÎñÐèÇóÀ´Ñ¡Ôñ¡£

The figure below compares the time taken to compress locally with the time taken to upload the compressed result to BI through a stream.

Figure 1. Time comparison

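Usage

MapReduce can use compression at three stages.

1. Compressed input files. If the input files are compressed, they are decompressed automatically when MapReduce reads them.

2. Compression of the intermediate results output by Map within a MapReduce job. This can be done in two ways:

1) Configure it in the core-site.xml file, as follows

Figure 2. core-site.xml code example

2) Specify it in Java code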

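// conf is an org.apache.hadoop.mapred.JobConf
conf.setCompressMapOutput(true);
// GzipCodec comes from org.apache.hadoop.io.compress
conf.setMapOutputCompressorClass(GzipCodec.class);

The last line of code specifies the codec used for the Map output.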

3. Compression of the final results output by Reduce within a MapReduce job. This can be done in two ways:

1) Configure it in the core-site.xml file, as follows

Figure 3. core-site.xml code example

2) Specify it in Java code

conf.setBoolean("mapred.output.compress", true);
conf.setClass("mapred.output.compression.codec", GzipCodec.class, CompressionCodec.class);

Here too, the last line specifies the codec used for the Reduce output.

Compression framework

As mentioned above, the first way of using compression is to pass compressed files directly to MapReduce as input; MapReduce then picks a suitable decompressor from the file extension. How is this actually implemented? See the figure below:

Figure 4. How compression is implemented

When configuring a job, we set the input format with the conf.setInputFormat() method; the argument here is TextInputFormat.class.

TextInputFormat derives from InputFormat and preprocesses the data in two ways. First, it divides the input into a set of splits, each of which is dispatched to one mapper. Second, for each split it creates a RecordReader that reads the data in the split and organizes it into <key,value> records, which are passed to the map function. Before splitting the data, this class first initializes the compression factory class CompressionCodecFactory and obtains from the factory an instantiated CompressionCodec with which to process the data.

Let us now look in detail at how a codec is obtained from the compression factory.

The compression factory class CompressionCodecFactory

The main job of CompressionCodecFactory is to obtain the matching CompressionCodec automatically from a file's extension. It is the central controller of the whole compression framework. Its most important methods are:

1. Initialization method

Figure 5. Code example

① getCodecClasses(conf) reads the configuration for the CompressionCodec classes. It is explained in detail below.

② Two codecs are added by default. When getCodecClasses(conf) finds no codec configuration, the system adds two CompressionCodec implementations by default: GzipCodec and DefaultCodec.

③ addCodec(codec) adds a CompressionCodec to the system cache. It is explained in detail below.

2. getCodecClasses(conf)

Figure 6. Code example

① Here we can see that the system reads the codec configuration from the io.compression.codecs property in core-site.xml. The relevant part of the configuration file is shown below:

Figure 7. Code example

The value tag holds the fully qualified class name of each CompressionCodec, separated by commas. We only need to list the codecs we want to use in this property, and the system loads them into the cache automatically.

Besides this approach, Hadoop offers another way of loading codecs: in code. The information still ends up in the io.compression.codecs property, as follows:

conf.set("io.compression.codecs",
    "org.apache.hadoop.io.compress.DefaultCodec,"
    + "org.apache.hadoop.io.compress.GzipCodec,"
    + "com.hadoop.compression.lzo.LzopCodec");
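The parsing step described in this section can be modelled with plain JDK classes. The sketch below is not Hadoop's source: it assumes a java.util.Properties object in place of Hadoop's Configuration, uses two JDK stream classes as hypothetical stand-ins for the default GzipCodec and DefaultCodec, and folds the fall-back to defaults into the same method for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Model of "read a comma-separated class list, fall back to defaults if unset".
class CodecListSketch {

    // Hypothetical defaults standing in for GzipCodec and DefaultCodec.
    static final String DEFAULTS =
            "java.util.zip.GZIPOutputStream,java.util.zip.DeflaterOutputStream";

    /** Resolves the classes named in io.compression.codecs, or the defaults if it is unset. */
    static List<Class<?>> codecClasses(Properties conf) {
        String names = conf.getProperty("io.compression.codecs");
        if (names == null || names.trim().isEmpty()) {
            names = DEFAULTS;
        }
        List<Class<?>> result = new ArrayList<>();
        for (String name : names.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            try {
                result.add(Class.forName(trimmed));
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException("Compression codec " + trimmed + " not found", e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("unset property -> " + codecClasses(conf));

        conf.setProperty("io.compression.codecs",
                "java.util.zip.InflaterInputStream, java.util.zip.GZIPInputStream");
        System.out.println("explicit list  -> " + codecClasses(conf));
    }
}
```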

3. The addCodec(codec) method adds a codec

Figure 8. Code example

The addCodec(codec) method takes a CompressionCodec as its argument. This is where we first meet one of the codec's methods.

① codec.getDefaultExtension(): as the name suggests, this method returns the file extension associated with the codec. For example, the codec for a file named xxxx.bz2 returns ".bz2". Here is how org.apache.hadoop.io.compress.BZip2Codec implements this method:

Figure 9. Code example

② codecs is an instance of SortedMap. Interestingly, the key, that is, the file extension returned by codec.getDefaultExtension(), is stored reversed. For example, the extension ".bz2" becomes "2zb." once it is reversed.

After the system has loaded all the codecs, we end up with a sorted map like the following:

Figure 10. Code example

Now that all the codecs are available, how do we get the one that matches a file? That is the job of the next method.

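4. The getCodec() method

This method returns the CompressionCodec that corresponds to a file.

Figure 11. Code example

The getCodec(Path) method takes a Path object, which holds the file path.

① Reverse the file name. For example, xxxx.bz2 becomes 2zb.xxxx.

② Fetch from the codecs map the entries closest to 2zb.xxxx. The return value of this call is again a SortedMap.

The returned SortedMap is then filtered a second time: its last key is accepted only if the reversed file name starts with it.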
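The reversed-key lookup described above can be sketched with a java.util.TreeMap. This is a minimal model, not the Hadoop source: codec instances are replaced by plain strings, the class name is hypothetical, and the three extensions are registered by hand.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Model of the factory's lookup: extensions are stored reversed so that
// "ends with extension" becomes "starts with key" in a sorted map.
class SuffixLookupSketch {

    // Key: reversed extension, value: codec name (stand-in for a codec instance).
    private final SortedMap<String, String> codecs = new TreeMap<>();

    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    void addCodec(String extension, String codecName) {
        codecs.put(reverse(extension), codecName);
    }

    /** Returns the codec name for the file, or null if no registered extension matches. */
    String getCodec(String fileName) {
        String reversed = reverse(fileName);
        // First selection: all keys that sort strictly before the reversed file name.
        SortedMap<String, String> head = codecs.headMap(reversed);
        if (head.isEmpty()) {
            return null;
        }
        // Second selection: the closest key must really be a prefix of the reversed name.
        String candidate = head.lastKey();
        return reversed.startsWith(candidate) ? codecs.get(candidate) : null;
    }

    public static void main(String[] args) {
        SuffixLookupSketch factory = new SuffixLookupSketch();
        factory.addCodec(".bz2", "BZip2Codec");
        factory.addCodec(".gz", "GzipCodec");
        factory.addCodec(".deflate", "DefaultCodec");

        for (String name : new String[] {"part-0000.bz2", "logs.gz", "notes.txt"}) {
            System.out.println(name + " -> " + factory.getCodec(name));
        }
    }
}
```

Reversing the keys turns a suffix match into a prefix match, which a sorted map can answer with one headMap() call instead of a scan over all registered extensions.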

The codec CompressionCodec

While introducing the compression factory class CompressionCodecFactory, we mentioned CompressionCodec several times, including its getDefaultExtension() method for obtaining the file extension.

Together, CompressionCodecFactory and CompressionCodec follow the abstract factory design pattern. CompressionCodec is an interface that defines a set of methods for creating the components of a particular compression/decompression algorithm. Its more important methods are:

1. createOutputStream() compresses a data stream.

Figure 12. Code example

This method is overloaded:

① stream-based compression;

② compression based on a Compressor.

2. createInputStream() decompresses a data stream.

Figure 13. Code example

The decompression method is overloaded as well:

① stream-based decompression;

② decompression based on a Decompressor.

Compression/decompression streams and compressors/decompressors are covered in detail in the sections below; for now it is enough to know that they exist.

3. getCompressorType() returns the type of Compressor required.

4. getDefaultExtension() returns the corresponding file extension. It was described above and is not repeated here.
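To make the role of the codec interface more concrete, here is a reduced model built only on the JDK. SimpleCodec and GzipSimpleCodec are hypothetical names invented for this sketch; they mirror just three of the methods discussed above and delegate to java.util.zip rather than to Hadoop's own implementations.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CodecInterfaceSketch {

    /** Hypothetical, reduced stand-in for the CompressionCodec interface. */
    interface SimpleCodec {
        OutputStream createOutputStream(OutputStream out) throws IOException;
        InputStream createInputStream(InputStream in) throws IOException;
        String getDefaultExtension();
    }

    /** One concrete "product family": gzip streams from the JDK. */
    static final class GzipSimpleCodec implements SimpleCodec {
        public OutputStream createOutputStream(OutputStream out) throws IOException {
            return new GZIPOutputStream(out);
        }
        public InputStream createInputStream(InputStream in) throws IOException {
            return new GZIPInputStream(in);
        }
        public String getDefaultExtension() {
            return ".gz";
        }
    }

    /** Compresses and then decompresses data using only the codec's factory methods. */
    static byte[] roundTrip(SimpleCodec codec, byte[] data) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream out = codec.createOutputStream(sink)) {
                out.write(data);
            }
            ByteArrayOutputStream restored = new ByteArrayOutputStream();
            try (InputStream in = codec.createInputStream(
                    new ByteArrayInputStream(sink.toByteArray()))) {
                byte[] buffer = new byte[512];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    restored.write(buffer, 0, n);
                }
            }
            return restored.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SimpleCodec codec = new GzipSimpleCodec();
        byte[] text = "hello hadoop compression".getBytes(StandardCharsets.UTF_8);
        byte[] back = roundTrip(codec, text);
        System.out.println(codec.getDefaultExtension() + " round trip: "
                + new String(back, StandardCharsets.UTF_8));
    }
}
```

The calling code in roundTrip() never names a concrete stream class, which is the point of the pattern: swapping the codec swaps the whole family of streams.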

The compressor Compressor and the decompressor Decompressor

In the codec section we mentioned Compressor and Decompressor objects in connection with the createOutputStream() and createInputStream() methods. In Hadoop's implementation, the data encoder and decoder are abstracted into two interfaces:

1. org.apache.hadoop.io.compress.Compressor;

2. org.apache.hadoop.io.compress.Decompressor;

They define a set of methods, so every encoding/decoding algorithm inside Hadoop has to implement the corresponding interface. For the actual compression and decompression of data, Hadoop gives users a uniform I/O stream model.

Let us look at the Compressor interface, whose code is as follows:

Figure 14. Code example

① setInput() receives data into the internal buffer; it can be called multiple times;

② needsInput() checks whether the input buffer can accept more data. A return value of false means the buffer is currently full and still holds unprocessed input;

③ getBytesRead() returns the total number of uncompressed bytes input so far;

④ getBytesWritten() returns the total number of compressed bytes output so far;

⑤ finish() ends the data input process;

⑥ finished() checks whether all data waiting to be compressed has been read out. A return value of false means the compressor still holds unread compressed data, which can be fetched with compress();

⑦ compress() fetches the compressed data and frees buffer space;

⑧ reset() resets the compressor so that it can process a new set of input data;

⑨ end() closes the compressor and discards any unprocessed input;

⑩ reinit() goes one step further and allows the compressor to be reset and reconfigured through Hadoop's configuration system.

To compress more efficiently, the compressor does not start working every time the user calls setInput(). Therefore finish() must be called to tell the compressor that all data has been written. After finish() returns, the compressed data held in the compressor's buffer can still be fetched with compress(). To find out whether the compressor still holds unread compressed data, use finished().
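The call sequence above (setInput(), finish(), then compress() until finished()) closely resembles the one used by the JDK's java.util.zip.Deflater, so it can be tried out without a Hadoop installation. The sketch below is an analogy under that assumption, not Hadoop code: Deflater.deflate() plays the role of compress(), and the class and helper names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class CompressorLifecycleSketch {

    /** Drives a Deflater through the setInput / finish / finished / deflate cycle. */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[512];
        try {
            deflater.setInput(input);          // hand the data to the internal buffer
            deflater.finish();                 // announce that no more input will follow
            while (!deflater.finished()) {     // compressed data is still pending
                int n = deflater.deflate(buffer); // counterpart of compress()
                out.write(buffer, 0, n);
            }
        } finally {
            deflater.end();                    // release the compressor's resources
        }
        return out.toByteArray();
    }

    /** Mirror image on the decompression side, using Inflater. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[512];
        try {
            inflater.setInput(compressed);
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid compressed data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] text = "aaaaaaaaaabbbbbbbbbbaaaaaaaaaabbbbbbbbbb".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(text);
        System.out.println(text.length + " bytes in, " + packed.length + " bytes out");
        System.out.println(new String(decompress(packed), StandardCharsets.UTF_8));
    }
}
```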

The compression stream CompressionOutputStream and the decompression stream CompressionInputStream

As mentioned in the codec section, createOutputStream() returns a CompressionOutputStream object and createInputStream() returns a CompressionInputStream object. The two classes extend java.io.OutputStream and java.io.InputStream respectively, which makes their purpose easy to understand.

Here is the code of CompressionOutputStream:

Figure 15. Code example

As we can see, CompressionOutputStream implements the close() and flush() methods of OutputStream. However, write(), which outputs data, finish(), which ends compression and writes the remaining data to the underlying stream, and resetState(), which resets the compression state, remain abstract and must be implemented by subclasses of CompressionOutputStream.

The Hadoop compression framework provides a general-purpose subclass of CompressionOutputStream, namely CompressorStream.

Figure 16. Code example

CompressorStream provides three constructors. The underlying output stream out and the compressor used for compression are passed in as constructor arguments. A further argument is the size of the buffer that CompressorStream works with; the constructor uses it to allocate that buffer. The first constructor lets you set the buffer size manually, the second uses the default of 512, and the third allocates no buffer and cannot use a compressor.

Figure 17. Code example

The Compressor shows up in the write(), compress(), finish(), and resetState() methods. As described earlier, the data to be compressed is placed in the compressor's internal buffer with setInput(); needsInput() is then called to check whether the buffer is full, and if it is, compress() is called to compress the data. The flow is shown in the figure below:

Figure 18. Call flow
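The write path described above can be approximated with a small OutputStream built on the JDK's Deflater. This is a sketch under stated assumptions rather than the Hadoop class itself: Deflater stands in for Hadoop's Compressor, deflate() for compress(), and the class name, helper methods, and the 512-byte default are chosen to mirror the description in the text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.Deflater;
import java.util.zip.InflaterInputStream;

// Minimal stream that feeds a compressor and forwards its output to an underlying stream.
class CompressorStreamSketch extends OutputStream {
    private final OutputStream out;
    private final Deflater compressor;
    private final byte[] buffer;
    private boolean closed;

    CompressorStreamSketch(OutputStream out, Deflater compressor, int bufferSize) {
        this.out = out;
        this.compressor = compressor;
        this.buffer = new byte[bufferSize];
    }

    CompressorStreamSketch(OutputStream out, Deflater compressor) {
        this(out, compressor, 512); // default buffer size, as described in the text
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (compressor.finished()) {
            throw new IOException("write beyond end of stream");
        }
        if (len == 0) {
            return;
        }
        compressor.setInput(b, off, len);   // fill the compressor's input buffer
        while (!compressor.needsInput()) {  // input still pending: keep draining
            compress();
        }
    }

    private void compress() throws IOException {
        int n = compressor.deflate(buffer, 0, buffer.length);
        if (n > 0) {
            out.write(buffer, 0, n);
        }
    }

    /** Ends the compression and flushes the remaining compressed data. */
    public void finish() throws IOException {
        if (!compressor.finished()) {
            compressor.finish();
            while (!compressor.finished()) {
                compress();
            }
        }
    }

    @Override
    public void close() throws IOException {
        if (!closed) {
            finish();
            out.close();
            compressor.end();
            closed = true;
        }
    }

    /** Helper: compresses a byte array through the stream. */
    static byte[] compressAll(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (CompressorStreamSketch stream = new CompressorStreamSketch(sink, new Deflater())) {
            stream.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Helper: decompresses with the JDK's InflaterInputStream to check the result. */
    static byte[] inflate(byte[] packed) {
        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[512];
            int n;
            while ((n = in.read(chunk)) != -1) {
                restored.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return restored.toByteArray();
    }
}
```

Calling CompressorStreamSketch.compressAll(data) and then inflate() on the result should give back the original bytes, which is a quick way to check that the write(), compress(), and finish() steps cooperate as described.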

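Conclusion

This article has looked inside the compression framework of the Hadoop platform, analyzing its core code and comparing the efficiency of the various compression formats, to help readers improve data processing efficiency on Hadoop by compressing their data. The next time you face massive data processing, Hadoop's compression mechanism can help you get twice the result for half the effort.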

   