An Analysis of How Hive Works
 
×÷ÕߣºCharlotteXÓÐÃÎÏëµÄÏÌÓã
 
2020-10-27  
 
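Editor's pick:
At work I use databases such as Hive and MySQL. Different databases suit different scenarios, and choosing the right way to store and process data requires understanding the underlying principles, which saves a lot of detours. This article records how Hive is implemented, along with the concepts it builds on.

This article comes from Zhihu; edited and recommended by Anna of 火龙果软件.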

Background

Before looking at Hive itself, it helps to review a few concepts that make its implementation easier to understand.

MapReduce

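Google MapReduce is a distributed programming model that Google proposed, based on the map and reduce primitives of functional programming. The model hides the implementation details of the distributed cluster and leaves them to the underlying framework, so programmers with no background in distributed or parallel programming can still run their own programs on a distributed system.

Programming model

Map: transforms one input key-value pair into a set of intermediate key-value pairs: (k1, v1) -> list(k2, v2)

Reduce: merges all intermediate key-value pairs that share the same key and produces the result for that key: (k2, list(v2)) -> (k2, v3)

An example

Take a simple WordCount: given a large collection of documents, count how many times each word occurs. The pseudocode is below.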

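map(String key, String value, Context context) {
    // key: document name
    // value: document contents
    String[] words = value.split(separator);  // separator: the word delimiter
    for (String word : words) {
        context.write(word, 1);  // emit (word, 1) for every occurrence
    }
}

reduce(String word, Iterable<Integer> values, Context context) {
    // word: a word
    // values: all the counts emitted for that word
    int sum = 0;
    for (Integer value : values) {
        sum += value;
    }
    context.write(word, sum);  // emit (word, total count)
}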
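
The pseudocode above can be tried out locally. The following is a minimal single-process sketch in Python, not a real MapReduce job: the names map_fn, reduce_fn and run_word_count are made up for illustration, and the in-memory grouping step stands in for the shuffle that the framework would perform across machines.

```python
from collections import defaultdict

def map_fn(doc_name, contents):
    """Map: (k1, v1) -> list of (k2, v2). Emits (word, 1) for every word."""
    return [(word, 1) for word in contents.split()]

def reduce_fn(word, counts):
    """Reduce: (k2, list(v2)) -> (k2, v3). Sums the counts of one word."""
    return word, sum(counts)

def run_word_count(documents):
    # Stand-in for the shuffle: group all intermediate values by key.
    groups = defaultdict(list)
    for name, contents in documents.items():
        for word, one in map_fn(name, contents):
            groups[word].append(one)
    # Call reduce once per distinct key.
    return dict(reduce_fn(word, counts) for word, counts in groups.items())

docs = {"a.txt": "hive on hadoop", "b.txt": "hive on hdfs"}
word_counts = run_word_count(docs)
print(word_counts)
```

The user only writes map_fn and reduce_fn; everything inside run_word_count is the part a real framework would distribute.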

More examples

Counting URL access frequency: the Map function processes web page request records in a log and emits (URL, 1). The Reduce function adds up the values for the same URL and produces (URL, total count).

Reversing a web link graph: the Map function finds every link target in a source page and emits (target, source). The Reduce function combines all the sources for a given target into a list and emits (target, list(source)).

Inverted index: the Map function parses each document and emits a list of (word, document ID) pairs. The Reduce function receives all the pairs for a given word, sorts the document IDs, and emits (word, list(document ID)). The set of all outputs forms a simple inverted index, which gives a simple way to track where words appear in documents.

Term vector per host: a term vector summarizes the most important words in a document or a set of documents as a list of (word, frequency) pairs. The Map function emits (hostname, term vector) for each input document, where the hostname is taken from the document's URL. The Reduce function receives the term vectors of all documents for a given host, adds them together, discards infrequent terms, and emits a final (hostname, term vector).
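
As one concrete instance of the examples above, here is a small Python sketch of the inverted index. It runs in a single process, and the function names and the three sample documents are assumptions made for illustration only.

```python
from collections import defaultdict

def map_fn(doc_id, text):
    # Emit one (word, doc_id) pair per distinct word in the document.
    return [(word, doc_id) for word in set(text.split())]

def reduce_fn(word, doc_ids):
    # Sort the document IDs so that every posting list is ordered.
    return word, sorted(doc_ids)

def build_inverted_index(documents):
    groups = defaultdict(list)          # stand-in for the shuffle phase
    for doc_id, text in documents.items():
        for word, d in map_fn(doc_id, text):
            groups[word].append(d)
    return dict(reduce_fn(word, ids) for word, ids in groups.items())

index = build_inverted_index({1: "hive sql", 2: "hive orc", 3: "orc file"})
print(index["hive"], index["orc"])
```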

System implementation

First, the user specifies the Map and Reduce functions through the MapReduce client, along with the configuration of the job, including the number of partitions R for the intermediate key-value pairs and the hash function used to split them. Once the user starts the job, the computation proceeds as follows:

1. The input files are divided into M splits, each typically 16 to 64 MB in size.

2. The whole computation therefore consists of M Map tasks and R Reduce tasks. The Master picks idle Workers and assigns each of them a Map task or a Reduce task.

3. Workers that receive a Map task (also called Mappers) read their split, parse it into input key-value pairs, and call the user-defined Map function. The intermediate key-value pairs it produces are held temporarily in an in-memory buffer.

4. While the Map phase is running, Mappers periodically write the buffered intermediate results to their local disk, dividing them into R parts according to the user-specified partition function (by default hash(key) mod R). When a task completes, the Mapper reports the locations of its intermediate results on local disk to the Master.

5. The Master forwards the locations reported by the Mappers to the Reducers. On receiving them, a Reducer reads, via RPC, the intermediate results that belong to its partition from the Mappers' local disks. Once everything has been read, the Reducer sorts the data so that pairs with the same key are contiguous.

6. The Reducer then collects the set of values associated with each key and calls the user-defined Reduce function with it. The output of the Reduce function is written to the result file for that Reduce partition.

In a MapReduce cluster, the Master tracks the current completion state of every Map and Reduce task and the Worker it is assigned to. The Master is also responsible for forwarding the locations and sizes of the intermediate files produced by Mappers to the Reducers.

Note that for every MapReduce job, M and R should both be much larger than the number of Workers in the cluster, so that load is balanced across the cluster.
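
The flow described above (M splits, partitioning with hash(key) mod R, fetch and sort on the reduce side) can be simulated in a few lines of Python. This is only a sketch under stated assumptions: everything runs in one process, run_job and partition are hypothetical names, and zlib.crc32 is swapped in for the hash function so that the partition of a key stays the same between runs.

```python
import zlib
from collections import defaultdict

def partition(key, r):
    # Stand-in for hash(key) mod R; crc32 gives the same result on every run.
    return zlib.crc32(key.encode("utf-8")) % r

def run_job(splits, map_fn, reduce_fn, r):
    # Map phase: each of the M splits writes its output into R local buckets.
    mapper_outputs = []
    for split in splits:
        buckets = [[] for _ in range(r)]
        for key, value in map_fn(split):
            buckets[partition(key, r)].append((key, value))
        mapper_outputs.append(buckets)

    # Reduce phase: reducer i fetches bucket i from every mapper, sorts the
    # pairs by key, then calls reduce_fn once per distinct key.
    results = []
    for i in range(r):
        fetched = sorted(pair for buckets in mapper_outputs for pair in buckets[i])
        grouped = defaultdict(list)
        for key, value in fetched:
            grouped[key].append(value)
        results.append(dict(reduce_fn(k, vs) for k, vs in grouped.items()))
    return results

splits = ["a b a", "b c", "c a"]                      # M = 3
parts = run_job(splits,
                lambda s: [(w, 1) for w in s.split()],
                lambda k, vs: (k, sum(vs)),
                r=2)                                  # R = 2
print(parts)
```

Each element of parts corresponds to one Reduce partition result file, and every key ends up in exactly one of them.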

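Columnar storage

What is columnar storage

Traditional transactional databases usually use row-oriented storage. As the figure illustrates, all the columns of a record are laid out one after another to form a row, and data is stored row by row. Combined with a B+ tree index, this makes it fast to locate a row by its primary key.

Row-oriented storage is a natural fit for OLTP (online transaction processing): most operations work on a single entity, that is, they insert, delete, update, or read an entire row, so storing a row's data in physically adjacent locations is a good choice.

For OLAP (online analytical processing), however, a typical query scans the entire table and performs grouping, sorting, and aggregation, so the advantage of storing by row disappears. Worse, analytical SQL often uses only a few columns of interest rather than all of them, yet the irrelevant columns in each row still have to be scanned.

Columnar storage is designed for exactly this need. As the figure illustrates, the values of one column are stored contiguously, one after another, so each column of the table forms one long array.

Columnar storage is clearly unfriendly to OLTP, because writing one row means modifying several columns at once. For OLAP, though, it has major advantages:

When a query touches only some of the columns, only those columns need to be scanned.

Values within a column all have the same type and are more strongly correlated with each other, so column data compresses efficiently.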
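
Both advantages can be seen on a toy table. The Python sketch below uses made-up sample rows and hypothetical helper names; run-length encoding is picked here only as one simple example of a compression scheme that benefits from similar adjacent values, not as what any particular system uses.

```python
rows = [
    {"uid": 1, "city": "Beijing", "amount": 30},
    {"uid": 2, "city": "Beijing", "amount": 50},
    {"uid": 3, "city": "Shanghai", "amount": 20},
]

def to_columns(rows):
    # Columnar layout: one contiguous array per column.
    return {name: [row[name] for row in rows] for name in rows[0]}

def run_length_encode(values):
    # Adjacent equal values collapse into [value, count] pairs.
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return encoded

columns = to_columns(rows)
total = sum(columns["amount"])                 # touches only the "amount" array
city_rle = run_length_encode(columns["city"])  # repeated values compress well
print(total, city_rle)
```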

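Columnar storage and distributed file systems

In modern big-data architectures, distributed file systems such as GFS and HDFS have become the mainstream way to store large datasets. Compared with a disk on a single machine, a distributed file system offers multi-replica high availability, large capacity, and low cost, but it also introduces problems that single-machine architectures do not have:

All reads and writes go over the network. Throughput can match or even exceed that of a local disk, but latency is much higher and depends heavily on network conditions.

Large sequential reads and writes achieve high throughput, but random access performs poorly, and most such systems do not support random writes. To offset the network overhead, writes are usually done in units of tens of MB.

These drawbacks are fatal for OLTP workloads, which rely heavily on random reads and writes. That is why many OLAP-oriented columnar stores give up OLTP capability, which in turn lets them be built on top of a distributed file system.

Getting the most out of a distributed file system comes down to a few techniques: reading data in blocks (shards), streaming reads, append-only writes, and so on. The popular open-source columnar storage models described next build these optimizations into the design of the storage format.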

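Columnar storage systems in practice

Apache ORC

Apache ORC was originally developed as a file format to support OLAP queries on Hive, and it is now widely used across the Hadoop ecosystem. ORC supports fields of many types, including common ones such as int and string as well as compound ones such as struct, list, and map. The field metadata is stored at the end of the ORC file, which is why the format is called self-describing.

Data structure and indexes

Building indexes over partitions is a common optimization. ORC organizes data into the following three levels, each of which carries index information to speed up queries:

File level: a single ORC file. The footer holds the metadata of the data as well as index information for the file, such as the minimum and maximum values (range) of each column, the distribution of NULL values, and Bloom filters. This information can quickly determine whether the file contains the data being queried. Each ORC file contains multiple stripes.

Stripe level: corresponds to one range partition of the original table and contains the values of every column within that partition. Each stripe also keeps its own index in its footer, similar to the file-level index.

Row-group level: every 10,000 rows of a column form a row group, and each row group has its own row-level index carrying the same kind of information.

A stripe in ORC is like a page in a traditional database: it is the basic unit of bulk reads and writes for an ORC file. Because read and write latency in a distributed storage system is high, an I/O operation only pays off if it reads a sizable batch of data, which is similar in spirit to reading and writing a disk page by page.

Like many other storage formats, ORC places statistics and metadata at the end of the file and of each stripe rather than at the beginning. ORC adds one more optimization for stripe I/O: the indexes of structures finer-grained than a stripe (such as columns and row groups) are pulled out and placed together at the head of the stripe. Batch computation usually reads and processes a whole stripe at once, so separating out these indexes reduces the I/O needed in batch scenarios, since a batch read can skip that part.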
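
The idea behind the min/max indexes can be sketched independently of the ORC file layout. The Python example below is an illustration only and does not read or write real ORC files: build_row_groups and scan_equal are hypothetical names, and the group size is 10 instead of 10,000 to keep the example small.

```python
def build_row_groups(values, group_size):
    # Split one column into row groups and record min/max statistics for each.
    groups = []
    for start in range(0, len(values), group_size):
        chunk = values[start:start + group_size]
        groups.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return groups

def scan_equal(groups, target):
    # Read only the groups whose [min, max] range can contain the target.
    hits, groups_read = [], 0
    for g in groups:
        if g["min"] <= target <= g["max"]:
            groups_read += 1
            hits.extend(v for v in g["values"] if v == target)
    return hits, groups_read

groups = build_row_groups(list(range(100)), group_size=10)
hits, groups_read = scan_equal(groups, 42)
print(hits, groups_read, "of", len(groups), "row groups read")
```

The same check applies at the stripe and file levels: statistics that exclude the predicate let the reader skip the whole unit without touching its data.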

Dremel (2010) / Apache Parquet

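Dremel is a query system developed by Google for large-scale read-only data. It supports fast ad-hoc queries and makes up for MapReduce's weakness in interactive querying. To avoid copying the data a second time, Dremel queries it in place, typically on a distributed file system such as GFS, and this requires a general-purpose file format.

Dremel's system design offers little that is new compared with most columnar OLAP databases, but its ingenious storage format became popular, and Apache Parquet is its open-source counterpart. Note that Parquet, like ORC, is a storage format rather than a complete system.

Nested data model

Google makes heavy internal use of Protobuf as a cross-platform, cross-language data serialization format. Compared with JSON it is more compact and more expressive. Protobuf lets users define required and optional fields, and also repeated fields, meaning the field can occur 0 to N times, much like a variable-length array.

The Dremel format is designed to store Protobuf data by column. Because of repeated fields, this is harder than storing relational data by column. The obvious approach would be to mark the end of each repetition with a terminator, but since the data may be very sparse, Dremel introduces a more compact format.

As an example, the left half of the figure shows the schema and two Document instances, and the right half shows the individual columns after serialization. Each serialized column gains two extra columns, R and D, which stand for Repetition Level and Definition Level. These two values are enough to deserialize the original data unambiguously.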

Repetition Level records the level at which the current value repeats. A field that is not repeated simply gets the trivial value 0. Otherwise, whenever the field can repeat (whether it is repeated itself or is nested inside a repeated structure), R records the level at which the current value repeats.

For example, the Name.Language.Code column has three non-NULL records in total.

The first is en-us, which appears in the first Code of the first Language of the first Name. None of these three elements has repeated before; each appears for the first time. So R=0.

The second is en, which appears in the next Language. In other words, Language is the element that repeats. Language is the second component of Name.Language.Code, so R=2.

The third is en-gb, which appears in the next Name. Name is the element that repeats, and it is the first component of the path, so R=1.

Note that en-gb belongs to the third Name, not the second. To express this, a NULL with R=1 is placed between en and en-gb.

Definition Level records the level at which a NULL is defined, which also declares that the repetition at that level ends there. A non-NULL field simply gets the trivial value, namely the level of the field itself.

Again as an example, consider the Name.Language.Country column:

us: a non-NULL value, so it gets the level of the Country field, D=3.

NULL in the first Document (r1): within the current Name, none of the remaining Language entries contains a Country field, so D=2.

NULL in the first Document (r1): within the current Document, the following Name contains no Country field at all, so D=1.

gb: a non-NULL value, so it gets the level of the Country field, D=3.

NULL in the second Document (r2): its only Name contains no Language and therefore no Country field. Only Name is defined on the path, so D=1.

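It can be shown that the R and D values together always suffice to reconstruct the original data uniquely. For efficient encoding and decoding, Dremel first builds a state machine at execution time and then uses it to process the column data. The state machine also takes the query and the structure of the data into account to skip irrelevant data directly.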
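
To make the two levels concrete, the following Python sketch computes (value, R, D) triples for the columns Name.Language.Code and Name.Language.Country. It is hard-wired to this one path rather than being a general column-striping algorithm, the function name stripe is hypothetical, and the two sample records follow the example in the Dremel paper with the unrelated fields (DocId, Links, Url) left out. Code is a required field and so adds nothing to the definition level, which is why its level is passed as 2, while the optional Country is passed as 3.

```python
def stripe(documents, field, field_def_level):
    """Return (value, R, D) triples for the column Name.Language.<field>."""
    column = []
    for doc in documents:
        r = 0                                   # a new record always starts at R=0
        names = doc.get("Name", [])
        if not names:
            column.append((None, r, 0))         # nothing on the path is defined
            continue
        for name in names:
            languages = name.get("Language", [])
            if not languages:
                column.append((None, r, 1))     # only Name is defined
            for lang in languages:
                if field in lang:
                    column.append((lang[field], r, field_def_level))
                else:
                    column.append((None, r, 2)) # Name and Language are defined
                r = 2                           # the next value repeats at Language
            r = 1                               # the next value repeats at Name
    return column

documents = [
    {"Name": [
        {"Language": [{"Code": "en-us", "Country": "us"}, {"Code": "en"}]},
        {},                                     # a Name with no Language
        {"Language": [{"Code": "en-gb", "Country": "gb"}]},
    ]},
    {"Name": [{}]},                             # second record: one Name, no Language
]

code_column = stripe(documents, "Code", 2)
country_column = stripe(documents, "Country", 3)
for triple in country_column:
    print(triple)
```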

Hive

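Hive is an OLAP data warehouse built on the Hadoop ecosystem: it uses HDFS for storage, MapReduce as the compute engine, and YARN for resource allocation and fault tolerance. In more detail:

Hive is a SQL parsing engine. It translates SQL statements into MapReduce jobs and runs them on the Hadoop platform, which speeds up development.

Tables in Hive are purely logical: a table is just its definition, that is, its metadata. The data itself is simply directories and files in Hadoop, which keeps metadata separate from data storage.

Hive does not store data itself; it depends entirely on HDFS and MapReduce.

Hive workloads are read-heavy and write-light. By default, Hive does not support update or delete on data.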

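Hive architecture

SQL is submitted to Hive from an external CLI, the Hive Thrift Server, or the Web UI, and is passed to the Driver. The Driver compiles the SQL into a MapReduce execution plan, applies logical and physical optimization, and submits the plan to MapReduce for execution. Any data that needs to be written goes to files on HDFS, and the corresponding metadata is recorded in the Metastore.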

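Hive storage file formats

All Hive data is stored on HDFS as files, organized into directories, and the characteristics of the storage depend on the file format chosen. When creating a table, Hive lets you specify the storage format with STORED AS (TextFile|RCFile|SequenceFile|AVRO|ORC|Parquet).

The characteristics of each format are as follows.

TextFile: row-oriented. Each line is one record and ends with a newline (\n). The data is not compressed, so both disk overhead and parsing overhead are high. It can be combined with Gzip or Bzip2 (Hive detects this automatically and decompresses when running queries), but data stored this way is not split by Hive and therefore cannot be processed in parallel.

SequenceFile: row-oriented. A binary file format provided by the Hadoop API that is easy to use, splittable, and compressible. It supports three compression options: NONE, RECORD, and BLOCK. RECORD has a low compression ratio, so BLOCK is generally recommended.

RCFile: a hybrid of row and column storage. First, it partitions the data into blocks by row, guaranteeing that a record stays within one block so that reading a record never requires reading multiple blocks. Second, the data inside each block is stored by column, which helps compression and fast column access.

AVRO: an open-source project that provides data serialization and data exchange services for Hadoop. It lets programs written in any programming language exchange data with the Hadoop ecosystem. Avro is one of the popular file formats in Hadoop-based big-data applications.

ORC: columnar. Using ORC files improves performance when Hive reads, writes, and processes data from large tables.

Parquet: columnar. A column-oriented binary file format; the files cannot be read directly.

For read-heavy, write-light workloads that use only Hive, prefer ORC for better performance.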

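HQL compilation process

Hive compiles Hive SQL into MapReduce. Before looking at how SQL is compiled into MapReduce, consider how the MapReduce framework implements basic SQL operations.

How MapReduce implements basic SQL operations

How Join is implemented

select u.name, o.orderid from order o join user u on o.uid = u.uid;

In the map output value, each row is tagged with the table it comes from, and the reduce phase uses the tag to determine the source of the data. The figure illustrates the MapReduce process (it shows only the most basic Join implementation; other implementations exist).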
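
The tagging idea can be sketched in Python as follows. This is a single-process illustration of a reduce-side join, not Hive's actual implementation: the sample rows, the tag strings "order" and "user", and the function names are all assumptions made for the example.

```python
from collections import defaultdict

orders = [{"orderid": 101, "uid": 1}, {"orderid": 102, "uid": 2}, {"orderid": 103, "uid": 1}]
users = [{"uid": 1, "name": "alice"}, {"uid": 2, "name": "bob"}]

def map_phase(orders, users):
    # Key = join column; value = (tag, payload). The tag records the source table.
    for o in orders:
        yield o["uid"], ("order", o["orderid"])
    for u in users:
        yield u["uid"], ("user", u["name"])

def reduce_phase(uid, tagged_values):
    # Separate the values by tag, then emit every (name, orderid) combination.
    names = [v for tag, v in tagged_values if tag == "user"]
    order_ids = [v for tag, v in tagged_values if tag == "order"]
    return [(name, oid) for name in names for oid in order_ids]

def join(orders, users):
    groups = defaultdict(list)                  # stand-in for the shuffle by uid
    for key, value in map_phase(orders, users):
        groups[key].append(value)
    result = []
    for uid in sorted(groups):
        result.extend(reduce_phase(uid, groups[uid]))
    return result

joined = join(orders, users)
print(joined)
```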

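How Group By is implemented

select rank, isonline, count(*) from city group by rank, isonline;

The Group By columns are combined to form the map output key. MapReduce's sorting is then exploited: the reduce phase keeps a LastKey to tell where one key ends and the next begins. The figure illustrates the MapReduce process (it shows only the non-hash aggregation on the reduce side).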
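
A minimal Python sketch of this sort-then-LastKey aggregation is shown below. The sample rows are made up, a plain sorted() call stands in for the framework's sort, and the function names are hypothetical.

```python
city = [
    {"rank": "A", "isonline": 1}, {"rank": "B", "isonline": 0},
    {"rank": "A", "isonline": 1}, {"rank": "A", "isonline": 0},
]

def map_phase(rows):
    # The GROUP BY columns together form the map output key.
    return [((row["rank"], row["isonline"]), 1) for row in rows]

def reduce_phase(sorted_pairs):
    # Input arrives sorted by key; a change of key closes the previous group.
    results, last_key, count = [], None, 0
    for key, value in sorted_pairs:
        if last_key is not None and key != last_key:
            results.append((*last_key, count))
            count = 0
        last_key = key
        count += value
    if last_key is not None:
        results.append((*last_key, count))      # flush the final group
    return results

grouped = reduce_phase(sorted(map_phase(city)))
print(grouped)
```

Because the input is sorted, the reducer needs to remember only one key and one counter at a time, however many groups there are.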

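How Distinct is implemented

select dealid, count(distinct uid) num from order group by dealid;

When there is only one distinct column, and ignoring hash-based Group By in the map phase, it is enough to combine the Group By column and the Distinct column into the map output key and rely on MapReduce's sorting, while using the Group By column as the reduce key. Keeping a LastKey in the reduce phase then completes the deduplication.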
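
The same trick can be sketched in Python for the count(distinct uid) query. The rows and function names are assumptions for illustration, and a single reducer is simulated; in a real job the data would be partitioned on dealid alone so that all uids of one deal reach the same reducer.

```python
orders = [
    {"dealid": 1, "uid": "u1"}, {"dealid": 1, "uid": "u2"},
    {"dealid": 1, "uid": "u1"}, {"dealid": 2, "uid": "u3"},
    {"dealid": 2, "uid": "u3"},
]

def map_phase(rows):
    # Map output key = (GROUP BY column, DISTINCT column); no value is needed.
    return [(row["dealid"], row["uid"]) for row in rows]

def reduce_phase(sorted_keys):
    # Keys arrive sorted by (dealid, uid), so duplicates are adjacent.
    # A uid is counted only when the composite key differs from LastKey.
    counts, last_key = {}, None
    for key in sorted_keys:
        if key != last_key:
            dealid = key[0]
            counts[dealid] = counts.get(dealid, 0) + 1
        last_key = key
    return counts

num = reduce_phase(sorted(map_phase(orders)))
print(num)
```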

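How HQL is converted into MapReduce

The main steps in turning HQL into MapReduce execution are as follows.

1. Syntax parsing: HQL is parsed into an AST (abstract syntax tree); the entry point is the ParseDriver.run() method. This step relies on Antlr3 for lexical and syntactic parsing of the SQL. Antlr is not covered in detail here; it is enough to know that building a specific language with Antlr only requires writing a grammar file that defines the lexical and syntactic rewrite rules, and Antlr then handles lexical analysis, syntax analysis, semantic analysis, and intermediate code generation.

2. Semantic analysis, phase one: the AST is still very complex and not structured enough to translate directly into a MapReduce program. Converting the AST into QueryBlocks abstracts and structures the SQL further. A QueryBlock is the most basic unit of a SQL statement and consists of three parts: input source, computation, and output. Put simply, a QueryBlock is a subquery.

3. Query plan generation: the QueryBlocks are converted into an Operator tree. In the MapReduce tasks that Hive ultimately generates, both the map phase and the reduce phase consist of Operator trees. An Operator performs a single, specific operation in either the map phase or the reduce phase.

4. Logical optimization: the generated Operator tree is optimized and operators are merged, in order to reduce the number of MapReduce jobs and the amount of shuffled data.

5. MR task generation: the Operator tree is translated into a directed acyclic graph (DAG) of Tasks.

6. Physical optimization: the Task DAG is rewritten, and some ordinary Tasks are turned into ConditionalTasks that can choose a branch at run time.

7. Execution: the Task DAG is handed to the framework for execution.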
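
The final step, running a DAG of tasks so that each task starts only after the tasks it depends on, can be illustrated with a short Python sketch. This is not Hive's scheduler: execute_dag and the stage names are hypothetical, and the ordering uses Kahn's algorithm as a generic way to walk a dependency graph.

```python
from collections import deque

def execute_dag(dependencies, run_task):
    """Run each task after all the tasks it depends on (Kahn's algorithm)."""
    pending = {task: set(parents) for task, parents in dependencies.items()}
    children = {task: [] for task in dependencies}
    for task, parents in dependencies.items():
        for parent in parents:
            children[parent].append(task)
    # Tasks with no unfinished parents are ready to run.
    ready = deque(sorted(t for t, parents in pending.items() if not parents))
    order = []
    while ready:
        task = ready.popleft()
        run_task(task)
        order.append(task)
        for child in children[task]:
            pending[child].discard(task)
            if not pending[child]:
                ready.append(child)
    if len(order) != len(dependencies):
        raise ValueError("cycle detected: not a DAG")
    return order

# Hypothetical stage names: two independent scans feed a join, then a sort.
dag = {"scan_order": [], "scan_user": [], "join": ["scan_order", "scan_user"], "sort": ["join"]}
order = execute_dag(dag, run_task=lambda name: None)
print(order)
```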

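Summary

Hive is an OLAP data warehouse. It suits offline analytical computation over large volumes of data where low latency is not required, such as report analysis. For workloads with many transactional operations, use an OLTP database such as MySQL.

Hive supports several storage file formats. The default is the row-oriented TextFile, which takes up a lot of space and reads slowly. When using Hive, prefer the ORC format, which compresses well and reads fast.

Hive SQL runs on MapReduce underneath and must be submitted to YARN, which allocates resources to the MapReduce tasks. Submitting many Hive SQL statements at once can put heavy pressure on the master node's task-scheduling resources. If a program needs to issue Hive SQL inserts, use batch-insert SQL or another approach (for example, writing a user-defined function that reads from MySQL).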

 
   