Kudu: A New Hadoop Storage System for Fast Analytics
 
×÷Õߣº XGogo
 
2020-4-26
 
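Editor's recommendation:
This article briefly introduces the motivation, background, and architecture of Kudu. We hope it helps your study.
This article comes from BBSMAX and was edited and recommended by Alice of Huolongguo Software.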

Kudu is a new columnar storage system open-sourced by Cloudera and one of the newest members of the Apache Hadoop ecosystem (incubating). It is built for fast analytics on rapidly changing data and fills a gap in the Hadoop storage layer.

Background: A Functional Gap

The Hadoop ecosystem has many components, each with a different function. In practice, users often have to deploy several Hadoop tools together to solve a single problem, an arrangement known as a hybrid architecture. For example, users rely on HBase's fast inserts and fast random-access reads to ingest data; HBase also lets users modify data and answers large numbers of small queries very quickly. At the same time, users run HDFS/Parquet + Impala/Hive for analytical queries over very large datasets, a scenario in which a columnar file format such as Parquet has a great advantage.

Many companies have successfully deployed the HDFS/Parquet + HBase hybrid architecture, but it is complex and very hard to maintain. First, users load data into HBase with an ingest tool such as Flume or Kafka, and may then modify the data in HBase. Next, at regular intervals (daily or weekly), the data is exported from HBase into Parquet files and placed on HDFS as a new partition. Finally, a compute engine such as Impala queries the data and produces the final report.

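Such a toolchain is tedious and complicated, and it raises many problems, for example:

1. How should a failure in one of the stages be handled?

2. How often should data be exported from HBase to files?

3. When the final report is generated, the most recent data does not appear in the query results.

4. How can critical jobs be kept from failing during cluster maintenance?

5. Parquet is immutable, so when historical data in HBase is updated or deleted, synchronization usually requires manual intervention.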
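Problem 3 in the list above can be made concrete with a toy model. The sketch below is only an illustration under simple assumptions: `HybridPipeline`, `ingest`, `export`, and `report` are hypothetical names, a dict stands in for HBase, and frozen tuples stand in for Parquet partitions. It shows that a report sees only what the last export captured.

```python
# Toy model of the hybrid pipeline (all names are hypothetical).
# A dict plays the role of HBase; frozen tuples play the role of
# immutable Parquet partitions on HDFS.

class HybridPipeline:
    def __init__(self):
        self.hbase = {}        # mutable row store: key -> value
        self.partitions = []   # immutable exported snapshots

    def ingest(self, key, value):
        # New or corrected data always lands in the mutable store first.
        self.hbase[key] = value

    def export(self):
        # Periodic job: freeze the current contents as a new partition.
        self.partitions.append(tuple(sorted(self.hbase.items())))

    def report(self):
        # Reports read exported partitions only, never the live store.
        latest = {}
        for part in self.partitions:
            latest.update(dict(part))
        return latest


pipeline = HybridPipeline()
pipeline.ingest("u1", 10)
pipeline.export()
pipeline.ingest("u2", 20)   # arrives after the export
pipeline.ingest("u1", 11)   # late correction to an existing row
stale = pipeline.report()   # still reflects the state at export time
```

Until `export()` runs again, `stale` contains neither the new row nor the correction, which is the freshness gap the list describes.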

At this point users want an elegant storage solution that can handle different types of workloads while keeping computation fast. Cloudera recognized this problem early: it began planning the Kudu storage system in 2012 and finally released and open-sourced it in 2015. Kudu complements HDFS and HBase. It offers fast analytics and real-time computation, makes full use of CPU and I/O resources, supports in-place data modification, and provides a simple, extensible data model.

Background: New Hardware

RAM technology has advanced quickly: it keeps getting cheaper and larger. Cloudera's customer data shows that the servers its customers deployed had only 32 GB of RAM per node in 2012, and have since grown to 128 GB or 256 GB per node. Storage devices have changed just as fast, and SSDs are now common even in ordinary servers. HBase, HDFS, and the other Hadoop tools keep improving to adapt to newer hardware. Fundamentally, however, HDFS is based on GFS, published in 2003, and HBase on Bigtable, whose paper appeared in 2006, and at that time the system bottleneck was mainly the speed of the underlying disks. When disks are slow, low CPU utilization is fundamentally caused by the disk bottleneck; once disks get faster, CPU utilization rises and the CPU often becomes the bottleneck instead. HBase and HDFS are old enough that their basic architecture is hard to change, whereas Kudu is a brand-new design and can therefore make fuller use of RAM and I/O resources and optimize CPU usage. Put simply, compared with earlier systems, Kudu uses less CPU, uses more I/O, and makes fuller use of RAM.

Overview

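Kudu was originally designed to meet the following goals:

1. High performance for both data scans and random access, simplifying users' complex hybrid architectures

2. High CPU efficiency, so that users get the most out of the money they spend on advanced processors

3. High I/O performance, making full use of advanced storage media

4. In-place data updates, avoiding extra data processing and data movement

5. Cross-datacenter replication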

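Kudu resembles HBase in many ways: it supports lookups and modifications by an indexed key. Cloudera considered building on HBase, but concluded that the required changes would be very large, because Kudu's data model and on-disk storage both differ from HBase's. HBase itself already serves many other scenarios well, so modifying it would probably have been a thankless task. In the end Cloudera decided to develop an entirely new storage system.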

Kudu's positioning is "fast analytics on fast data", that is, fast queries over rapidly updated data. It targets OLAP workloads plus a small amount of OLTP. For workloads with heavy random access, the official recommendation is still to use HBase.

Architecture and Design

1. Basic Framework

Kudu stores structured tables. A table has predefined, typed columns, and every table has a primary key. The primary key carries a uniqueness constraint and serves as an index that supports fast random access.
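The table model described above can be sketched in a few lines. This is a minimal in-memory illustration, not Kudu's client API: the `Table` class, its methods, and the `tweets` columns are hypothetical names chosen for the example.

```python
# Minimal sketch of a typed table with a unique primary key.
# `Table`, `insert`, and `get` are hypothetical names, not Kudu's API.

class Table:
    def __init__(self, schema, primary_key):
        self.schema = schema            # column name -> Python type
        self.primary_key = primary_key  # name of the key column
        self.rows = {}                  # primary-key value -> row dict

    def insert(self, row):
        # Enforce the predefined column types.
        for name, col_type in self.schema.items():
            if not isinstance(row[name], col_type):
                raise TypeError(f"column {name!r} expects {col_type.__name__}")
        # Enforce the uniqueness constraint on the primary key.
        key = row[self.primary_key]
        if key in self.rows:
            raise KeyError(f"duplicate primary key {key!r}")
        self.rows[key] = dict(row)

    def get(self, key):
        # The primary key acts as an index: one dictionary lookup.
        return self.rows.get(key)


tweets = Table({"tweet_id": int, "user_name": str, "text": str}, "tweet_id")
tweets.insert({"tweet_id": 1, "user_name": "newsycbot", "text": "hello"})
```

A second insert with `tweet_id` 1 raises `KeyError`, and a row whose `user_name` is not a string raises `TypeError`.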

As in Bigtable, a Kudu table is made up of many subsets of the data: the table is split horizontally into multiple tablets. Kudu uses the tablet as the unit of data durability. Each tablet has several replicas, which are persisted on multiple nodes at the same time.
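Horizontal splitting can be illustrated as follows. The article does not say how rows are assigned to tablets, so hashing the primary key is an assumption made for this sketch; `tablet_for` and `partition` are hypothetical helper names.

```python
import zlib

# Sketch of splitting a table horizontally into tablets.
# Hash partitioning is an assumption of this example; the article
# does not describe the actual assignment rule.

def tablet_for(primary_key: str, num_tablets: int) -> int:
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(primary_key.encode("utf-8")) % num_tablets


def partition(rows, num_tablets):
    # Each row lands in exactly one tablet, chosen by its key.
    tablets = [[] for _ in range(num_tablets)]
    for row in rows:
        tablets[tablet_for(row["key"], num_tablets)].append(row)
    return tablets


rows = [{"key": f"user-{i}", "value": i} for i in range(20)]
tablets = partition(rows, 4)
```

Every row ends up in exactly one of the four lists, and the same key always maps to the same tablet.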

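Kudu has two types of components, the Master Server and the Tablet Server. The Master manages metadata, which includes basic information about each tablet and its location. The Master also acts as a load balancer and monitors the health of the Tablet Servers. When a tablet has too few replicas, the Master starts a replication task to raise the replica count. All of the Master's information is cached in memory, so it is very fast; every lookup completes on the order of a hundred milliseconds. Kudu supports multiple Masters, but only one is active; the others exist purely for disaster recovery and do not serve requests.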

A Tablet Server holds 10 to 100 tablets. Each tablet has 3 (or 5) replicas stored on different Tablet Servers, and at any time only one replica of a tablet is the leader. The leader accepts modifications from users and then synchronizes the results to the followers. Followers serve reads only, not modifications. The replicas use the Raft protocol to achieve high availability: when the node hosting the leader fails, the followers elect a new leader. According to official figures the MTTR is about 5 seconds, so clients are barely affected. Raft's other role is to provide consistency. A client's modification on the leader must be replicated to N/2+1 nodes, a majority of the N replicas, before the operation counts as successful.
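The N/2+1 rule above is ordinary majority arithmetic with integer division. The helper names below are hypothetical and the code is only a worked example of the rule, not part of any Raft implementation.

```python
# Worked example of the majority rule: a write succeeds once
# N // 2 + 1 of the N replicas have acknowledged it.
# All function names here are hypothetical.

def majority(num_replicas: int) -> int:
    return num_replicas // 2 + 1


def write_committed(acks: int, num_replicas: int) -> bool:
    # `acks` counts replicas (leader included) that hold the write.
    return acks >= majority(num_replicas)


def tolerated_failures(num_replicas: int) -> int:
    # Replicas that may be down while a majority is still reachable.
    return num_replicas - majority(num_replicas)
```

With 3 replicas a write needs 2 acknowledgements and one replica may fail; with 5 replicas it needs 3 and two may fail.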

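Kudu works in a way similar to a log-structured storage system: inserts, updates, and deletes first go into an in-memory buffer and are only later merged into the persistent columnar store. Kudu also uses write-ahead logs (WALs) to protect the in-memory buffer against loss.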
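The write path just described (log first, buffer in memory, merge into column storage later) can be sketched roughly as below. `BufferedStore` and its methods are hypothetical, Python lists stand in for files on disk, and every row is assumed to be a dict with the same columns including an `id`.

```python
# Rough sketch of a buffered write path protected by a write-ahead log.
# Names are hypothetical; lists stand in for on-disk files.

class BufferedStore:
    def __init__(self, flush_threshold=3):
        self.wal = []        # append-only log of buffered writes
        self.buffer = {}     # in-memory buffer: row id -> row
        self.columns = {}    # "persistent" store: column name -> values
        self.flush_threshold = flush_threshold

    def write(self, row):
        self.wal.append(dict(row))          # log before buffering
        self.buffer[row["id"]] = dict(row)
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Merge buffered rows into column-wise storage, ordered by id.
        for row_id in sorted(self.buffer):
            for column, value in self.buffer[row_id].items():
                self.columns.setdefault(column, []).append(value)
        self.buffer.clear()
        self.wal.clear()    # flushed rows no longer need the log

    def recover(self):
        # After a crash the buffer is gone; replay the log to rebuild it.
        self.buffer = {row["id"]: dict(row) for row in self.wal}


store = BufferedStore(flush_threshold=3)
store.write({"id": 1, "user_name": "newsycbot"})
store.write({"id": 2, "user_name": "alice"})
store.buffer = {}    # simulate losing the in-memory buffer
store.recover()      # both rows come back from the log
```

Writing a third row after recovery reaches the threshold and moves all three rows into `store.columns`.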

2. Columnar Storage

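The persistent columnar store is completely different from HBase's. Instead it takes an approach similar to Parquet: each column is stored on disk as one contiguous block. For example, the left side of the figure shows a table in which Twitter stores tweets, and the right side shows how the table is laid out on disk, with the values of each column stored together. The first benefit is that aggregation and join queries touch the disk as little as possible. For example, to count the tweets from the user named newsycbot, we use the query: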

SELECT COUNT(*) FROM tweets WHERE user_name = 'newsycbot';

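We only need to read the user_name block. Because the data in one column is stored together and shares the same format, Kudu can encode it, for example with dictionary encoding, run-length encoding, or bitshuffle. This greatly reduces the size of the data on disk and raises throughput. In addition, users can choose to compress the data with a general-purpose compression format such as LZ4, gzip, or bzip2. This is optional, and users can trade data size against CPU efficiency according to their workload. For this part of the implementation, Kudu borrowed heavily from Parquet's code.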
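To make the two named encodings concrete, here is a small sketch of dictionary encoding and run-length encoding applied to a single column. It illustrates the general techniques only, not Kudu's on-disk format; the function names and the sample `user_name` values are made up for the example.

```python
# Dictionary encoding and run-length encoding on one column.
# Generic illustrations, not Kudu's actual block format.

def dictionary_encode(values):
    # Replace each distinct value with a small integer code.
    dictionary = {}
    codes = []
    for value in values:
        codes.append(dictionary.setdefault(value, len(dictionary)))
    return dictionary, codes


def run_length_encode(values):
    # Collapse consecutive repeats into (value, run length) pairs.
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


# Only the user_name column is needed to answer the COUNT(*) query.
user_name = ["newsycbot", "newsycbot", "alice", "newsycbot", "alice", "alice"]
dictionary, codes = dictionary_encode(user_name)
count = codes.count(dictionary["newsycbot"])   # 3 matching tweets
runs = run_length_encode(codes)
```

The count is computed from the small integer codes alone, without reading any other column of the table.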

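HBase supports Snappy compression, but its LSM storage layout makes it hard to apply specialized encodings to the data; the ability to apply such encodings is an important reason why Kudu claims very fast scans. However, column-encoded data is hard to modify, so once this data has been written to disk it is immutable; this portion is called the base data. Kudu uses MVCC (multi-version concurrency control) to implement updates and deletes. Update and delete operations are recorded in special data structures, kept either in the in-memory DeltaMemStore or in DeltaFiles on disk. The DeltaMemStore is implemented as a B-Tree, so it is fast and can be modified. DeltaFiles on disk are binary columnar blocks and, like the base data, cannot be modified. As a result, frequent updates and deletes produce a large number of DeltaFiles on disk, and Kudu, following HBase's approach, periodically merges these files.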
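The interplay of immutable base data, deltas, and periodic merging can be sketched as below. This is a simplified model under stated assumptions: `DeltaTablet` and its methods are hypothetical, a plain list replaces the B-Tree, a counter replaces real timestamps, and `None` marks a delete.

```python
# Simplified model of base data plus deltas with timestamped reads.
# Names are hypothetical; a list stands in for the B-Tree and a
# counter stands in for MVCC timestamps.

class DeltaTablet:
    def __init__(self, base_rows):
        self.base = dict(base_rows)  # never modified in place
        self.delta_mem = []          # (timestamp, key, value); None = delete
        self.delta_files = []        # flushed, immutable delta batches
        self.clock = 0

    def _record(self, key, value):
        self.clock += 1
        self.delta_mem.append((self.clock, key, value))
        return self.clock

    def update(self, key, value):
        return self._record(key, value)

    def delete(self, key):
        return self._record(key, None)

    def flush_deltas(self):
        # Freeze the in-memory deltas as one immutable batch.
        if self.delta_mem:
            self.delta_files.append(tuple(self.delta_mem))
            self.delta_mem = []

    def read(self, key, as_of=None):
        # Start from the base value, then apply deltas up to `as_of`.
        as_of = self.clock if as_of is None else as_of
        value = self.base.get(key)
        for batch in self.delta_files + [tuple(self.delta_mem)]:
            for ts, k, v in batch:
                if k == key and ts <= as_of:
                    value = v
        return value

    def compact(self):
        # Fold every delta into a new base; older versions are dropped.
        self.flush_deltas()
        new_base = dict(self.base)
        for batch in self.delta_files:
            for _, k, v in batch:
                if v is None:
                    new_base.pop(k, None)
                else:
                    new_base[k] = v
        self.base = new_base
        self.delta_files = []
```

A read with an older `as_of` value sees the row as it was at that point, and `compact()` trades that history for a single merged base, which is the reason for merging files periodically.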

3. External Interfaces

Kudu provides C++ and Java APIs for reading and writing data one row at a time or in batches, and for creating and altering schemas. Beyond that, Kudu will be integrated with other tools in the Hadoop ecosystem. At present, the Kudu beta has fairly complete Impala support: Impala can be used for most operations, such as creating tables and updating or deleting data. Kudu also implements KuduTableInputFormat and KuduTableOutputFormat to support reads and writes from MapReduce, with support for data locality. Spark support is not yet complete; Spark can only read data.

Use Case: Xiaomi

Xiaomi is a heavy HBase user, with about 5 billion user records per day. Xiaomi currently also runs an HDFS + HBase hybrid architecture. As can be seen, the pipeline is fairly complex, with data stored in SequenceFile, HBase, and Parquet.

After adopting Kudu, Kudu serves as a unified data warehouse that supports both offline analysis and real-time interactive analysis.

Performance Tests

1. Comparison with Parquet

The figure shows the official TPC-H results run with Impala, comparing the query speed of Parquet and Kudu. Kudu and Parquet perform about the same, and some queries are even faster on Kudu. However, all of this data had been cached in memory, so the results are of little value as a reference.

2. Comparison with HBase

The figure shows another set of official results. For scans and range queries, Kudu and Parquet are much faster than HBase, while random access is slightly slower than HBase. However, the dataset has only 6 billion rows, so it is quite possible that this data also fit entirely in memory. For in-memory queries, Kudu is generally faster than HBase except for random access.

3. Query Performance on Very Large Datasets

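Kudu is not positioned as an in-memory database. It targets the kind of storage that HDFS/Parquet provides, so most of the data lives on disk. If we want it to replace HDFS/Parquet + HBase, query performance on very large datasets is critical, and that was Kudu's original purpose. However, no official figures are available on this point. Because of resource constraints, NetEase has not yet been able to run this test. As a next step, we plan to set up 10 Kudu + Impala servers and generate a very large dataset with TPC-DS to complete the comparison.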

 

 

 
   