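Editor's recommendation:
This article gives a brief introduction to the motivation, background, and architecture of Kudu. We hope it is helpful to your study.
This article comes from BBSMAX and was edited and recommended by Alice of 火龙果软件.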
Kudu is a new columnar storage system open-sourced by Cloudera and one of the newest members of the Apache Hadoop ecosystem (incubating). It is designed for fast analytics on rapidly changing data, filling a gap in the existing Hadoop storage layer.
Background: a functional gap
The Hadoop ecosystem has many components, each with a different function. In real-world scenarios, users often need to deploy several Hadoop tools together to solve a single problem; this is called a hybrid architecture. For example, users rely on HBase's fast inserts and fast random-access reads to ingest data. HBase also lets users modify data, and it handles large numbers of small queries quickly. At the same time, users run HDFS/Parquet + Impala/Hive for analytical queries over very large datasets, a scenario in which a columnar file format such as Parquet has a great advantage.
Many companies have successfully deployed the HDFS/Parquet + HBase hybrid architecture, but it is complex and hard to maintain. First, users load data into HBase with an ingest tool such as Flume or Kafka, and may then modify some of that data in HBase. Next, at regular intervals (daily or weekly), the data is exported from HBase into Parquet files and placed on HDFS as a new partition. Finally, a compute engine such as Impala queries the data to produce the final report.

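Such a tool chain is tedious and complex, and it also raises many problems, for example:
1. How should a failure in any one stage be handled?
2. How often should data be exported from HBase to files?
3. When the final report is generated, the most recent data is not reflected in the query results.
4. How can critical jobs be kept from failing during cluster maintenance?
5. Parquet is immutable, so when historical data is deleted or modified in HBase, manual intervention is often needed to synchronize the two.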
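The staleness and synchronization problems in the list above can be illustrated with a toy model. This is only a sketch: `HybridPipeline` and its methods are hypothetical names, a dict stands in for HBase, and a list of frozen snapshots stands in for Parquet partitions on HDFS.

```python
# Toy model of the hybrid pipeline: a mutable store plus periodic immutable exports.
# HybridPipeline and its methods are hypothetical names used for illustration only.

class HybridPipeline:
    def __init__(self):
        self.mutable_store = {}   # stands in for HBase: row key -> value
        self.partitions = []      # stands in for immutable Parquet partitions on HDFS

    def ingest(self, key, value):
        """Insert or update a row in the mutable store."""
        self.mutable_store[key] = value

    def export(self):
        """Snapshot the mutable store into a new immutable partition."""
        self.partitions.append(tuple(sorted(self.mutable_store.items())))

    def report(self):
        """Analytical query: sees only the most recently exported partition."""
        return dict(self.partitions[-1]) if self.partitions else {}


pipeline = HybridPipeline()
pipeline.ingest("row1", "a")
pipeline.export()                # the periodic export runs
pipeline.ingest("row2", "b")     # new row arrives after the export
pipeline.ingest("row1", "a2")    # historical row is modified after the export

stale = pipeline.report()
print(stale)                     # lacks row2 and still holds the old value of row1
```

Until the next export runs, the report cannot see either change, which is the gap a single store with in-place updates is meant to close.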
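At this point, users want an elegant storage solution that can handle different types of workloads while maintaining high computing performance. Cloudera recognized this problem early, began planning the Kudu storage system in 2012, and finally released and open-sourced it in 2015. Kudu complements HDFS and HBase in functionality. It provides fast analytics and real-time computation, makes full use of CPU and I/O resources, supports in-place data modification, and offers a simple, extensible data model.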
Background: new hardware
RAM technology has advanced quickly: it keeps getting cheaper and larger. Cloudera's customer data shows that the servers its customers deployed had only 32 GB of RAM per node in 2012, whereas today each node has 128 GB or 256 GB. Storage devices are also evolving fast, and SSDs are now common even in commodity servers.

HBase, HDFS, and other Hadoop tools keep improving to adapt to these hardware upgrades. Fundamentally, however, HDFS is based on GFS from 2003 and HBase on BigTable from 2005, and at that time the system bottleneck was mainly the speed of the underlying disks. When disks are slow, the CPU is underutilized because it is waiting on the disk; once disks become faster, CPU utilization rises and the CPU often becomes the bottleneck. Because of their age, HBase and HDFS are hard to change at the level of their basic architecture, whereas Kudu is a new design that can make fuller use of RAM and I/O resources and optimize CPU utilization. In other words, compared with earlier systems, Kudu uses less CPU, uses more I/O, and makes fuller use of RAM.
Introduction
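Kudu was originally designed to solve the following problems:
1. Deliver high performance for both data scans and random access, simplifying users' complex hybrid architectures
2. Achieve high CPU efficiency, so that users get the best return on their investment in modern processors
3. Achieve high I/O performance, making full use of modern storage media
4. Support in-place data updates, avoiding extra data processing and data movement
5. Support cross-datacenter replication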
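Many of Kudu's features resemble HBase's: it supports lookups and modifications by indexed key. Cloudera once considered building on HBase, but concluded that the required changes would be very large, since Kudu's data model and on-disk storage both differ from HBase's. HBase already serves many other scenarios well, so modifying it would likely have been a thankless task. In the end, Cloudera decided to develop an entirely new storage system.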

Kudu is positioned to provide "fast analytics on fast data", that is, fast queries over rapidly updated data. It targets OLAP workloads plus a small amount of OLTP. For workloads with a large volume of random accesses, the official recommendation is still to use HBase.
Architecture and design
1. Basic framework
Kudu stores structured tables. A table has predefined, typed columns, and every table has a primary key. The primary key carries a uniqueness constraint and serves as an index that supports fast random access.
Like BigTable, a Kudu table is made up of many subsets of data: the table is horizontally partitioned into multiple tablets. Kudu uses the tablet as the unit of data durability. Each tablet has multiple replicas, which are persisted on multiple nodes at the same time.
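As a rough illustration of a table with a unique primary key that is split horizontally into tablets, here is a minimal sketch. The text does not say how rows are assigned to tablets, so hash partitioning, the tablet count, and the names `Table` and `tablet_for` are all assumptions made for this example; replication is left out.

```python
import zlib

NUM_TABLETS = 4  # assumed tablet count for this sketch


def tablet_for(primary_key: str, num_tablets: int = NUM_TABLETS) -> int:
    """Map a primary key to a tablet index (hash partitioning is an assumption)."""
    return zlib.crc32(primary_key.encode("utf-8")) % num_tablets


class Table:
    """Hypothetical table that is horizontally partitioned into tablets."""

    def __init__(self, num_tablets: int = NUM_TABLETS):
        # Each tablet is a dict keyed by primary key, which makes keys unique.
        self.tablets = [dict() for _ in range(num_tablets)]

    def insert(self, primary_key: str, row: dict) -> None:
        tablet = self.tablets[tablet_for(primary_key, len(self.tablets))]
        if primary_key in tablet:
            raise KeyError(f"duplicate primary key: {primary_key}")
        tablet[primary_key] = row

    def get(self, primary_key: str):
        # Random access touches only the single tablet that owns the key.
        return self.tablets[tablet_for(primary_key, len(self.tablets))].get(primary_key)


table = Table()
for i in range(10):
    table.insert(f"user{i}", {"score": i})
print(table.get("user7"))
print([len(t) for t in table.tablets])  # how the rows are spread over tablets
```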
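Kudu has two types of components: the Master Server and the Tablet Server. The Master manages metadata, which includes basic information about each tablet and its location. The Master also acts as a load balancer and monitors the health of the Tablet Servers. When a tablet has too few replicas, the Master starts a replication task to raise its replica count. All of the Master's information is cached in memory, so it is very fast: each lookup completes on the order of a hundred milliseconds. Kudu supports multiple Masters, but only one is active; the rest serve only as standbys for disaster recovery and do not serve requests.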
A Tablet Server holds 10 to 100 tablets. Each tablet has 3 (or 5) replicas stored on different Tablet Servers, and at any time exactly one of them is the leader. The leader accepts writes from clients and then synchronizes the results to the followers. Followers serve reads only, not writes. The replicas use the Raft protocol to achieve high availability: when the node hosting the leader fails, the followers elect a new leader. According to official figures, the MTTR is about 5 seconds, so clients are barely affected. Raft also provides consistency: a client's write to the leader succeeds only after it has been replicated to N/2+1 nodes.
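The N/2+1 rule can be worked through with a few lines of arithmetic. The sketch below assumes integer division and assumes that the leader's own copy counts as one acknowledgement; the function names are illustrative and do not come from Kudu.

```python
def majority(n_replicas: int) -> int:
    """Smallest number of replicas that forms a majority: N/2 + 1 (integer division)."""
    return n_replicas // 2 + 1


def tolerated_failures(n_replicas: int) -> int:
    """Replicas that can be lost while a majority is still reachable."""
    return n_replicas - majority(n_replicas)


def write_succeeds(acks: int, n_replicas: int) -> bool:
    """A write commits once a majority has it (acks includes the leader, by assumption)."""
    return acks >= majority(n_replicas)


for n in (3, 5):
    print(f"replicas={n} majority={majority(n)} tolerated_failures={tolerated_failures(n)}")
```

With 3 replicas a write needs 2 copies and one node may fail; with 5 replicas it needs 3 copies and two nodes may fail.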

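Kudu takes an approach similar to log-structured storage systems: inserts, updates, and deletes first go into an in-memory buffer and are only later merged into the persistent columnar storage. Kudu also uses write-ahead logs (WALs) to protect the in-memory buffer against failure.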
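The write path described above can be sketched as a log append, a buffer update, and an occasional flush into per-column lists. This is a simplified model under stated assumptions: `TabletStore` is a hypothetical name, the WAL is an in-memory list rather than a file, every row has the same columns, and only inserts are modelled.

```python
class TabletStore:
    """Hypothetical write path: WAL append, in-memory buffer, flush to columns."""

    def __init__(self, flush_threshold: int = 3):
        self.wal = []        # append-only log; a real WAL would live on disk
        self.buffer = {}     # in-memory buffer: key -> row
        self.columns = {}    # stands in for persistent columnar storage
        self.flush_threshold = flush_threshold

    def write(self, key, row):
        self.wal.append((key, row))   # log the operation before buffering it
        self.buffer[key] = row
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Merge buffered rows into the columnar store, one list per column."""
        for key in sorted(self.buffer):
            self.columns.setdefault("key", []).append(key)
            for name, value in self.buffer[key].items():
                self.columns.setdefault(name, []).append(value)
        self.buffer.clear()
        self.wal.clear()              # flushed entries no longer need the log

    def recover(self):
        """Rebuild the in-memory buffer from the WAL after a crash."""
        self.buffer = {key: row for key, row in self.wal}


store = TabletStore()
store.write(2, {"user": "bob"})
store.write(1, {"user": "alice"})
store.buffer.clear()                  # simulate losing memory in a crash
store.recover()                       # the WAL restores the unflushed rows
store.write(3, {"user": "carol"})     # reaches the threshold and triggers a flush
print(store.columns)
```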
2. Columnar storage
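The persistent columnar storage is completely different from HBase's and instead resembles Parquet: each column is stored on disk as a contiguous block. For example, the left side of the figure shows a table in which Twitter stores tweets, and the right side shows how the table is laid out on disk, with the values of each column stored together. The first benefit is that for some aggregation and join queries, disk access can be reduced as much as possible. For example, to count the tweets from the user named newsycbot, we use the query: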
SELECT COUNT(*)
FROM tweets WHERE user_name = 'newsycbot';

We only need to read the user_name block. Because the data in a column is stored together and shares one format, Kudu can encode it, for example with dictionary encoding, run-length encoding, or bitshuffle. This greatly reduces the size of the data on disk and raises throughput. In addition, users can optionally compress the data with a general-purpose codec such as LZ4, gzip, or bzip2, trading off data size against CPU efficiency according to their workload. For this part of the implementation, Kudu borrowed heavily from Parquet's code.
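To make the column layout and two of the encodings concrete, here is a small sketch. The sample rows and the helper names `dictionary_encode` and `run_length_encode` are invented for illustration; they show the general techniques, not Kudu's actual on-disk format.

```python
from itertools import groupby

# Sample rows, invented for this sketch.
rows = [
    {"tweet_id": 1, "user_name": "newsycbot", "text": "first"},
    {"tweet_id": 2, "user_name": "newsycbot", "text": "second"},
    {"tweet_id": 3, "user_name": "alice", "text": "third"},
    {"tweet_id": 4, "user_name": "alice", "text": "fourth"},
    {"tweet_id": 5, "user_name": "newsycbot", "text": "fifth"},
]

# Column layout: all values of one column are stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# The COUNT(*) query only has to read the user_name column.
count = sum(1 for name in columns["user_name"] if name == "newsycbot")


def dictionary_encode(values):
    """Replace each value with a small integer code plus a lookup dictionary."""
    dictionary = {}
    codes = [dictionary.setdefault(v, len(dictionary)) for v in values]
    return list(dictionary), codes   # list index == code


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


dictionary, codes = dictionary_encode(columns["user_name"])
runs = run_length_encode(columns["user_name"])
print(count, dictionary, codes, runs)
```

Both encodings benefit from a column holding values of one type with many repeats, which is why they fit a columnar layout better than a row-oriented one.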

HBase supports snappy compression, but its LSM-based storage layout makes it hard to apply specialized encodings to the data; this is an important reason Kudu claims very fast scans. However, column-encoded data is hard to modify, so once this data is written to disk it is immutable; this part is called the base data. Kudu uses MVCC (multi-version concurrency control) to implement updates and deletes. Update and delete operations are recorded in special data structures, held in the in-memory DeltaMemStore or in DeltaFiles on disk. The DeltaMemStore is implemented as a B-Tree, so it is fast and mutable. An on-disk DeltaFile is a binary columnar block and, like the base data, is immutable. As a result, frequent updates and deletes leave many DeltaFiles on disk; borrowing from HBase, Kudu periodically compacts these files.
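The relationship between immutable base data, deltas, and compaction can be sketched as follows. This is a simplified model, not Kudu's implementation: `DeltaTablet` is a hypothetical name, a plain list stands in for the B-Tree DeltaMemStore, tuples stand in for DeltaFiles, and a logical counter stands in for MVCC timestamps.

```python
DELETED = object()  # marker value for a deleted row


class DeltaTablet:
    """Hypothetical model: immutable base rows plus timestamped deltas."""

    def __init__(self, base_rows):
        self.base = dict(base_rows)  # never modified in place; replaced on compaction
        self.delta_mem = []          # in-memory deltas: (timestamp, key, value)
        self.delta_files = []        # flushed, immutable batches of deltas
        self.clock = 0               # logical timestamp

    def _record(self, key, value):
        self.clock += 1
        self.delta_mem.append((self.clock, key, value))
        return self.clock

    def update(self, key, value):
        return self._record(key, value)

    def delete(self, key):
        return self._record(key, DELETED)

    def flush_deltas(self):
        """Freeze the in-memory deltas into an immutable delta file."""
        if self.delta_mem:
            self.delta_files.append(tuple(self.delta_mem))
            self.delta_mem = []

    def _all_deltas(self):
        for delta_file in self.delta_files:   # oldest first
            yield from delta_file
        yield from self.delta_mem

    def read(self, key, as_of=None):
        """Read a row as of a timestamp by applying deltas on top of the base."""
        value = self.base.get(key, DELETED)
        for ts, delta_key, delta_value in self._all_deltas():
            if delta_key == key and (as_of is None or ts <= as_of):
                value = delta_value
        return None if value is DELETED else value

    def compact(self):
        """Fold all deltas into a new base; older versions are no longer readable."""
        merged = dict(self.base)
        for _, key, value in self._all_deltas():
            if value is DELETED:
                merged.pop(key, None)
            else:
                merged[key] = value
        self.base, self.delta_files, self.delta_mem = merged, [], []


tablet = DeltaTablet({"k1": "v1", "k2": "v2"})
ts_update = tablet.update("k1", "v1b")
tablet.flush_deltas()
ts_delete = tablet.delete("k2")
print(tablet.read("k1"), tablet.read("k1", as_of=0), tablet.read("k2"))
```

Reads get slower as deltas accumulate, because every read has to replay them, which is why periodic compaction matters.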
3. External interfaces
Kudu provides C++ and Java APIs for reading and writing data one row at a time or in batches, and for creating and altering schemas. Kudu is also being integrated with other tools in the Hadoop ecosystem. At present, the Kudu beta has fairly complete Impala support: most operations, such as creating tables and updating or deleting data, can be done through Impala. Kudu also implements KuduTableInputFormat and KuduTableOutputFormat, which enable MapReduce reads and writes with data locality. Spark support is not yet complete: Spark can only read data.
Use case: Xiaomi
Xiaomi is a heavy HBase user, with about 5 billion user records per day. Xiaomi currently also uses an HDFS + HBase hybrid architecture. As can be seen, the pipeline is fairly complex, with data stored in SequenceFile, HBase, and Parquet.

After adopting Kudu, Kudu serves as a unified data warehouse that supports both offline analysis and real-time interactive analysis.

ÐÔÄܲâÊÔ
1. ºÍ parquet µÄ±È½Ï

The figure shows the official results of running TPC-H on Impala, comparing the query speed of Parquet and Kudu. Kudu is close to Parquet in speed, and on some queries it is even faster. However, because all of the data had been cached in memory, these results are not a useful reference.
2. Comparison with HBase

The figure shows another set of official results. For scans and range queries, Kudu and Parquet are much faster than HBase, while for random access Kudu is slightly slower than HBase. However, the dataset has only 6 billion rows, so it may well fit entirely in the memory cache too. For in-memory queries, then, Kudu is generally faster than HBase except on random access.
3. Query performance on very large datasets
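Kudu is not positioned as an in-memory database. It aims to take the place of storage such as HDFS/Parquet, so most of its data lives on disk. If we want it to replace HDFS/Parquet + HBase, query performance on very large datasets is critical, and this was also Kudu's original purpose. However, no official figures are available for this. Due to resource constraints, NetEase (网易) has not yet been able to run this test. As a next step, we plan to set up 10 Kudu + Impala servers and generate a very large dataset with TPC-DS to complete the comparison.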