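Editor's note:
How can a single system combine OLTP-style random read/write capability with OLAP-style analytical capability? Kudu offers an excellent design approach.
This article comes from CSDN and was edited and recommended by Anna of Huolongguo Software (火龙果软件).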
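Starting from Kudu's design paper and drawing on a comparison with HBase, this article gives a first look at Kudu's design principles. Some of the design may already be outdated in the latest Kudu releases, but the original design ideas are still worth learning from.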
1 Kudu's Design Motivation
Before introducing what Kudu is, it helps to go over some pain points that existing systems have with storing and querying structured data. Structured data is usually stored in one of two ways:
Static data is usually stored directly in HDFS in Parquet/Carbon/Avro format. This kind of storage is generally well suited to analytical workloads. Whatever the format, however, data in HDFS can hardly support record-level updates, and random reads are not efficient either.
Mutable data is usually stored in HBase or Cassandra, because they support efficient record-level random reads and writes. This kind of storage is not well suited to offline analysis, however, because it performs poorly when fetching data in bulk. For HBase there are two main reasons. First, the HFile structure organizes data by row, which causes heavy I/O in most analytical scenarios because a lot of unneeded data may be read; Parquet, by comparison, is heavily optimized for analytics. Second, HBase's LSM-Tree architecture means the read path must consider not only the data in memory but also one or more HFiles in HDFS, which makes it much longer than reading files directly from HDFS.
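As can be seen, both storage approaches have clear strengths and weaknesses:
Storing data directly in HDFS suits offline analysis but not record-level random reads and writes.
Storing data in HBase/Cassandra suits record-level random reads and writes but is unfriendly to offline analysis.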
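In many real business scenarios, however, both kinds of workload coexist. The usual practices are as follows:
Store the data in HBase and run analytical jobs with Spark/Hive on HBase, which performs poorly.
Where analytical performance matters more, keep a redundant copy of the data in HDFS/Hive, or periodically export the data in HBase to Parquet/Carbon format.
Clearly, this kind of solution places heavy demands on the business application, and it easily leads to consistency problems between online and offline data.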
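Kudu's design tries to find the best meeting point between OLAP and OLTP, so that a single copy of data in a single system supports both OLTP-style real-time reads and writes and OLAP-style analysis. A second motivation is mentioned in the Cloudera article "Kudu: New Apache Hadoop Storage for Fast Analytics on Fast Data": as a new distributed storage system, Kudu aims to make effective use of the CPU, whereas low CPU utilization is precisely the biggest problem of systems such as HBase and Cassandra. The following sections interpret Kudu's design principles mainly on the basis of what the paper reveals.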
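2 Introduction to Kudu's Principles
Kudu's architecture borrows in part from the design ideas of Bigtable, HBase, and Spanner. Several of the paper's authors are Committers/PMC members of the HBase community, so HBase's influence on Kudu's design is clearly felt in the paper. For that reason, this article discusses the similarities and differences between Kudu and HBase in many places.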
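2.1 Tables and Schema
Kudu is designed for structured storage, so the user must define a table's Schema when creating it. The Schema includes the column definitions (with types) and the Primary Key definition (an ordered combination of several user-specified columns). Row uniqueness depends on the uniqueness of the values of the Column combination in the user-supplied Primary Key.
Kudu provides an Alter command for adding and dropping columns, but columns that belong to the Primary Key cannot be dropped.
Kudu does not currently support secondary indexes.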
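The schema rules above can be pictured with a small toy model. The sketch below is plain Python and does not use the Kudu client API; `ToyTable` and its methods are hypothetical names chosen for illustration. It only shows a typed schema, uniqueness enforced on an ordered primary key, and the ban on dropping key columns.

```python
class ToyTable:
    """Toy in-memory table (hypothetical); illustrates schema rules only."""

    def __init__(self, columns, primary_key):
        # columns: column name -> Python type; primary_key: ordered list of names
        missing = [c for c in primary_key if c not in columns]
        if missing:
            raise ValueError(f"primary key columns not in schema: {missing}")
        self.columns = dict(columns)
        self.primary_key = list(primary_key)
        self.rows = {}  # primary-key tuple -> row dict

    def _key(self, row):
        # The key is the ordered combination of the primary-key column values.
        return tuple(row[c] for c in self.primary_key)

    def insert(self, row):
        for name, typ in self.columns.items():
            if not isinstance(row.get(name), typ):
                raise TypeError(f"column {name!r} expects {typ.__name__}")
        key = self._key(row)
        if key in self.rows:
            raise KeyError(f"duplicate primary key {key}")
        self.rows[key] = dict(row)

    def add_column(self, name, typ, default):
        self.columns[name] = typ
        for row in self.rows.values():
            row[name] = default

    def drop_column(self, name):
        if name in self.primary_key:
            raise ValueError("primary key columns cannot be dropped")
        del self.columns[name]
        for row in self.rows.values():
            row.pop(name, None)
```

With a table keyed on `("host", "ts")`, a second insert carrying the same host and ts is rejected, and `drop_column("ts")` raises, while dropping a non-key column succeeds.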
2.2 API
Kudu provides APIs in two languages, Java and C++ (a Python API is also provided, but it is still at the Experimental stage). These APIs support the following operations:
Insert/Update/Delete
Bulk data import/update operations
Scan (with support for simple Filters)
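2.3 Transactions and Consistency Model
Kudu provides only single-row transactions; multi-row transactions are not supported. In this respect it is similar to HBase. Its data consistency model, however, differs considerably from HBase's.
Kudu provides the following two consistency models:
Snapshot Consistency
This is the default consistency model in Kudu. It guarantees only that a client can see the writes it has committed itself; it does not guarantee global transaction visibility across multiple clients.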
External Consistency
The External Consistency mechanism was probably first proposed in Google's Spanner paper. The two-phase commit mechanism of traditional relational databases needs two rounds of communication, which is costly, and its complex locking may also cause availability problems. A better approach to distributed transactions and consistency is a globally issued Timestamp. Spanner proposed the Commit-wait mechanism to guarantee a global ordering of transactions: if transaction T1 commits before another transaction T2 starts, then T1's Timestamp must be smaller than T2's Timestamp. Such a promise is known to be very hard to keep in a distributed system. In HBase, one can imagine that if the SequenceIDs of all RegionServers were issued by a single source, many of HBase's transactional problems would be readily solved, but that global SequenceID source would then become the performance bottleneck of the whole system. Returning to External Consistency, Spanner implements it with a local clock of high precision and predictable error (the TrueTime API), which requires a highly reliable, high-precision time source whose error is predictable. Interested readers can consult the Spanner paper; the details are not repeated here. Kudu offers another approach to External Consistency, based on Timestamp propagation: multiple clients can communicate with one another to share the Timestamps of the writes they have committed, which guarantees a global order. This mechanism is also fairly complex.
Like Spanner, Kudu does not allow users to set the Timestamp of user data themselves. HBase differs on this point; in HBase a user can issue a query based on a specific Timestamp.
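The Timestamp propagation idea can be sketched with logical clocks. The code below is a simplified illustration, not Kudu's actual implementation: `ToyServer`, `ToyClient`, and `share_with` are hypothetical names, and the clocks are plain integers. Each client remembers the latest Timestamp it has seen and hands it to the next server it writes to, so that server never issues a smaller Timestamp.

```python
class ToyServer:
    """Hypothetical server with a local logical clock that may lag others."""

    def __init__(self, clock):
        self.clock = clock

    def write(self, propagated_ts=0):
        # Never assign a timestamp at or below one the caller has already seen.
        self.clock = max(self.clock, propagated_ts) + 1
        return self.clock


class ToyClient:
    """Hypothetical client that carries the latest timestamp it has observed."""

    def __init__(self):
        self.last_seen = 0

    def write(self, server):
        self.last_seen = server.write(self.last_seen)
        return self.last_seen

    def share_with(self, other):
        # Out-of-band message telling another client what we have committed.
        other.last_seen = max(other.last_seen, self.last_seen)
```

If one client writes to a server whose clock is far ahead and then calls `share_with` on a second client, the second client's next write receives a larger Timestamp even on a server whose clock is far behind. Without the propagated value, the lagging server would have issued a smaller Timestamp for the later write.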
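2.4 Kudu's Architecture
Kudu also adopts a Master-Slave architecture with central nodes. The management node is called the Kudu Master, and the data node is called the Tablet Server (comparable to the RegionServer role in HBase). A table's data is split into one or more Tablets, and Tablets are deployed on Tablet Servers, which serve data reads and writes.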
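The Kudu Master plays the following roles in a Kudu cluster:
1. It stores table Schema information and handles requests such as table creation.
2. It tracks and manages all Tablet Servers in the cluster and coordinates the redeployment of data after a Tablet Server fails.
3. It stores the mapping of Tablets to Tablet Servers.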
A Tablet is broadly similar to a Region in HBase, but there are some clear differences:
Tablets support two partitioning strategies. One is Hash Partition, under which user data is distributed fairly evenly across Tablets but the original sort order of the data is lost. The other is Range Partition, under which data is partitioned according to the order of the String formed by combining the user-specified, ordered Primary Key Columns. HBase provides only one scheme, Range Partition by the RowKey of the user data.
A Tablet can be deployed on multiple Tablet Servers. In HBase's original architecture, a Region could be deployed on only one RegionServer, and replication of its data was left to HDFS. Since version 1.0, HBase has had the Region Replica feature (HBASE-10070), which allows a Region to be deployed on multiple RegionServers to improve read availability, but data is not synchronized in real time among the Region replicas.

Figure 1: Kudu's data replication mechanism

Figure 2: HBase's data replication mechanism
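The two partitioning strategies described in this section can be contrasted with a short sketch. It is illustrative only: the function names, the use of CRC32 as the hash, and the string split points are assumptions made for the example, not Kudu's actual partitioning code.

```python
import bisect
import zlib


def hash_partition(key, num_tablets):
    """Pick a tablet from a stable hash of the key (CRC32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_tablets


def range_partition(key, split_points):
    """Pick a tablet from sorted split points.

    Tablet 0 holds keys below split_points[0]; tablet i holds keys in
    [split_points[i-1], split_points[i]); the last tablet holds the rest.
    """
    return bisect.bisect_right(split_points, key)
```

Under `range_partition`, keys visited in sorted order map to non-decreasing tablet numbers, so a scan over a key range touches only a few neighbouring tablets. Under `hash_partition`, neighbouring keys can land on any tablet, which spreads write load but gives up that ordering.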
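2.5 Kudu's Underlying Data Model
For its underlying data files, Kudu does not use a distributed file system with a relatively high level of abstraction such as HDFS. Instead, it implements its own underlying storage system organized around Table/Tablet/Replica views. This implementation has the following design goals:
Provide fast columnar queries.
Support fast random updates.
Provide more stable guarantees on query performance.
To achieve these goals, Kudu drew on a hybrid columnar storage architecture similar to Fractured Mirrors. At the lower level, a Tablet is further subdivided into units called RowSets: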

Figure 3: RowSets
MemRowSets are comparable to the MemStore in HBase, and DiskRowSets are comparable to HFiles in HBase. Data in MemRowSets is stored in a row-oriented layout, and the data structure is a B-Tree. After the data in MemRowSets is flushed to disk, it forms DiskRowSets.
The data in DiskRowSets is divided, in key order, into individual DiskRowSets of 32 MB each.
Data in a DiskRowSet is organized by Column, similar to Parquet. This is the foundation that lets Kudu support analytical queries. The data of each Column is stored in a contiguous region, which is further subdivided into small Page units, similar to the Blocks in an HBase HFile. Each Column Page can use an Encoding algorithm as well as a general-purpose Compression algorithm.
Since Column Pages can be encoded and compressed, changing a single record in place becomes difficult. Kudu was said earlier to support record-level updates and deletes, so how is that done? As in HBase, an update or delete is described by adding a new record. A DiskRowSet contains two parts: the base data (Base Data) and the change data (Delta Stores). The records produced by update and delete operations are kept in the change-data part.

Figure 4: Delta Store Design
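Judging from the figure above (taken from the Kudu source project files), the Delta data should contain two parts, REDO and UNDO. These are similar to the REDO and UNDO logs of a relational database (where the REDO log records the data after an update and can be used to recover the changes of committed transactions that have not yet been written to the Data File, while the UNDO log records the data before an update and can be used to roll back when a transaction fails), but there are some differences in detail:
REDO Delta Files contain the changes made to the Base Data after it was last flushed or compacted. REDO Delta Files are ordered by ascending Timestamp.
UNDO Delta Files contain the changes made to the Base Data before the last Flush/Compaction. They make it possible for a query at an older Timestamp to see a consistent view. UNDO Delta Files are ordered by descending Timestamp.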
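How Base Data, REDO deltas, and UNDO deltas combine at read time can be sketched as follows. This is a minimal model under stated assumptions: `ToyDiskRowSet` is a hypothetical class, deltas are simple `(timestamp, row offset, changed columns)` tuples, and deletes are left out. It is not Kudu's on-disk format.

```python
class ToyDiskRowSet:
    """Hypothetical model: base data as of base_ts plus UNDO and REDO deltas."""

    def __init__(self, base, base_ts, undo, redo):
        self.base = base        # row offset -> column values as of base_ts
        self.base_ts = base_ts  # timestamp of the last flush/compaction
        # UNDO delta: (ts, offset, values the row had BEFORE the change at ts),
        # kept in descending timestamp order.
        self.undo = sorted(undo, key=lambda d: d[0], reverse=True)
        # REDO delta: (ts, offset, values the row has AFTER the change at ts),
        # kept in ascending timestamp order.
        self.redo = sorted(redo, key=lambda d: d[0])

    def read(self, offset, ts):
        row = dict(self.base[offset])
        if ts >= self.base_ts:
            # Roll forward: apply changes made after the flush, up to ts.
            for d_ts, d_off, change in self.redo:
                if d_off == offset and d_ts <= ts:
                    row.update(change)
        else:
            # Roll back: undo every change that happened after ts.
            for d_ts, d_off, change in self.undo:
                if d_off == offset and d_ts > ts:
                    row.update(change)
        return row
```

A reader at the current time only walks the REDO deltas, while a reader at an older snapshot walks the UNDO deltas from newest to oldest until it reaches its own Timestamp.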
2.6 Read and Write Paths
The write path is shown in the figure below:

Figure 5: Write Path
Kudu does not allow duplicate Primary Keys in user data. Before writing a row inside a Tablet, it must therefore first check the existing data to see whether the Primary Key of the new row already exists. Although DiskRowSets carry a BloomFilter to make this check more efficient, this design can be expected to add noticeably to write latency.
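The uniqueness check on the write path can be sketched with a toy Bloom filter. The classes below are hypothetical and heavily simplified (a Python integer as the bit field, SHA-256 as the hash, dictionaries standing in for rowsets); they only show why the filter lets most inserts skip the expensive lookup in on-disk data.

```python
import hashlib


class ToyBloom:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes, self.field = bits, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.field |= 1 << p

    def might_contain(self, key):
        return all((self.field >> p) & 1 for p in self._positions(key))


class ToyTablet:
    """Hypothetical tablet: one in-memory rowset plus flushed rowsets."""

    def __init__(self):
        self.mem_rowset = {}
        self.disk_rowsets = []  # list of (bloom filter, rows) pairs
        self.disk_lookups = 0   # counts the expensive key lookups

    def flush(self):
        bloom = ToyBloom()
        for key in self.mem_rowset:
            bloom.add(key)
        self.disk_rowsets.append((bloom, self.mem_rowset))
        self.mem_rowset = {}

    def insert(self, key, row):
        if key in self.mem_rowset:
            raise KeyError(key)
        for bloom, rows in self.disk_rowsets:
            if bloom.might_contain(key):  # only then pay for a real lookup
                self.disk_lookups += 1
                if key in rows:
                    raise KeyError(key)
        self.mem_rowset[key] = row
```

Every insert still has to consult each flushed rowset's filter, which is the extra write-path work the paragraph above refers to; the filter only reduces how often the full lookup is needed.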
Data is first stored in MemRowSets and, once their size exceeds a certain threshold, flushed into DiskRowSets. This part is described in detail in Figure 4. As flushes accumulate, the number of DiskRowSets keeps growing, so Kudu also has an internal Compaction process that re-sorts multiple existing DiskRowSets whose Primary Key ranges overlap and produces new DiskRowSets. This is shown in the figure below:

Figure 6: RowSet Compaction
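The effect of a RowSet Compaction can be illustrated with a merge of sorted runs. This is a sketch under assumptions: rowsets are modelled as sorted lists of `(key, value)` tuples, and a row-count limit (`max_rows`, a hypothetical parameter) stands in for the size limit of a real DiskRowSet.

```python
import heapq


def compact(rowsets, max_rows):
    """Merge sorted, key-overlapping rowsets into sorted, non-overlapping ones."""
    merged = heapq.merge(*rowsets, key=lambda row: row[0])
    out, current = [], []
    for row in merged:
        current.append(row)
        if len(current) == max_rows:
            out.append(current)
            current = []
    if current:
        out.append(current)
    return out
```

Before compaction, a lookup for one key may have to probe every input rowset because their key ranges overlap. After compaction, each key falls into the range of exactly one output rowset.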
The read path must consider both the MemRowSets in memory and one or more DiskRowSets on disk. At the high level of the Scanner abstraction it is presumably similar to HBase. A few detailed optimizations are worth highlighting:
By comparing the range of a Scan with the Primary Key Range of each DiskRowSet, the DiskRowSets that need not take part in the Scan can be filtered out first.
For record-level changes, the Delta Store records the Offset of the corresponding original data in the Base Data. This makes it much faster to determine whether a record has changes.
Because the underlying files of DiskRowSets are organized by column, a query that filters on certain columns can eliminate unneeded Primary Keys early. Kudu does not read all columns of a row at the start. It first reads the columns involved in the filter conditions, matches them against the query predicates, and only then decides whether to read the other columns of the matching rows. This saves disk I/O. This is the Lazy Materialization feature that Kudu provides.
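Two of the read-path optimizations above, pruning rowsets by key range and Lazy Materialization, can be sketched together. The rowset layout here is an assumption made for illustration (a dict holding `min_key`, `max_key`, a sorted `keys` list, and one Python list per column); the `cells_read` counter is a stand-in for disk I/O.

```python
def scan(rowsets, key_lo, key_hi, filter_col, predicate, project):
    """Scan keys in [key_lo, key_hi], filtering on one column first."""
    results = []
    stats = {"rowsets_skipped": 0, "cells_read": 0}
    for rs in rowsets:
        # Optimization 1: skip rowsets whose key range cannot overlap the scan.
        if rs["max_key"] < key_lo or rs["min_key"] > key_hi:
            stats["rowsets_skipped"] += 1
            continue
        # Optimization 2, step 1: read only the column used by the filter.
        matches = []
        for i, key in enumerate(rs["keys"]):
            if key_lo <= key <= key_hi:
                stats["cells_read"] += 1
                if predicate(rs["columns"][filter_col][i]):
                    matches.append(i)
        # Step 2: materialize the projected columns for matching rows only.
        for i in matches:
            row = {"key": rs["keys"][i]}
            for col in project:
                row[col] = rs["columns"][col][i]
                stats["cells_read"] += 1
            results.append(row)
    return results, stats
```

When the predicate is selective, the projected columns are read for only a small fraction of rows, which is where the I/O saving comes from. Reading every column of every row would show up as a much larger `cells_read`.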
2.7 The Raft Model
Kudu uses the Raft protocol as the consensus protocol among its replicas. Raft is a consensus protocol that is simpler and easier to understand than Paxos.
For more information about Raft, see: https://raft.github.io/
3 Differences Between Kudu and HBase
Here is a summary of the major differences between Kudu and HBase:
Kudu's data partitioning options are relatively diverse, while HBase offers only one.
Kudu's Tablets have their own replication mechanism, while HBase's Regions rely on the replication of the underlying HDFS.
Kudu uses the local file system directly, while HBase depends on HDFS.
Kudu's underlying file format is columnar, similar to Parquet, while HBase's underlying HFiles are organized by row.
Kudu can automatically adjust its Flush and Compaction tasks according to busy and idle periods. HBase does not yet have this scheduling ability.
Kudu's Compaction makes no Minor/Major distinction and limits the total I/O of each Compaction to 128 MB, so there are no long-running Compaction tasks.
Compaction happens on demand. For example, if all writes are sequential, no Compaction is triggered.
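Kudu's design accommodates both analytical queries and random reads and writes, and this inevitably comes at a cost. For example, the uniqueness constraint on the Primary Key requires a check before each write to see whether the key already exists, which inevitably increases write latency. And although the underlying files use a Parquet-like columnar design, the long read path, similar to HBase's, also affects analytical queries. In addition, this design pays a relatively high price when reading whole rows.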
4 Integration of Kudu with Existing Systems
Kudu provides integration with the following systems:
MapReduce: Input and Output integration for Kudu user tables.
Spark: integration with Spark SQL and DataFrames.
Impala: Kudu itself provides no Shell or SQL Parser, so its SQL capability comes from its integration with Impala. The integrations are aware of the data locality of Kudu tables and can take full advantage of the filters Kudu provides to optimize queries. Impala's own DDL/DML syntax has also been extended for Kudu. Cloudera can be expected to invest further in the integration of Impala and Kudu.
5 Use Cases for Kudu
In his talk "Kudu: Resolving transactional and analytic trade-offs in Hadoop" at the Strata+Hadoop World 2015 conference, Todd Lipcon described Kudu's use cases as follows:

6 Analysis of Kudu Benchmark Data
The following is a brief analysis of some of the Benchmark performance data provided in the Kudu white paper (for detailed results, see Section 6 of the paper):
1. TPC-H: Impala on Parquet was compared with Impala on Kudu. On average, Impala on Kudu was 31% faster than Impala on Parquet. This results from Kudu's Lazy Materialization feature and its improved CPU efficiency.
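2. Impala-Kudu versus Phoenix-HBase: the test used the lineitem table from TPC-H, loading 62 GB of data in CSV format. The CsvBulkLoadTool provided by Phoenix was used for the import into Phoenix. Some of the test configuration was as follows:
The Phoenix table was split into 100 Hash Partitions, and 100 Tablets were created for Kudu.
HBase used the default Block Cache policy, with 9.6 GB of cache memory configured for each RegionServer. Kudu was configured with 1 GB of in-process Block Cache memory, but it also relies on the operating system's buffer cache.
The HBase table used the FAST_DIFF Block Encoding algorithm, with no compression enabled.
After the data was imported into HBase, a Major Compaction was triggered manually to ensure data locality. The 62 GB of raw data occupied about 570 GB in total after being imported into HBase (because Compression was not enabled and because each column is stored independently, which inflates the data), while it occupied about 227 GB after being imported into Kudu. The corresponding test scenarios and results are as follows: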

Phoenix has a clear advantage in key-based whole-row lookups. For everything else, whether full table scans or queries on a subset of columns, Impala-Kudu has a clear advantage.
Queries based on Scan + Filter are something HBase itself is not good at.
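The storage sizes quoted in the test configuration (62 GB of raw CSV, about 570 GB in HBase, about 227 GB in Kudu) can be turned into expansion factors with a few lines of arithmetic. The variable names are chosen here for illustration; the three input numbers are the only figures taken from the text.

```python
raw_gb = 62     # raw CSV size
hbase_gb = 570  # approximate size after import into HBase
kudu_gb = 227   # approximate size after import into Kudu

hbase_expansion = hbase_gb / raw_gb  # on-disk size relative to the raw data
kudu_expansion = kudu_gb / raw_gb
hbase_vs_kudu = hbase_gb / kudu_gb   # how much more space HBase used than Kudu

print(f"HBase: {hbase_expansion:.2f}x the raw size")
print(f"Kudu:  {kudu_expansion:.2f}x the raw size")
print(f"HBase used {hbase_vs_kudu:.2f}x the space Kudu did")
```

In this setup HBase therefore had roughly two and a half times as much data to scan as Kudu, which is one factor to keep in mind when reading the scan results.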
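3. Comparison of random read/write capability
The following are some of the scenarios in the comparison test:

The following are the results of the comparison test: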

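In the load phase and under the Zipfian distribution, HBase's advantage is more pronounced; Kudu is currently working on optimizations for the Zipfian distribution (KUDU-749). Under the Uniform distribution, HBase's advantage is smaller. Overall, Kudu's design is at some disadvantage relative to HBase for random reads and writes, which is part of the price paid for accommodating analytical queries.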