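Editor's recommendation:
This article gives an overview of how TokuDB implements transaction isolation, covering TokuDB's transaction representation, the structure of the fractal tree LeafEntry, the MVCC implementation flow, and how multi-version records are reclaimed.
This article comes from the WeChat account 听云 (Tingyun), and was edited and recommended by Anna of 火龙果软件.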
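While installing MariaDB I came across TokuDB, an alternative to InnoDB, and its introduction looked impressive. Here is a first round of notes on TokuDB; I will share more after using it.
What is TokuDB?
The most popular fully transactional engine in MySQL is InnoDB. Its defining trait is that the data itself is organized as a B-Tree: the table data is one large B-Tree index clustered on the primary key.
Because of this, write speed suffers somewhat, since each write costs an IO to reorganize the index tree. Especially when the data set is much larger than memory, the CPU ends up tied up waiting on disk IO and can do little else. At that point we need to think about how to reduce disk IO to relieve the CPU. Common approaches are:
Increase the InnoDB page size (16KB by default). A larger page brings its own drawbacks, though; for example, checkpoints to disk will be delayed.
Put the log files on faster disks, such as SSDs.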
TokuDB is a "new" storage engine with transaction support and excellent data compression, developed by the US company Tokutek (since acquired by Percona). If your workload is write-heavy and read-light and the data volume is fairly large, TokuDB is strongly recommended: it saves space cost and greatly reduces storage usage and IOPS overhead, at the price of more CPU pressure.
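TokuDB features
1. Rich index types and fast index creation
Besides the existing index types, TokuDB adds (secondary) clustering indexes to serve a wider variety of covering-index queries, and its fast index creation helps query efficiency.
2. (Secondary) clustering indexes
These can also be called non-primary-key clustering indexes. Such an index also contains all the columns of the table, so it can serve covering-index queries: for example, when a WHERE condition hits the index_b index directly, no second lookup through the primary key is needed.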
3. Hot Index Creation
TokuDB lets you add an index to a table without affecting the execution of update statements (insert, update, and so on). The variable tokudb_create_index_online controls whether the feature is enabled. Unfortunately, for now online creation only works through the CREATE INDEX syntax, not through ALTER TABLE. This method is much slower than regular index creation, and its progress can be watched with show processlist. TokuDB does not support dropping an index online, however; dropping an index takes a global lock on the table.

4. Hot column changes (Add, Delete, Expand, Rename)
TokuDB allows the following operations while only briefly blocking update or query statements:
Adding or dropping columns in a table
Expanding columns: columns of type char, varchar, varbinary and int
Renaming columns; unsupported column types: TIME, ENUM, BLOB, TINYBLOB, MEDIUMBLOB, LONGBLOB
These operations usually block other queries at table-lock level for a few seconds. The next time the table's rows are loaded from disk into memory, the system applies the change (add, delete or expand) to them; a rename completes entirely within a few seconds of downtime.
These operations differ from InnoDB: after altering the table you see rows affected reported as 0, meaning the change is carried out in the background. The speed probably comes from the Fractal Tree index, which replaces random IO with sequential IO and broadcasts these operations to all rows, whereas InnoDB has to open the table and build a temporary table to finish.
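Here are some of the official guidelines for this feature:
None of these operations is executed immediately; the Fractal Tree completes them in the background, for both primary-key and non-primary-key indexes. They can also be forced manually with the OPTIMIZE TABLE X command. Since TokuDB 1.0 the OPTIMIZE TABLE command also runs online, but it does not rebuild indexes
Do not change several columns at once; operate on each column separately
Avoid running add, delete, expand or drop on the same column at the same time
Table-lock time is determined mainly by the dirty pages in the cache: the more dirty pages, the longer the flush. On every change MySQL closes the table's connections to release earlier resources
Avoid dropping a column that is part of an index, which is especially slow; if you must, first detach the column from the index and then drop it
Expand operations support only columns of type char, varchar, varbinary and int
Rename only one column at a time; renaming several falls back to standard MySQL behavior. The column's attributes must be spelled out in the statement
Rename does not yet support columns of type TIME, ENUM, BLOB, TINYBLOB, MEDIUMBLOB, LONGBLOB
Temporary tables cannot be altered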
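5. Data compression
All compression in TokuDB runs in the background. Higher compression levels lower system performance, though some scenarios call for them. The official advice: machines with fewer than 6 cores should use standard compression, and otherwise a higher level can be used. The compression algorithm for each table is specified through ROW_FORMAT in create table or alter table.
The default ROW_FORMAT is controlled by the variable tokudb_row_format, which defaults to tokudb_zlib. Possible values include:
tokudb_zlib: compression using the zlib library, with a medium compression ratio and medium CPU usage.
tokudb_quicklz: compression using the quicklz library, with light compression and low CPU usage.
tokudb_lzma: compression using the lzma library, with a high compression ratio and high CPU usage.
tokudb_uncompressed: no compression.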
6. Read Free Replication
Thanks to the Fractal Tree index, a TokuDB slave can apply changes from the master with less read IO than usual. The feature relies mainly on the properties of Fractal Tree indexes and can be enabled in the configuration:
Parts of insert/delete/update operations can be inserted directly into the appropriate Fractal Tree index, avoiding the overhead of read-modify-write behavior;
delete/update operations can skip the IO overhead of uniqueness checks.
The downside is that, with Read Free Replication enabled, the servers must be set up as follows:
master: the replication format must be ROW. TokuDB does not yet implement locking for the auto-increment function, so multiple concurrent insert statements may produce nondeterministic auto-increment values, leaving master and slave data inconsistent.
slave: enable read-only; turn off uniqueness checks (set tokudb_rpl_unique_checks=0); turn off lookups (read-modify-write) (set tokudb_rpl_lookup_rows=0);
The slave settings can be applied on one or several slaves. In MySQL 5.5 and MariaDB 5.5 only tables with a defined primary key can use this feature; MySQL 5.6, Percona 5.6 and MariaDB 10.X have no such restriction.
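7. Transactions, ACID and recovery
By default TokuDB checkpoints all open tables periodically and logs every change made between checkpoints, so after a system crash the tables can be restored to their prior state (ACID-compliant): all committed transactions are reflected in the tables and uncommitted transactions are rolled back.
The default checkpoint period is 60s, measured from the start of one checkpoint to the start of the next. If a checkpoint needs longer than that to complete, the next one begins immediately. This also relates to how often the log files are flushed. Users can run the flush logs command at any time to trigger a checkpoint. On a normal database shutdown, all open transactions are discarded.
Managing log size: TokuDB keeps the log back to the most recent checkpoint. When a log file reaches 100M, a new one is started. At each checkpoint, log entries older than the current checkpoint are discarded; if the checkpoint period is set very large, the log is cleaned up less often. TokuDB also maintains a rollback log for each open transaction. Its size depends on the amount of work in the transaction, it is compressed when written to disk, and it is cleaned up once the transaction ends.
Recovery: TokuDB recovers automatically, using the log and the rollback logs after a crash. Recovery time is determined by log size (including the uncompressed rollback logs).
Disabling the write cache: to guarantee transaction safety, hardware write caches have to be considered. TokuDB provides transaction safety in MySQL, but from the system's point of view data the database has written is not necessarily on disk; it may be cached, and a crash can still lose data. For example, TokuDB cannot guarantee that a mounted NFS volume recovers properly. So for safety it is best to turn the write cache off, though this may cost performance. Normally the disk write cache should be disabled. For performance reasons the cache may stay on with an XFS file system, but if the "Disabling barriers" error appears, it must be turned off. In some setups the cache also has to be disabled for the file system (ext3), LVM, software RAID, and RAID cards with a BBU (battery-backed-up) feature.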
8. Progress tracking
TokuDB provides a way to track long-running statements. For the LOAD DATA command, SHOW PROCESSLIST shows progress information: first a status like "Inserted about 1000000 rows", then a completion percentage such as "Loading of data about 45% done". When adding an index, SHOW PROCESSLIST shows the progress of CREATE INDEX and ALTER TABLE, with an estimated row count and a completion percentage. SHOW PROCESSLIST also shows transaction state, such as committing or aborting.
9. Migrating to TokuDB
A table's storage engine can be changed the usual way, for example "ALTER TABLE … ENGINE = TokuDB", or by dumping with mysqldump and re-importing; INTO OUTFILE with LOAD DATA INFILE also works.
10. Hot backup
Percona Xtrabackup does not yet support hot backup of TokuDB, and Percona has not indicated plans to support it: http://www.percona.com/blog/2014/07/15/tokudb-tips-mysql-backups/ . For large tables you can back up with LVM, see https://launchpad.net/mylvmbackup , or with mydumper. TokuDB officially provides a hot backup plugin, tokudb_backup.so, for online backups; see https://github.com/Tokutek/tokudb-backup-plugin . However, it depends on backup-enterprise, so the .so library cannot be built from source; it is a paid commercial edition, see https://www.percona.com/doc/percona-server/5.6/tokudb/tokudb_installation.html
×ܽá
TokuDBµÄÓŵã:
¸ßѹËõ±È£¬Ä¬ÈÏʹÓÃzlib½øÐÐѹËõ£¬ÓÈÆäÊǶÔ×Ö·û´®(varchar,textµÈ)ÀàÐÍÓзdz£¸ßµÄѹËõ±È£¬±È½ÏÊʺϴ洢ÈÕÖ¾¡¢ÔʼÊý¾ÝµÈ¡£¹Ù·½Ðû³Æ¿ÉÒÔ´ïµ½1£º12¡£
ÔÚÏßÌí¼ÓË÷Òý£¬²»Ó°Ïì¶Áд²Ù×÷
HCADER ÌØÐÔ£¬Ö§³ÖÔÚÏß×Ö¶ÎÔö¼Ó¡¢É¾³ý¡¢À©Õ¹¡¢ÖØÃüÃû²Ù×÷£¬£¨Ë²¼ä»òÃë¼¶Íê³É£©
Ö§³ÖÍêÕûµÄACIDÌØÐÔºÍÊÂÎñ»úÖÆ
·Ç³£¿ìµÄдÈëÐÔÄÜ£¬ Fractal-treeÔÚÊÂÎñʵÏÖÉÏÓÐÓÅÊÆ,ÎÞundo log£¬¹Ù·½³ÆÖÁÉÙ±Èinnodb¸ß9±¶¡£
Ö§³Öshow processlist ½ø¶È²é¿´
Êý¾ÝÁ¿¿ÉÒÔÀ©Õ¹µ½¼¸¸öTB£»
²»»á²úÉúË÷ÒýË鯬£»
Ö§³Öhot column addition,hot indexing,mvcc
TokuDBȱµã£º
²»Ö§³ÖÍâ¼ü(foreign key)¹¦ÄÜ£¬Èç¹ûÄúµÄ±íÓÐÍâ¼ü£¬Çл»µ½ TokuDBÒýÇæºó£¬´ËÔ¼Êø½«±»ºöÂÔ¡£
TokuDB ²»ÊÊ´óÁ¿¶ÁÈ¡µÄ³¡¾°£¬ÒòΪѹËõ½âѹËõµÄÔÒò¡£CPUÕ¼Óûá¸ß2-3±¶£¬µ«ÓÉÓÚѹËõºó¿Õ¼äС£¬IO¿ªÏúµÍ£¬Æ½¾ùÏìӦʱ¼ä´ó¸ÅÊÇ2±¶×óÓÒ¡£
online ddl ¶Ôtext,blobµÈÀàÐ͵Ä×ֶβ»ÊÊÓÃ
ûÓÐÍêÉÆµÄÈȱ¸¹¤¾ß£¬Ö»ÄÜͨ¹ýmysqldump½øÐÐÂß¼±¸·Ý
ÊÊÓó¡¾°£º
·ÃÎÊÆµÂʲ»¸ßµÄÊý¾Ý»òÀúÊ·Êý¾Ý¹éµµ
Êý¾Ý±í·Ç³£´ó²¢ÇÒʱ²»Ê±»¹ÐèÒª½øÐÐDDL²Ù×÷
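TokuDB's index structure: the fractal tree implementation
The biggest difference between TokuDB and InnoDB is that TokuDB uses an index structure called the Fractal Tree, which greatly improves its handling of random writes. Today both SQL Server and MySQL's InnoDB use B+Tree index structures (SQL Server uses a standard B-Tree). InnoDB is a B+Tree organized by primary key, with data stored in primary-key order. This performs very well for sequential auto-increment keys but is poorly suited to random writes: heavy random I/O splits data pages and causes fragmentation, and index maintenance becomes expensive. TokuDB's answer to random writes lies in its index structure. The Fractal Tree differs from the B-Tree mainly in the internal nodes of the index tree. An internal B-Tree node holds only pointers to its parent and children, whereas an internal Fractal Tree node holds those pointers plus a buffer area. Incoming writes land in this buffer first. The buffer is a FIFO structure, so writing is a sequential process, and like other buffers it is flushed in one go when full. Inserting data into TokuDB therefore becomes essentially a sequential append.
Comparison of B-Tree and Fractal Tree: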

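Introduction to the Fractal tree
The fractal tree is a write-optimized on-disk index data structure. In general its write operations (Insert/Update/Delete) perform well while its reads stay close to B+ tree read performance. According to Percona's tests, the write performance of TokuDB's fractal tree is better than that of InnoDB's B+ tree, and its read performance is slightly lower.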
On-disk storage structure of ft-index
ft-index uses larger index and data pages (4M by default in ft-index, 16K by default in InnoDB), which gives its data and index pages a higher compression ratio. In other words, with page compression enabled, ft-index uses less storage for the same amount of inserted data. ft-index supports online DDL (Hot Schema Change): users can keep writing while a DDL operation (for example adding an index) is in progress, something the ft-index tree structure supports naturally. ft-index also supports transactions (ACID) with MVCC (Multi-Version Concurrency Control) and crash recovery. Because of these properties, Percona claims that TokuDB both brings customers a large performance gain and lowers their storage cost.
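The index structure of ft-index is shown in the figure below:
A gray area is one page of the ft-index fractal tree, a green area is a key, and the gap between two green areas is a child pointer. BlockNum is the offset of the page a child pointer points to. Fanout is the fan-out of the fractal tree, i.e. the number of child pointers. NodeSize is the number of bytes a page occupies. NonLeafNode marks the current page as a non-leaf node and LeafNode marks it as a leaf node. Leaf nodes are the bottom-level nodes that store key-value pairs; non-leaf nodes store no values. Height is the height in the tree: the root has height 3, the nodes one level below the root have height 2, and the bottom-level leaf nodes have height 1. Depth is the depth in the tree: the root has depth 0 and the nodes one level below it have depth 1.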
The tree structure of a fractal tree is very similar to a B+ tree. It is made up of nodes (we call them Node or Block; InnoDB calls them Page). Each node consists of an ordered set of keys. Suppose a node's key sequence is [3, 8]. These keys divide the whole interval (-∞, +∞) into three ranges, (-∞, 3), [3, 8) and [8, +∞), and each range corresponds to one child pointer. In a B+ tree a child pointer simply points to a page. In a fractal tree each child pointer, besides the address of a Node (BlockNum), also carries a Message Buffer (msg_buffer), a first-in-first-out (FIFO) queue that stores update operations such as Insert/Delete/Update/HotSchemaChange.
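The mapping from a key to its child pointer can be sketched in a few lines of Python. This is only an illustration of the range rule above, not ft-index code; the function name child_index is made up for the example.

```python
from bisect import bisect_right

def child_index(pivots, key):
    """Return the index of the child pointer whose key range contains `key`.

    With pivots [3, 8] the ranges (-inf, 3), [3, 8), [8, +inf)
    are numbered 0, 1 and 2.
    """
    return bisect_right(pivots, key)

pivots = [3, 8]
print([child_index(pivots, k) for k in (1, 3, 5, 8, 100)])  # [0, 1, 1, 2, 2]
```

bisect_right is used so that a key equal to a pivot falls into the range that starts at that pivot, matching the half-open ranges [3, 8) and [8, +∞).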
Going by the ft-index source code, a more rigorous description of the fractal tree is:
A node (block or node; Page in InnoDB) consists of an ordered set of keys. The first key is set to a null key, representing negative infinity.
Nodes come in two types, leaf and non-leaf. The child pointers of a leaf node point to BasementNodes; those of a non-leaf node point to regular Nodes. A BasementNode stores multiple key-value pairs, so every lookup must eventually reach a BasementNode to obtain the data (Value). This resembles the LeafPage of a B+ tree: values live in leaf nodes and non-leaf nodes store keys for indexing. After a leaf node is loaded into memory, ft-index converts all the key-value pairs of a BasementNode into a loosely balanced binary tree so values can be found quickly; this tree goes by the rather amusing name of scapegoat tree.
Each key range of a node corresponds to one child pointer. A non-leaf node's child pointer carries a MessageBuffer, a FIFO queue that stores update operations such as Insert/Delete/Update/HotSchemaChange. Child pointers and MessageBuffers are both serialized into the Node's on-disk file.
The number of child pointers of each non-leaf node must lie within [fanout/4, fanout]. Here fanout is a parameter of the fractal tree (B+ trees have the same concept) used mainly to keep the tree's height in check. When a non-leaf node has fewer than fanout/4 child pointers, we consider it too empty and merge it with another node (Node Merge), which can reduce the height of the tree. When it has more than fanout child pointers, we consider it too full and split it in two (Node Split). With these constraints the on-disk data can in theory be kept in a reasonably balanced tree, which bounds the complexity of inserts and queries.
Note: in the ft-index implementation the balance conditions are more complex. Besides fanout, for example, the total size of a node must stay within [NodeSize/4, NodeSize], where NodeSize is typically 4M. When it falls outside this range, the corresponding Merge or Split is performed.
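The two balance conditions can be combined into one small check. The sketch below is a simplification under stated assumptions: the function name rebalance_action and the default fanout of 16 are chosen for the example, and the real ft-index logic has more cases than this.

```python
def rebalance_action(num_children, node_bytes,
                     fanout=16, node_size=4 * 1024 * 1024):
    """Decide what to do with a non-leaf node.

    fanout=16 is an assumed value for illustration; node_size is the 4M
    page size mentioned in the text.
    """
    if num_children > fanout or node_bytes > node_size:
        return "split"   # too full: split the node in two
    if num_children < fanout / 4 or node_bytes < node_size / 4:
        return "merge"   # too empty: merge with another node
    return "ok"

MB = 1024 * 1024
print(rebalance_action(8, 2 * MB))    # ok
print(rebalance_action(17, 2 * MB))   # split
print(rebalance_action(3, 2 * MB))    # merge
```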
Insert/Delete/Update in the fractal tree
We said that the fractal tree is a write-optimized data structure whose writes outperform those of a B+ tree. How does it manage that? First, write performance here means random writes. As a simple example, suppose we keep running this SQL statement against a MySQL InnoDB table: insert into sbtest set x = uuid(), where the sbtest table has a unique index on column x.
Because uuid() is random, the rows inserted into sbtest scatter across many different leaf nodes. In a B+ tree, a large volume of such random writes leaves most of the hot pages in the LRU-Cache in the upper levels of the tree (as shown in the figure below). The bottom-level leaf nodes are then less likely to hit the cache, which causes a lot of disk IO and creates the random-write bottleneck of the B+ tree. Sequential writes to a B+ tree are fast, though, because they make full use of locally hot data and need far fewer disk IOs.

Now for the insert flow of the fractal tree. To simplify the description, assume the following:
Take Insert as the example, with (Key, Value) as the data to insert
Loading a node (Load Page) always checks the LRU-Cache first. Only on a cache miss does ft-index seek to the offset and read the page into memory
Crash logging and transaction handling are left aside for now.
The detailed flow is:
1. Load the Root node.
2. Check whether the Root node needs to split (or merge); if the condition is met, split (or merge) it. How exactly the Root node is split is left for interested readers to work out.
3. If the Root node has height>0, i.e. it is a non-leaf node, binary-search for the key range that Key falls in, wrap (Key, Value) into a message (Insert, Key, Value), and put it into the Message Buffer of the child pointer for that range.
4. If the Root node has height=0, i.e. it is a leaf node, apply the message (Insert, Key, Value) to the BasementNode, that is, insert (Key, Value) into the BasementNode.
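Steps 3 and 4 of the flow above can be sketched roughly as follows. This is a toy in-memory model, not the ft-index implementation: the Node class and its fields are invented for the example, a plain dict stands in for the BasementNode, and the root split/merge of step 2 is left out.

```python
from bisect import bisect_right
from collections import deque

class Node:
    """Toy node: hypothetical structure for illustration only."""
    def __init__(self, height, pivots=(), children=()):
        self.height = height
        self.pivots = list(pivots)        # sorted keys splitting the key space
        self.children = list(children)    # child nodes (non-leaf only)
        self.buffers = [deque() for _ in self.children]  # one FIFO per child pointer
        self.basement = {}                # key -> value (leaf only)

def insert(root, key, value):
    if root.height > 0:
        # Step 3: find the key range and enqueue a message; no leaf is touched.
        i = bisect_right(root.pivots, key)
        root.buffers[i].append(("insert", key, value))
    else:
        # Step 4: the root is a leaf, so apply the message directly.
        root.basement[key] = value

leaf_a, leaf_b = Node(0), Node(0)
root = Node(1, pivots=[50], children=[leaf_a, leaf_b])
insert(root, 7, "x")
insert(root, 90, "y")
print(list(root.buffers[0]))  # [('insert', 7, 'x')]
print(leaf_a.basement)        # {}
```

The point of the sketch is that the insert returns after appending to a buffer in the root, while the leaves stay untouched until messages are pushed down later.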
There is something odd here. Under heavy inserts (random or sequential) the Root node fills up frequently and therefore splits a lot. These splits produce many height=1 nodes, which in turn fill up and produce many height=2 nodes, so the tree keeps growing taller.
This oddity hides the secret of why fractal tree writes beat B+ tree writes: every insert returns as soon as it lands in the Root node, without searching down to the BasementNode at the bottom of the tree. Hot data therefore concentrates in the upper levels near the Root (the hot-data distribution looks like the figure above), making full use of locality and greatly reducing disk IO.
Update/Delete behaves much like Insert, with one important caveat: random reads on a fractal tree are slower than on InnoDB's B+ tree. Update/Delete therefore splits into two cases whose measured performance can differ enormously:
Overwrite-style Update/Delete. If the key exists, perform the Update/Delete; if it does not, do nothing and report no error.
Strictly matched Update/Delete. If the key exists, perform the update/delete; if it does not, report an error to the application. In this case we must first check whether the key exists in an ft-index BasementNode, so a Point-Query quietly drags down the Update/Delete.
In addition, ft-index has some optimizations for sequential inserts to improve sequential write performance, such as sequential write acceleration.
Point-Query in the fractal tree
In ft-index, a query like select * from table where id = ? (where id is indexed) is called a Point-Query, and a query like select * from table where id >= ? and id <= ? (where id is indexed) is called a Range-Query. As mentioned above, Point-Query reads are slower than on InnoDB's B+ tree. The Point-Query flow is described in detail here (assume the key being looked up is Key).
1. Load the Root node, binary-search for the key range that Key falls in, and find the child pointer for that range.
2. Load the node the child pointer refers to. If it is a non-leaf node, keep descending the fractal tree until a leaf node is reached. If the current node is a leaf node, stop searching.
Having reached the leaf node, we cannot simply return the Value from its BasementNode. Because fractal tree inserts are delivered as messages, all the messages on the path from the Root to the leaf must first be applied, in order, to the leaf's BasementNode. Once all messages have been applied, the value for the key in the BasementNode is the value the user is looking for.
The lookup flow is basically the same as in InnoDB's B+ tree. The difference is that the fractal tree must push the message buffers on the path from Root to leaf downward and apply the messages to the BasementNode. Note that pushing messages down during a lookup may fill up some nodes on the path, but ft-index does not split or merge leaf nodes during a query. Its design principle is that Insert/Update/Delete are responsible for node Split and Merge, while Select is responsible for the lazy push-down of messages (Lazy Push). In this way the fractal tree lets future Select operations apply Insert/Delete/Update changes to the actual data nodes, completing the update.
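A rough sketch of the lookup described above follows. It is a read-only simplification with made-up names: instead of physically pushing messages down and modifying the BasementNode as ft-index does, it replays the buffered messages for the key on top of the stored value. It assumes that buffers closer to the leaf hold older messages, since messages only ever move downward.

```python
from bisect import bisect_right
from collections import deque

class Node:
    """Toy node: hypothetical structure for illustration only."""
    def __init__(self, height, pivots=(), children=()):
        self.height = height
        self.pivots = list(pivots)
        self.children = list(children)
        self.buffers = [deque() for _ in self.children]  # one FIFO per child pointer
        self.basement = {}                               # leaf only

def point_query(root, key):
    node, path_buffers = root, []
    while node.height > 0:                   # descend from root to leaf
        i = bisect_right(node.pivots, key)
        path_buffers.append(node.buffers[i])
        node = node.children[i]
    value = node.basement.get(key)
    for buf in reversed(path_buffers):       # deepest (oldest) buffer first
        for op, k, v in buf:
            if k == key:
                value = v if op == "insert" else None
    return value

leaf = Node(0)
leaf.basement[7] = "old"
root = Node(1, pivots=[50], children=[leaf, Node(0)])
root.buffers[0].append(("insert", 7, "new"))
print(point_query(root, 7))   # new
```

Even in this toy form the extra cost compared with a B+ tree lookup is visible: every buffer on the path has to be inspected before the answer is known.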
Range-Query in the fractal tree
Now for Range-Query. Put simply, a Range-Query on a fractal tree is roughly equivalent to N Point-Queries, and costs about as much. Because the msg_buffers of non-leaf nodes hold pending updates for BasementNodes, looking up the Value for each Key means going from the root to the leaf and then applying the messages on that path to the Value in the BasementNode. The figure below illustrates this flow.
In a B+ tree, by contrast, the bottom-level leaf nodes are linked by pointers into a doubly linked list, as shown in the figure below. We only need to go from the root to the leaf holding the first Key that satisfies the condition, then keep following next pointers across the leaves to collect every key-value pair in the range. So for a B+ tree Range-Query, apart from the initial root-to-leaf traversal, which is random IO, the subsequent reads can largely be treated as sequential IO.
Comparing the two implementations shows that a Range-Query on a fractal tree is clearly more expensive than on a B+ tree: the fractal tree must traverse the whole subtree under the Root that covers the Range, while the B+ tree needs a single seek to the start Key of the Range, after which iteration is roughly sequential IO.
×ܽá
×ÜÌåÀ´Ëµ£¬·ÖÐÎÊ÷ÊÇÒ»ÖÖдÓÅ»¯µÄÊý¾Ý½á¹¹£¬ËüµÄºËÐÄ˼ÏëÊÇÀûÓýڵãµÄMessageBuffer»º´æ¸üвÙ×÷£¬³ä·ÖÀûÓÃÊý¾Ý¾Ö²¿ÐÔÔÀí£¬
½«Ëæ»úдת»»ÎªË³Ðòд£¬ÕâÑù¼«´óµÄÌá¸ßÁËËæ»úдµÄЧÂÊ¡£TokutekÑз¢ÍŶӵÄiiBench²âÊÔ½á¹ûÏÔʾ£º
TokuDBµÄinsert²Ù×÷(Ëæ»úд)µÄÐÔÄܱÈInnoDB¿ìºÜ¶à£¬¶øSelect²Ù×÷(Ëæ»ú¶Á)µÄÐÔÄܵÍÓÚInnoDBµÄÐÔÄÜ£¬µ«ÊDzî¾à½ÏС£¬Í¬Ê±ÓÉÓÚTokuDB²ÉÓÃÓÐ4MµÄ´óÒ³´æ´¢£¬Ê¹µÃѹËõ±È½Ï¸ß¡£ÕâÒ²ÊÇPercona¹«Ë¾Ðû³ÆTokuDB¸ü¸ßÐÔÄÜ£¬¸üµÍ³É±¾µÄÔÒò¡£
ÁíÍ⣬ÔÚÏ߸üбí½á¹¹(Hot Schema Change)ʵÏÖÒ²ÊÇ»ùÓÚMessageBufferÀ´ÊµÏֵģ¬
µ«ºÍInsert/Delete/Update²Ù×÷²»Í¬µÄÊÇ£¬ ǰÕßµÄÏûÏ¢ÏÂÍÆ·½Ê½Êǹ㲥ʽÏÂÍÆ£¨¸¸½ÚµãµÄÒ»ÌõÏûÏ¢£¬Ó¦Óõ½ËùÓеĶù×ӽڵ㣩£¬
ºóÕßµÄÏûÏ¢ÏÂÍÆ·½Ê½µ¥²¥Ê½ÏÂÍÆ£¨¸¸½ÚµãµÄÒ»ÌõÏûÏ¢£¬Ó¦Óõ½¶ÔÓ¦¼üÖµÇø¼äµÄ¶ù×Ó½Úµã)£¬ ÓÉÓÚʵÏÖÀàËÆÓÚInsert²Ù×÷£¬ËùÒÔ²»ÔÙÕ¹¿ªÃèÊö¡£
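The difference between broadcast and unicast push-down comes down to which child buffers receive a message. A minimal sketch, with a made-up function name and message format:

```python
from bisect import bisect_right

def target_children(pivots, msg):
    """Return the indexes of the child buffers that should receive `msg`.

    `msg` is a (kind, key) pair; the kind names are illustrative only.
    """
    kind, key = msg
    if kind == "hot_schema_change":
        return list(range(len(pivots) + 1))   # broadcast: every child
    return [bisect_right(pivots, key)]        # unicast: the matching key range

print(target_children([3, 8], ("insert", 5)))                # [1]
print(target_children([3, 8], ("hot_schema_change", None)))  # [0, 1, 2]
```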
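TokuDB's multi-version concurrency control (MVCC)
In traditional relational databases (such as Oracle, MySQL and SQL Server), transactions are arguably the core topic of development and discussion, and the core property of transactions is ACID.
A is atomicity: the subtasks that make up a transaction have only two outcomes. Either they all succeed when the transaction commits, or they are all undone when it rolls back.
C is consistency: whether a transaction commits or rolls back, it must not violate the data's consistency constraints, such as unique key constraints and key relationship constraints.
I is isolation, which concerns multiple concurrent transactions: at the same point in time, the data read by transaction t1 and transaction t2 should be isolated. The two transactions are like guests in two rooms of the same hotel, each busy in its own room and unable to see what the other is doing.
D is durability: once a transaction has been reported to the user as committed, then even if the database process or the operating system crashes afterwards, as long as the data on disk is intact, the result of the transaction can still be read after the database restarts.
TokuDB currently supports full ACID transactions. In implementation terms, TokuDB uses a fractal tree as its index while InnoDB uses a B+ tree, so TokuDB implements transactions very differently from InnoDB.
InnoDB has two kinds of logs, redo and undo. The redo log stores physical changes to pages and guarantees durability. The undo log stores logical changes made by transactions; it effectively holds the multiple versions of a record under concurrent transactions and is used for isolation (MVCC) and rollback. TokuDB's fractal tree performs inserts, deletes and updates by message passing, and one message is one version of a transaction's change to a record. The TokuDB source therefore has no separate undo-log concept or implementation; in its place is a mechanism for managing multiple messages per record. Multiple messages per record can implement MVCC but cannot handle rollback, so TokuDB additionally has the tokudb.rollback log file to help implement transaction rollback.
This part focuses on how TokuDB implements transaction isolation, i.e. the often-mentioned multi-version concurrency control (MVCC).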
Transaction representation in TokuDB
In TokuDB, a transaction run by the user is broken up at the storage engine level into many small transactions (each denoted txn).
For example, suppose the user runs a transaction like this:

The corresponding records in the TokuDB storage engine's redo-log are:

The corresponding transaction tree is shown in the figure below:

A slightly more complex example, a transaction with a savepoint:

The corresponding redo-log records are:

The transaction tree formed by this transaction is:

In TokuDB the pair {parent_id, child_id} records the dependency between one txn and another. The list of ids from the root transaction down to a leaf node therefore identifies a txn uniquely. This list is called xids, and I think xids can also be called the transaction number. For example, txn3 has xids = {17, 2, 3}, txn2 has xids = {17, 2}, txn1 has xids = {17, 1}, and txn0 has xids = {17, 0}.
So every operation in a transaction (xbegin/xcommit/enq_insert/xprepare) has an xids identifying the transaction it belongs to.
Every message in TokuDB (insert/delete/update) carries such an xids. It plays a very important role in TokuDB's implementation, and the functionality built around it is particularly complex.
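One way to picture xids is as the path from the root transaction down to a given txn. The sketch below rebuilds the xids listed above from a parent map. The tree shape (root 17 with children 0, 1 and 2, and txn3 nested under txn2) is inferred from those xids, and the function and variable names are made up for the example.

```python
def xids(txn_id, parent_of, root_id):
    """Return the list of ids from the root transaction down to `txn_id`."""
    path = []
    while txn_id != root_id:
        path.append(txn_id)
        txn_id = parent_of[txn_id]
    path.append(root_id)
    return path[::-1]

# child id -> parent id, assumed from the xids given in the text
parent_of = {0: 17, 1: 17, 2: 17, 3: 2}
print(xids(3, parent_of, 17))  # [17, 2, 3]
print(xids(1, parent_of, 17))  # [17, 1]
```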
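Transaction manager
The transaction manager manages the set of all transactions in the TokuDB storage engine. It mainly maintains the following:
The live transaction list. It records only root transactions, because all child transactions of a transaction tree can be found from its root. The list holds all root transactions that have begun but not yet ended at the current point in time.
The snapshot read transaction list.
The referenced list of live transactions (referenced_xids). This concept is a little hard to grasp. Suppose a live transaction begins (xbegin) at time point begin_id and commits (xcommit) at time point end_id. Then referenced_xids maintains the pair (begin_id, end_id). The pair makes it possible to find all the transactions that were live during a transaction's whole lifetime, and it is used mainly for the full gc operation mentioned later.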
The fractal tree LeafEntry
As described above for the fractal tree structure, operations such as insert/delete/update end up with all messages from root to leaf being applied to the LeafNode. Before describing the apply process in detail, here is the storage structure of the LeafNode.
Put simply, a leafNode consists of multiple leafEntry items. Each leafEntry is a key-value entry of the form {k, v1, v2, …}, where v1, v2, … are multiple versions of the value for one key. The detailed structure of the leafEntry for a key is shown in the figure below.

As the figure shows, a leafEntry is really a stack. The bottom segment [0~5] holds the Values of committed transactions. The top segment [6~9] holds the currently live, uncommitted transactions. A single stack element is a 4-tuple (txid, type, len, data) giving the value associated with that transaction. More generally, the segment [0, cxrs-1] holds committed transactions. Committed transactions would not normally need to stay on the stack; they remain because other transactions reference them through snapshot reads. So the segment [0, cxrs-1] is not reclaimed until all the transactions that reference it have committed. The segment [cxrs, cxrs+pxrs-1] holds the currently live, uncommitted transactions. As these commit, cxrs moves forward until it reaches the top of the stack.
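The stack layout can be modelled with a list and two counters. This is an illustrative sketch only: the class and method names are invented, and only the (txid, type, len, data) tuple and the cxrs/pxrs counters come from the description above.

```python
from typing import List, NamedTuple

class Version(NamedTuple):
    txid: int
    type: str     # e.g. "insert" or "delete"
    len: int
    data: bytes

class LeafEntry:
    def __init__(self, key):
        self.key = key
        self.stack: List[Version] = []  # index 0 is the bottom of the stack
        self.cxrs = 0                   # number of committed versions
        self.pxrs = 0                   # number of uncommitted (provisional) versions

    def committed(self):
        return self.stack[:self.cxrs]                        # [0, cxrs-1]

    def provisional(self):
        return self.stack[self.cxrs:self.cxrs + self.pxrs]   # [cxrs, cxrs+pxrs-1]

# Mirror the figure: slots 0..5 committed, slots 6..9 still uncommitted.
entry = LeafEntry(key=b"k")
for txid in range(100, 110):
    payload = b"v%d" % txid
    entry.stack.append(Version(txid, "insert", len(payload), payload))
entry.cxrs, entry.pxrs = 6, 4
print(len(entry.committed()), len(entry.provisional()))  # 6 4
```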
MVCC implementation
1) Writes
Here we count three kinds of write: insert, delete and commit. For insert and delete, it is enough to push one element onto the top of the LeafEntry stack, as shown in the figure below:

For commit, the top element of the LeafEntry stack is placed at the position of the cxrs pointer and the stack top pointer is then pulled back, as shown in the figure below:
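The two write paths can be sketched on a plain Python list. This is one reading of the description above, not the ft-index code: the class and method names are invented, and the sketch assumes that on commit the provisional elements below the top are superseded by the top element and dropped.

```python
class LeafEntry:
    def __init__(self):
        self.stack = []   # bottom ... top; elements are (txid, type, data)
        self.cxrs = 0     # stack[:cxrs] is committed, the rest is uncommitted

    def write(self, txid, type_, data=None):
        """insert / delete: push one uncommitted element onto the stack top."""
        self.stack.append((txid, type_, data))

    def commit(self):
        """Place the top element at position cxrs, then pull the stack top back."""
        top = self.stack[-1]
        del self.stack[self.cxrs:]   # drop every uncommitted element
        self.stack.append(top)       # the former top now sits at index cxrs
        self.cxrs += 1

le = LeafEntry()
le.write(10, "insert", b"a")
le.commit()
le.write(11, "insert", b"b")   # outer transaction
le.write(12, "insert", b"c")   # nested child transaction
le.commit()
print(le.stack)  # [(10, 'insert', b'a'), (12, 'insert', b'c')]
print(le.cxrs)   # 2
```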

2) Reads
For reads, databases generally support several isolation levels. MySQL's InnoDB supports READ UNCOMMITTED (RU), REPEATABLE READ (RR), READ COMMITTED (RC) and SERIALIZABLE (S). RU allows dirty reads (reading data from uncommitted transactions), and RC/RR/RU allow phantom reads (here meaning that a transaction, when updating, may touch records that other transactions have already committed).
TokuDB supports the same four isolation levels. In the source code, ft-index groups transactional reads into three classes by isolation level:
TXN_SNAPSHOT_NONE: no snapshot read is needed. SERIALIZABLE and READ UNCOMMITTED belong to this class.
TXN_SNAPSHOT_ROOT: REPEATABLE READ belongs to this class. Here the transaction only needs to read records committed before the xid of its root transaction.
TXN_SNAPSHOT_CHILD: READ COMMITTED belongs to this class. Here a child transaction A must find its snapshot version from its own xid, because another transaction B may have made and committed an update by the time A starts, in which case A must read the result of B's update.
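The classification above amounts to a small lookup table. In the sketch below the three names come from the text, while the numeric enum values and the dictionary name are assumptions made for the example.

```python
from enum import Enum

class SnapshotType(Enum):
    TXN_SNAPSHOT_NONE = 0    # numeric values are arbitrary here
    TXN_SNAPSHOT_ROOT = 1
    TXN_SNAPSHOT_CHILD = 2

SNAPSHOT_BY_ISOLATION = {
    "SERIALIZABLE": SnapshotType.TXN_SNAPSHOT_NONE,
    "READ UNCOMMITTED": SnapshotType.TXN_SNAPSHOT_NONE,
    "REPEATABLE READ": SnapshotType.TXN_SNAPSHOT_ROOT,
    "READ COMMITTED": SnapshotType.TXN_SNAPSHOT_CHILD,
}

print(SNAPSHOT_BY_ISOLATION["REPEATABLE READ"].name)  # TXN_SNAPSHOT_ROOT
```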
Reclaiming multi-version records
Over time more and more old transactions commit and new ones start. The number of committed transactions in the LeafNodes of the fractal tree keeps growing. Unless these expired transaction records are cleaned up, BasementNodes take up a lot of space and TokuDB's data files fill with useless data.
In TokuDB, cleaning up these expired transactions is called garbage collection (Garbage Collection).
InnoDB has a similar process. In InnoDB the multiple versions of a Key's Value are stored in undo log pages. When transactions expire, a background purge thread is responsible for cleaning them up, freeing undo log pages for later transactions and keeping the undo log from growing without bound.
The TokuDB storage engine has no purge thread like InnoDB's, because expired transactions are garbage-collected as a side effect of update operations. Whenever an Insert/Delete/Update runs, it checks whether the current LeafEntry meets the GC conditions; if so, it removes the expired transactions from the LeafEntry and compacts the LeafEntry's memory. In the TokuDB source, GC comes in two types:
Simple GC: every time a message is applied to a leafentry it carries a gc_info, which contains the field oldest_referenced_xid. So what does simple_gc do? It performs a simple GC pass that clears the committed transaction list outright (keeping one committed record, otherwise the committed value could no longer be found by the next lookup). That is simple_gc: simple, blunt and efficient.
Full GC: the trigger conditions and the flow of full gc are both more complex, but the aim is likewise to clean up expired committed transactions. It is not covered further here.
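The idea behind simple GC can be sketched as follows. This is a guess at the rule rather than the actual TokuDB code: it assumes the committed versions are ordered by ascending txid, and that any version older than the newest one below oldest_referenced_xid can no longer be seen by a live snapshot. The function name and data layout are made up for the example.

```python
def simple_gc(committed, oldest_referenced_xid):
    """Drop committed versions that no live snapshot can still need.

    `committed` is a list of (txid, value) pairs, oldest first. At least one
    committed version is always kept so the key can still be looked up.
    """
    keep_from = 0
    for i, (txid, _value) in enumerate(committed):
        if txid < oldest_referenced_xid:
            keep_from = i   # newest version already visible to every live txn
    return committed[keep_from:]

versions = [(5, "a"), (8, "b"), (12, "c"), (20, "d")]
print(simple_gc(versions, 15))   # [(12, 'c'), (20, 'd')]
print(simple_gc(versions, 100))  # [(20, 'd')]
```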
×ܽá
±¾ÎÄ´óÖ½éÉÜÁËTokuDBÊÂÎñµÄ¸ôÀëÐÔʵÏÖÔÀí£¬ °üÀ¨TokuDBµÄÊÂÎñ±íʾ¡¢·ÖÐÎÊ÷µÄLeafEntryµÄ½á¹¹¡¢MVCCµÄʵÏÖÁ÷³Ì¡¢¶à°æ±¾¼Ç¼»ØÊÕ·½Ê½ÕâЩ·½ÃæµÄÄÚÈÝ¡£
TokuDBÖ®ËùÓÐûÓÐundo log£¬¾ÍÊÇÒòΪ·ÖÐÎÊ÷ÖеĸüÐÂÏûÏ¢±¾Éí¾Í¼Ç¼ÁËÊÂÎñµÄ¼Ç¼°æ±¾¡£ÁíÍ⣬
TokuDBµÄ¹ýÆÚÊÂÎñ»ØÊÕÒ²²»ÐèÒªÏñInnoDBÄÇÑùרÃÅ¿ªÆôÒ»¸öºǫ́Ïß³ÌÒì²½»ØÊÕ£¬¶øÊDzÅÓÃÔÚ¸üвÙ×÷Ö´ÐеĹý³ÌÖзÖ̯»ØÊÕ¡£×ÜÖ®£¬ÓÉÓÚTokuDB»ùÓÚ·ÖÐÎÊ÷Ö®ÉÏʵÏÖÊÂÎñ£¬Òò¶ø¸÷·½ÃæµÄ˼·¶¼ÓдóµÄ²îÒ죬ÕâÒ²ÊÇTokuDBÍŶӵĴ´Ð°ɡ£