Introduction to HBase
HBase (Hadoop Database) is a highly reliable, high-performance, column-oriented, scalable distributed storage system. With HBase, a large-scale structured storage cluster can be built on inexpensive PC servers.
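HBase is an open-source implementation of Google Bigtable, and the two map onto each other closely. Google Bigtable uses GFS as its file storage system, while HBase uses Hadoop HDFS. Google runs MapReduce to process the massive data sets in Bigtable, and HBase likewise uses Hadoop MapReduce to process the massive data sets in HBase. Google Bigtable uses Chubby as its coordination service, and HBase uses Zookeeper as the counterpart.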

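The figure above shows the layers of the Hadoop EcoSystem. HBase sits in the structured storage layer. Hadoop HDFS gives HBase highly reliable underlying storage, Hadoop MapReduce gives HBase high-performance computing capacity, and Zookeeper gives HBase a stable coordination service and a failover mechanism.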
In addition, Pig and Hive provide high-level language support for HBase, which makes statistical processing of data in HBase very simple.
Sqoop provides a convenient RDBMS import facility for HBase, which makes it easy to migrate data from traditional databases into HBase.
HBase Access Interfaces
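1. Native Java API: the most conventional and efficient access method, well suited to Hadoop MapReduce jobs that batch-process HBase table data in parallel.
2. HBase Shell: HBase's command-line tool and its simplest interface, suited to HBase administration.
3. Thrift Gateway: uses Thrift serialization to support C++, PHP, Python, and other languages, suited to online access to HBase table data from heterogeneous systems.
4. REST Gateway: exposes a REST-style HTTP API for accessing HBase, which removes language restrictions.
5. Pig: data in HBase can be manipulated with the Pig Latin dataflow language. As with Hive, scripts are ultimately compiled into MapReduce jobs that process HBase table data, which makes Pig suitable for data statistics.
6. Hive: the current Hive release does not yet support HBase, but support is planned for the next release, Hive 0.7.0, which will allow HBase to be accessed with a SQL-like language.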
HBase Data Model
Table & Column Family

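1. Row Key: the primary key of a Table. Records in a Table are sorted by Row Key.
2. Timestamp: the timestamp associated with each data operation. It can be regarded as the version number of the data.
3. Column Family: horizontally, a Table consists of one or more Column Families. A Column Family can hold any number of Columns, that is, Column Families support dynamic extension: neither the number nor the types of Columns need to be defined in advance. All Columns are stored in binary format, so users must perform their own type conversion.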
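The three concepts above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the `Table` class, its methods, and the `family:qualifier` column naming are hypothetical names chosen for illustration and are not the HBase client API.

```python
from collections import defaultdict


class Table:
    """Toy model of an HBase table: row key -> column -> {timestamp: value}."""

    def __init__(self, families):
        self.families = set(families)  # column families are declared up front
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Columns inside a family are created on demand; values are raw bytes.
        self.rows[row][column][ts] = value

    def get(self, row, column, ts=None):
        """Return the newest version, or the newest version at or before ts."""
        versions = self.rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if ts is None or t <= ts]
        return versions[max(candidates)] if candidates else None

    def scan(self):
        """Return the row keys in sorted order."""
        return sorted(self.rows)
```

Because values are plain bytes, a caller that stores an integer has to encode and decode it explicitly, for example with `int.to_bytes` and `int.from_bytes`.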
Table & Region
As a Table grows with the number of records, it is gradually split into multiple pieces called regions. A region is identified by the range [startkey, endkey), and the Master assigns different regions to the appropriate RegionServers for management:

-ROOT- & .META. Table
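HBase has two special Tables, -ROOT- and .META.:
1. .META.: records the Region information of user tables. .META. itself can have multiple regions.
2. -ROOT-: records the Region information of the .META. table. -ROOT- has only one region.
3. Zookeeper records the location of the -ROOT- table.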

Before accessing user data, a Client must first contact Zookeeper, then read the -ROOT- table, then read the .META. table, and only then does it know where the user data lives. This takes several network round trips, but the client caches the results.
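The lookup chain can be sketched as follows. This is a simplified simulation, not the HBase client: `RegionMap`, `Client`, the server names, and caching per row key (rather than per region range) are all assumptions made for illustration.

```python
import bisect


class RegionMap:
    """Sorted regions, each covering the key range [start_key, next_start_key)."""

    def __init__(self, regions):
        # regions: list of (start_key, location); the first start key is b"".
        self.regions = sorted(regions)
        self.starts = [start for start, _ in self.regions]

    def locate(self, key):
        # The owning region is the last one whose start key is <= key.
        i = bisect.bisect_right(self.starts, key) - 1
        return self.regions[i][1]


class Client:
    def __init__(self, zookeeper, tables):
        self.zookeeper = zookeeper  # holds the location of -ROOT-
        self.tables = tables        # location -> RegionMap (simulated servers)
        self.cache = {}
        self.network_calls = 0

    def _fetch(self, location, key):
        self.network_calls += 1
        return self.tables[location].locate(key)

    def locate(self, row):
        if row in self.cache:
            return self.cache[row]       # cached: no network round trip
        self.network_calls += 1          # ask Zookeeper where -ROOT- lives
        root = self.zookeeper["-ROOT-"]
        meta = self._fetch(root, row)    # -ROOT- names the .META. region
        server = self._fetch(meta, row)  # .META. names the user region's server
        self.cache[row] = server
        return server
```

In this sketch an uncached lookup costs three simulated network calls, while a repeated lookup for the same row costs none.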
MapReduce on HBase
For running batch computations on HBase, the most convenient and practical model is still MapReduce, as shown in the figure below:

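The relationship between an HBase Table and its Regions resembles that between an HDFS File and its Blocks. HBase provides the matching TableInputFormat and TableOutputFormat APIs, which make it easy to use an HBase Table as the Source and Sink of a Hadoop MapReduce job. Developers of MapReduce jobs mostly do not need to care about the internals of HBase itself.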
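The analogy between Regions and Blocks can be made concrete with a toy simulation in which each region becomes one input split for a map task. This is a sketch under that assumption only. It does not use Hadoop or TableInputFormat, and `region_splits`, `map_task`, and `reduce_task` are hypothetical names.

```python
from collections import Counter


def region_splits(rows, start_keys):
    """Group (row_key, value) pairs into one split per region.

    start_keys must be sorted and begin with b"" so every key has a region.
    """
    splits = [[] for _ in start_keys]
    for key, value in sorted(rows):
        # Pick the last region whose start key is <= key.
        idx = max(i for i, start in enumerate(start_keys) if start <= key)
        splits[idx].append((key, value))
    return splits


def map_task(split):
    """Map phase: count the values found in a single region."""
    return Counter(value for _, value in split)


def reduce_task(partials):
    """Reduce phase: merge the per-region counts."""
    total = Counter()
    for partial in partials:
        total += partial
    return total
```

Each call to `map_task` sees only the rows of one region, so the map phase could run in parallel on the servers that host those regions.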
HBase System Architecture

Client
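The HBase Client uses HBase's RPC mechanism to communicate with the HMaster and the HRegionServers. For administrative operations, the Client makes RPC calls to the HMaster. For data read and write operations, the Client makes RPC calls to the HRegionServers.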
Zookeeper
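Besides storing the address of the -ROOT- table and the address of the HMaster, the Zookeeper Quorum is where each HRegionServer registers itself as an Ephemeral node, so that the HMaster can detect the health of every HRegionServer at any time. Zookeeper also keeps the HMaster from being a single point of failure, as described below.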
HMaster
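The HMaster is not a single point of failure. Several HMasters can be started in an HBase cluster, and Zookeeper's Master Election mechanism ensures that one Master is always running. Functionally, the HMaster is mainly responsible for managing Tables and Regions:
1. Managing user operations that create, delete, modify, and query Tables
2. Managing load balancing across HRegionServers and adjusting the distribution of Regions
3. Assigning the new Regions after a Region Split
4. Migrating the Regions of a failed HRegionServer after it goes down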
HRegionServer
The HRegionServer is mainly responsible for serving user I/O requests and for reading and writing data in the HDFS file system. It is the core module of HBase.

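Internally, an HRegionServer manages a set of HRegion objects. Each HRegion corresponds to one Region of a Table and is made up of multiple HStores. Each HStore holds the storage for one Column Family of the Table, so each Column Family is in effect a single, centralized storage unit. For best efficiency, columns with similar I/O characteristics should therefore be placed in the same Column Family.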
The HStore is the core of HBase storage. It consists of two parts: the MemStore and the StoreFiles.
The MemStore is a sorted memory buffer. Data written by users goes into the MemStore first. When the MemStore is full, it is flushed to a StoreFile (implemented underneath as an HFile). When the number of StoreFiles grows past a threshold, a Compact operation is triggered that merges multiple StoreFiles into one, merging versions and removing deleted data along the way. HBase therefore only ever appends data: all updates and deletes are actually carried out during later compactions. As a result, a user's write can return as soon as it reaches memory, which is what gives HBase its high I/O performance. As StoreFiles are compacted, they grow larger and larger. Once a single StoreFile exceeds a size threshold, a Split operation is triggered and the current Region is split into two Regions. The parent Region goes offline, and the HMaster assigns the two new child Regions to the appropriate HRegionServers, so that the load of the original Region is spread over two Regions. The figure below illustrates the compaction and split process.
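The flush and compaction behavior described above can be sketched with a toy store. This is a simplified model, not HBase's implementation: the `Store` class, its thresholds (counted in cells and files rather than bytes), and keeping only a single newest version per row are assumptions for illustration. Region splitting is not modeled.

```python
PUT, DELETE = "put", "delete"


class Store:
    """Toy HStore: one in-memory buffer plus a list of immutable store files."""

    def __init__(self, flush_size=2, compact_threshold=3):
        self.flush_size = flush_size  # hypothetical thresholds
        self.compact_threshold = compact_threshold
        self.memstore = {}            # (row, ts) -> (kind, value)
        self.store_files = []         # each file is a sorted list of cells

    def write(self, row, ts, kind, value=None):
        # Updates and deletes are both plain appends of a new cell.
        self.memstore[(row, ts)] = (kind, value)
        if len(self.memstore) >= self.flush_size:
            self.flush()

    def flush(self):
        if not self.memstore:
            return
        self.store_files.append(sorted(self.memstore.items()))
        self.memstore = {}
        if len(self.store_files) >= self.compact_threshold:
            self.compact()

    def compact(self):
        newest = {}
        for store_file in self.store_files:
            for (row, ts), cell in store_file:
                if row not in newest or ts > newest[row][0]:
                    newest[row] = (ts, cell)
        # Keep the newest version of each row and drop rows whose newest
        # cell is a delete marker.
        merged = sorted(
            ((row, ts), cell)
            for row, (ts, cell) in newest.items()
            if cell[0] == PUT
        )
        self.store_files = [merged]

    def read(self, row):
        cells = list(self.memstore.items())
        for store_file in self.store_files:
            cells.extend(store_file)
        versions = [(ts, cell) for (r, ts), cell in cells if r == row]
        if not versions:
            return None
        kind, value = max(versions, key=lambda v: v[0])[1]
        return value if kind == PUT else None
```

Until a compaction runs, a deleted row still occupies space in the older store files. Reads hide it only because the newest cell for that row is a delete marker.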

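With the basics of the HStore in place, the role of the HLog also needs to be understood. The HStore works fine as long as the system is running normally, but in a distributed environment errors and crashes cannot be avoided. If an HRegionServer exits unexpectedly, the in-memory data in its MemStore is lost, and this is why the HLog is needed. Each HRegionServer has one HLog object. HLog is a class that implements a Write Ahead Log: every time a user operation is written to the MemStore, a copy of the data is also written to the HLog file (the HLog file format is described later). The HLog file is rolled periodically, and old files whose data has already been persisted to StoreFiles are deleted. When an HRegionServer terminates unexpectedly, the HMaster learns of it through Zookeeper. The HMaster first processes the leftover HLog files, splitting the log data by Region and placing each part in the directory of the corresponding region. It then reassigns the failed regions. While loading a Region, the HRegionServer that receives it discovers that there is historical HLog data to process, so it replays the HLog data into the MemStore and then flushes it to StoreFiles, which completes the recovery.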
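The recovery path can be sketched in a few lines. This is a toy model under stated assumptions: `RegionServer`, `split_log`, and `replay` are hypothetical names, the log is an in-memory list standing in for an HLog file on HDFS, and flushing and log rolling are left out.

```python
class RegionServer:
    """Toy region server: every write goes to the log and to the MemStore."""

    def __init__(self):
        self.log = []       # stands in for the HLog file, which survives a crash
        self.memstore = {}  # in-memory only, lost on a crash
        self.seq = 0

    def put(self, region, row, value):
        self.seq += 1
        # Write-ahead: record the edit in the log before applying it in memory.
        self.log.append((self.seq, region, row, value))
        self.memstore[(region, row)] = value


def split_log(log):
    """Group the entries of a dead server's log by region."""
    by_region = {}
    for seq, region, row, value in log:
        by_region.setdefault(region, []).append((seq, row, value))
    return by_region


def replay(entries):
    """Rebuild one region's MemStore by replaying its entries in sequence order."""
    memstore = {}
    for _, row, value in sorted(entries):
        memstore[row] = value
    return memstore
```

Replaying in sequence-number order matters: when the same row was written twice, the later edit must win, just as it did before the crash.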
HBase Storage Formats
All data files in HBase are stored on the Hadoop HDFS file system. They fall into the two file types mentioned above:
1. HFile: the storage format for KeyValue data in HBase. An HFile is a Hadoop binary-format file. A StoreFile is just a lightweight wrapper around an HFile, that is, a StoreFile is an HFile underneath.
2. HLog File: the storage format for the WAL (Write Ahead Log) in HBase. Physically it is a Hadoop Sequence File.
HFile
The figure below shows the HFile storage format:

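First, an HFile is variable in length. Only two of its blocks have a fixed length: the Trailer and the File Info. As the figure shows, the Trailer contains pointers to the starting points of the other blocks. File Info records some meta information about the file, for example AVG_KEY_LEN, AVG_VALUE_LEN, LAST_KEY, COMPARATOR, and MAX_SEQ_ID_KEY. The Data Index and Meta Index blocks record the starting point of each Data block and Meta block.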
The Data Block is the basic unit of HBase I/O. To improve efficiency, the HRegionServer has an LRU-based Block Cache. The size of each Data block can be set through a parameter when a Table is created. Larger blocks favor sequential Scans, and smaller blocks favor random lookups.
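The idea of an LRU block cache can be sketched as follows. This is a generic LRU illustration, not HBase's Block Cache: the `BlockCache` class, its capacity counted in blocks, and the `read_block` loader are assumptions.

```python
from collections import OrderedDict


class BlockCache:
    """Minimal LRU cache keyed by block offset."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity      # maximum number of cached blocks
        self.read_block = read_block  # hypothetical loader: offset -> bytes
        self.blocks = OrderedDict()
        self.misses = 0

    def get(self, offset):
        if offset in self.blocks:
            self.blocks.move_to_end(offset)  # mark as most recently used
            return self.blocks[offset]
        self.misses += 1
        block = self.read_block(offset)
        self.blocks[offset] = block
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        return block
```

With a capacity of two blocks, reading offsets 0, 64, 0, and 128 in that order evicts the block at offset 64, because offset 0 was touched more recently.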
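Apart from the Magic at its start, each Data block is simply a sequence of concatenated KeyValue pairs. The Magic consists of some random numbers whose purpose is to guard against data corruption. The internal structure of each KeyValue pair is described in detail below.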
Each KeyValue pair in an HFile is a simple byte array, but this byte array contains many fields and has a fixed structure. Let's look at that structure:

It begins with two fixed-length numbers that give the length of the Key and the length of the Value. Next comes the Key: first a fixed-length number giving the length of the RowKey, then the RowKey itself, then a fixed-length number giving the length of the Family, then the Family, then the Qualifier, and finally two fixed-length numbers for the Time Stamp and the Key Type (Put/Delete). The Value part has no such structure and is just raw binary data.
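The layout described above can be sketched with Python's `struct` module. The text does not give the field widths, so those used here (4-byte key and value lengths, a 2-byte row length, a 1-byte family length, an 8-byte timestamp, a 1-byte key type) and the type codes 4 for Put and 8 for Delete are assumptions based on my understanding of HBase's KeyValue class. The function names are hypothetical.

```python
import struct

PUT, DELETE = 4, 8  # key type codes, assumed to match HBase's KeyValue.Type


def encode_keyvalue(row, family, qualifier, timestamp, key_type, value):
    key = (
        struct.pack(">H", len(row)) + row          # row length, then row
        + struct.pack(">B", len(family)) + family  # family length, then family
        + qualifier                                # qualifier has no length prefix
        + struct.pack(">qB", timestamp, key_type)  # timestamp, then key type
    )
    # Two fixed-length integers lead the record: key length and value length.
    return struct.pack(">ii", len(key), len(value)) + key + value


def decode_keyvalue(buf):
    key_len, value_len = struct.unpack_from(">ii", buf, 0)
    pos = 8
    key_end = pos + key_len
    (row_len,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    row = buf[pos:pos + row_len]
    pos += row_len
    (family_len,) = struct.unpack_from(">B", buf, pos)
    pos += 1
    family = buf[pos:pos + family_len]
    pos += family_len
    # The qualifier fills whatever remains before the 9-byte timestamp/type tail.
    qualifier = buf[pos:key_end - 9]
    timestamp, key_type = struct.unpack_from(">qB", buf, key_end - 9)
    value = buf[key_end:key_end + value_len]
    return row, family, qualifier, timestamp, key_type, value
```

The qualifier needs no length field of its own because its length follows from the key length once the other fields, all of known size, are subtracted.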
HLogFile

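The figure above shows the structure of an HLog file. An HLog file is just an ordinary Hadoop Sequence File. The Key of the Sequence File is an HLogKey object, which records where the written data belongs: besides the table and region names, it also includes a sequence number and a timestamp. The timestamp is the write time. The sequence number starts at 0, or at the most recent sequence number that was persisted to the file system.
The Value of the HLog Sequence File is an HBase KeyValue object, which corresponds to the KeyValue in an HFile as described above.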
Conclusion
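This article has given a broad introduction to the features and design of HBase. Because of limited space, it does not go deep into HBase's technical details. The storage system of 一淘 (eTao) is currently built on HBase, and a follow-up article will introduce the "eTao distributed storage system" and use real cases to say more about HBase in practice.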