Editor's note:
This article comes from CSDN. It introduces the architecture and principles of Ceph through a step-by-step record of learning and using it, and we hope it helps your study.
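Introduction to Ceph
Ceph is a distributed storage system that provides object, block, and file storage. It is a free, open-source storage solution that can be deployed on commodity x86-compatible servers and can address the I/O problems of unified storage. Ceph was born in 2004 as Sage Weil's PhD research project on storage systems, which aimed to develop a next-generation, high-performance distributed file system. With the growth of cloud computing, Ceph rode the wave of OpenStack and became one of the most closely watched projects in the open-source community.
The system is designed to be self-healing and intelligently managed, in order to reduce administrator workload and budget costs.
Its goal: a fully distributed storage system with no single point of failure, with fault-tolerant data and seamless replication, scalable to the exabyte level (EB, PB, TB, GB).
Ceph supports block, file, and object interfaces at the same time, scales to the petabyte level, and by specification can be deployed on thousands of general-purpose servers. Data written through the S3 and Swift object interfaces can be read through either one.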
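Advantages of Ceph
CRUSH algorithm
The CRUSH algorithm is one of Ceph's two major innovations. Put simply, Ceph abandons the traditional approach of looking up data locations through centrally stored metadata and uses the CRUSH algorithm to compute data locations instead. Building on consistent hashing, CRUSH takes the isolation of failure domains into account and can implement replica placement rules for all kinds of workloads, such as cross-datacenter placement and rack awareness. CRUSH is highly scalable and in theory supports thousands of storage nodes.
High availability
The number of data replicas in Ceph can be defined by the administrator, and the CRUSH algorithm can specify the physical location of each replica so as to separate failure domains. Ceph supports strong data consistency.
Ceph tolerates many failure scenarios and automatically attempts to repair them in parallel.
High scalability
Ceph differs from Swift, where every client read and write must pass through a proxy node; once cluster concurrency grows, the proxy nodes easily become a bottleneck. Ceph itself has no master control node, so it is relatively easy to scale out, and in theory its performance grows linearly with the number of disks.
Rich features
Ceph supports three access interfaces: object storage, block storage, and file system mounts. All three can be used together. In the cloud environments of some companies in China, Ceph is commonly used as the sole back-end storage for OpenStack to improve data transfer efficiency.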
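Ceph storage architecture
A Ceph system can be roughly divided into two parts, the client and the server. The client side includes four kinds of interfaces; the server side includes the metadata server, the object storage cluster, and the cluster monitors:
Client
The client provides the interfaces that users work with. There are currently three storage interfaces: object storage RGW (RADOS Gateway), block storage RBD (RADOS Block Device), and file storage CephFS.
Block storage and file storage are both implemented as wrappers around object storage, so underneath they are still object storage.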
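Object storage (RGW: RADOS Gateway)
The Ceph object storage service provides a REST-style API with interfaces compatible with Amazon S3 and OpenStack Swift. It is key-value storage in the usual sense, and its interface consists of simple GET, PUT, and DEL operations plus other extensions.
RADOSGW is a gateway based on the widely used RESTful protocol, and it is compatible with S3 and Swift.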
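Block storage (RBD: RADOS Block Device)
RBD provides a distributed block device through a Linux kernel client and a QEMU/KVM driver.
RBD offers block storage to applications through the librbd library, mainly to provide virtual disks for virtual machines on cloud platforms. RBD resembles traditional SAN storage and provides block-level access.
RBD currently provides two interfaces. One is implemented directly in user space and is used by KVM virtual machines through the QEMU driver. The other is a kernel module implemented in the operating system kernel; through it a block device can be mapped to a physical host, which then accesses it directly.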
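File storage (CephFS)
CephFS provides a POSIX-compatible file system through a Linux kernel client and FUSE.
The Ceph file system service provides a POSIX-compatible file system that can be mounted directly as a user-space file system. It is the same kind of thing as a traditional file system such as ext4; the difference is that distributed storage adds the ability to work in parallel.
Native interface
Besides the three storage interfaces above, applications can also use the native librados interface and talk to RADOS directly.
The advantage of the native interface is that it integrates directly with application code, which makes working with data convenient. Its drawback is that it does not split uploaded data on its own: a 1 GB object uploaded this way lands on Ceph's storage disks as a single 1 GB file.
The three interfaces above, by contrast, do split data (that is, they perform file striping).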
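Server
Metadata server
Mainly implements distributed management of the cluster's metadata.
Object storage cluster
Because all three of Ceph's storage interfaces are implemented on top of object storage, the object storage cluster stores both data and metadata as objects and performs other key functions.
The core component of the object storage cluster is RADOS (Reliable, Autonomic Distributed Object Store).
Cluster monitor
Performs monitoring, keeps the cluster running healthily, and raises alerts.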
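Interaction between client and server
Their structure and interactions are shown in the figure:

From the architecture diagram, the lowest layer is RADOS. RADOS is itself a complete distributed object storage system that is reliable, intelligent, and distributed. All of Ceph's storage functions are built on RADOS, so Ceph's high reliability, scalability, performance, and automation are all provided by this layer, and user data is ultimately stored through it as well. RADOS can fairly be called the core of Ceph.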
The RADOS system consists mainly of two parts: OSDs and Monitors.
RADOS is developed in C++, and the native librados API it provides comes in C and C++ variants. Ceph's upper-layer applications call the local librados API, which then communicates over sockets with the other nodes in the RADOS cluster to carry out operations.
The layer above RADOS is LIBRADOS. LIBRADOS is a library that lets applications interact with the RADOS system, and it supports several programming languages such as C, C++, and Python.
Three components are built on top of the LIBRADOS layer: RADOSGW, RBD, and CephFS.
RADOS Gateway and RBD provide higher-level interfaces on top of the librados library that are more abstract and more convenient for applications or clients.
RADOS GW is a gateway that provides a RESTful API compatible with Amazon S3 and Swift, for developing object storage applications. RBD provides a standard block device interface and is commonly used to create volumes for virtual machines in virtualization scenarios.
Red Hat has already integrated the RBD driver into KVM/QEMU to improve virtual machine access performance.
These two approaches are currently the most widely used in cloud computing.
CephFS provides a POSIX interface that users can mount directly through a client. The kernel client runs in kernel mode, so it does not need to call the user-space librados library; it interacts with RADOS through the kernel's networking module.
Physical deployment of Ceph

The server-side RADOS cluster consists mainly of two kinds of nodes: a large number of OSDs (Object Storage Devices), which store and maintain data, and a handful of monitors, which detect and maintain the system state.
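Monitor
The Monitor cluster provides global configuration information for the whole storage system, such as node information, and keeps that data consistent using the Paxos algorithm.
OSD
A Pool is a logical partition for storing objects. It defines the type of data redundancy and the corresponding replica distribution policy. Two types are supported: replicated and erasure coded. The pools used inside our company are currently all replicated (3 replicas).
A PG (placement group) is a group of objects that share the same placement policy. Put simply, objects in the same PG are placed on the same disks.
The PG is a core concept in Ceph: it is the smallest unit of data balancing and recovery on the server side.
An OSD is the process responsible for physical storage. It is usually configured one-to-one with disks, with one OSD process started per disk.
The figure below illustrates the relationships among them:
A Pool contains many PGs.
A PG contains a set of objects; an object can belong to only one PG.
PGs have primary and replica copies, and one PG is spread across different OSDs (for the three-replica type).
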
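Ceph components in detail
Ceph's core components are the Ceph OSD, the Ceph Monitor, and the Ceph MDS.
Ceph OSD
OSD stands for Object Storage Device. Its main functions are storing, replicating, rebalancing, and recovering data, exchanging heartbeats with other OSDs, and reporting state changes to the Ceph Monitor. Normally one disk corresponds to one OSD, which manages storage on that disk, although a single partition can also serve as an OSD.
A Ceph OSD is built from a physical disk drive, a Linux file system, and the Ceph OSD service. For the Ceph OSD daemon, the Linux file system explicitly supports its extensibility. Several Linux file systems are available, such as Btrfs, XFS, and ext4. Btrfs has many attractive features but has not yet reached the stability required for production, so XFS is generally recommended.
OSDs form a strongly consistent distributed store. The read and write flow is shown in the figure below.
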
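Ceph reads and writes follow a primary-replica model. To read or write data, a client may only send requests to the primary OSD for the object. When the primary receives a write request, it synchronously writes the data to the replica OSDs, and only after all OSDs have finished writing does the primary report completion to the client. This guarantees a high degree of consistency between primary and replicas. Reads also go only to the primary OSD; there is no read/write splitting as in databases, again for the sake of strong consistency. Because every write must be handled by the primary OSD, performance can suffer when the data volume is large. To overcome this, and to let Ceph support transactions, each OSD has a journal file.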
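The primary-replica write path described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Ceph code: the `OSD` class, `primary_write`, and `primary_read` are hypothetical names, and each OSD is just an in-memory dictionary.

```python
class OSD:
    """Toy stand-in for an OSD daemon: objects live in a dict."""

    def __init__(self, osd_id):
        self.osd_id = osd_id
        self.store = {}

    def write(self, name, data):
        self.store[name] = data
        return True  # acknowledge the write


def primary_write(acting_set, name, data):
    """Write through the primary; succeed only when every replica has acked."""
    primary, replicas = acting_set[0], acting_set[1:]
    primary.write(name, data)
    acks = [replica.write(name, data) for replica in replicas]
    return all(acks)  # only now would the client be told "done"


def primary_read(acting_set, name):
    """Reads are served by the primary OSD only."""
    return acting_set[0].store[name]


osds = [OSD(i) for i in (3, 6, 0)]  # first entry plays the primary
ok = primary_write(osds, "obj-1", b"hello")
value = primary_read(osds, "obj-1")
```

The point of the sketch is the ordering: the acknowledgement is produced only after the whole acting set holds the data, which is what makes a later read from the primary safe.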
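Alongside the OSD there is the concept of a journal disk. When data is written to a Ceph cluster, it is generally written to the journal first, and the journal is then flushed to the file system at intervals, for example every 5 seconds. To lower read and write latency, the journal is usually placed on an SSD, typically with 10 GB or more allocated, and more is better. Ceph introduced the journal because it lets the OSD complete small writes quickly: a random write is first written sequentially to the journal and then flushed to the file system, which gives the file system enough time to merge writes to disk. Using an SSD as the OSD journal generally absorbs bursts of load effectively.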
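In Ceph, each OSD process can be called an OSD node. In other words, one storage server may host many OSD nodes, each listening on different ports, much like running several MySQL or Redis instances on the same server. Each OSD node can use a directory as its actual storage area, or a partition, or a whole disk. In the figure below, the machine runs two OSD processes, and each OSD listens on 4 ports, used for receiving client requests, transferring data, sending heartbeats, and synchronizing data.

As shown in the figure above, an OSD node listens on TCP ports 6800 to 6803 by default; if a server hosts several OSD nodes, the following ports are used in order.
A production environment may have at least hundreds of OSDs, so each OSD has a global number, such as osd0, osd1, osd2, and so on. Numbers are assigned in the order the OSDs were created and are globally unique. OSD nodes that store the same PG send heartbeats not only to the mon nodes but also to one another, to check whether the PG's replicas are healthy.
As mentioned earlier when describing the data flow, each OSD node has a journal file, as shown in the figure below:
The default size is 5 GB, which means each newly created OSD node gives up 5 GB to the journal before it stores anything. The value can be adjusted, and the right size depends on the total size of the OSD.
The journal works much like the transaction log in MySQL's InnoDB engine. During a burst of writes, Ceph can first collect scattered, random I/O requests in a cache and merge them, then issue them to the kernel together. This is more efficient, but if the OSD node crashes the cached data is lost. So before data has been written to disk it is recorded in the journal, and when a crashed OSD restarts it automatically tries to recover the lost cache data from the journal. Journal I/O is therefore very intensive, and because each piece of data is written twice, a good deal of the hardware's I/O performance is consumed. For this reason production environments usually store the journal on a separate SSD to improve Ceph's read and write performance.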
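The journal-then-flush behaviour can be illustrated with a small write-ahead sketch. Everything here is an assumption made for illustration: `JournaledStore` is a hypothetical class, the journal and disk are plain Python containers, and a crash is simulated by clearing the in-memory cache.

```python
class JournaledStore:
    """Toy write-ahead journal: journal first, flush to disk later."""

    def __init__(self):
        self.journal = []  # durable sequential log (survives a crash)
        self.cache = {}    # volatile buffer of merged writes
        self.disk = {}     # backing file system (durable)

    def write(self, key, value):
        self.journal.append((key, value))  # record before anything else
        self.cache[key] = value

    def flush(self):
        """Periodic flush: push buffered writes to disk, then trim the journal."""
        self.disk.update(self.cache)
        self.cache.clear()
        self.journal.clear()

    def crash(self):
        """Simulate a crash: only the volatile cache is lost."""
        self.cache.clear()

    def recover(self):
        """On restart, replay journal entries that never reached the disk."""
        for key, value in self.journal:
            self.disk[key] = value
        self.journal.clear()


store = JournaledStore()
store.write("a", 1)
store.flush()
store.write("b", 2)  # journaled, but not yet flushed
store.crash()
store.recover()
```

After the replay the unflushed write is back on disk, at the cost of every write having been stored twice, which is the I/O overhead the text attributes to the journal.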
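Ceph Monitor
As its name suggests, this is a monitor: it watches over the Ceph cluster and maintains its health state. It also maintains the cluster's various maps, such as the OSD Map, Monitor Map, PG Map, and CRUSH Map, collectively called the Cluster Map. The Cluster Map is the key data structure of RADOS. It manages all members, relationships, and attributes in the cluster as well as the distribution of data. For example, when a user stores data in a Ceph cluster, the client first obtains the latest maps from a Monitor and then computes the final storage location from the maps and the object ID.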
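Mon nodes monitor the state of the whole Ceph cluster and listen on TCP port 6789. Every Ceph cluster needs at least one mon node, and the official recommendation is to deploy at least three per cluster. The mon nodes hold the master copy of the latest version of the cluster map. To use the cluster, a client connects to port 6789 on a mon node, downloads the latest cluster map, uses the CRUSH algorithm to find the IP addresses of the relevant OSDs, and then connects to those OSD nodes directly to transfer data. Ceph therefore needs no centralized master node for computation and addressing; clients share that work. Because clients talk to OSDs directly, the extra overhead of an intermediate proxy server is also avoided.
Mon nodes use the Paxos algorithm to keep the cluster map consistent across nodes. All mon nodes have broadly the same function, and their relationship can be understood simply as primary and standby. If the primary mon node fails, the cluster keeps running normally as long as more than half of the mon nodes survive. When a failed mon node recovers, it actively pulls the latest cluster map from the other mon nodes.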
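The "more than half must survive" rule is a strict-majority check, which also shows why monitors are usually deployed in odd numbers. The helper below is a hypothetical illustration, not a Ceph API.

```python
def has_quorum(total_mons, alive_mons):
    """True when strictly more than half of the monitors are alive."""
    return alive_mons * 2 > total_mons


# Failures tolerated for a few cluster sizes.
tolerated = {
    total: max(f for f in range(total) if has_quorum(total, total - f))
    for total in (1, 3, 4, 5)
}
```

Under this rule a fourth monitor buys no extra fault tolerance over three: both sizes survive exactly one failure.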
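Mon nodes do not actively poll the current state of each OSD. Instead, an OSD reports its own information only in special cases and otherwise just sends heartbeats. The special cases are: 1. a new OSD joins the cluster; 2. an OSD detects that it or another OSD has become abnormal. When a mon node receives such a report, it updates the cluster map and propagates it.
Cluster map updates spread asynchronously and lazily. A monitor does not broadcast each new cluster map version to all OSDs; instead, when an OSD reports to it, the monitor sends the update back in its reply. Likewise, when an OSD communicates with another OSD and finds that the peer holds an older cluster map version, it sends its newer version to the peer.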
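The lazy propagation just described can be modelled as peers comparing map versions whenever they happen to talk. This is a simplified sketch: the `Node` class and `exchange` function are hypothetical, and a map is reduced to a single version number (called `epoch` here).

```python
class Node:
    """A monitor or OSD holding some version (epoch) of the cluster map."""

    def __init__(self, name, epoch):
        self.name = name
        self.epoch = epoch


def exchange(a, b):
    """Any message between two peers lets the stale side catch up."""
    newest = max(a.epoch, b.epoch)
    a.epoch = newest
    b.epoch = newest


mon = Node("mon", epoch=5)
osds = [Node(f"osd.{i}", epoch=4) for i in range(3)]

exchange(osds[0], mon)      # osd.0 reports in and receives epoch 5 in the reply
exchange(osds[0], osds[1])  # osd.0 later talks to osd.1, which catches up
epochs = [osd.epoch for osd in osds]
```

Note that osd.2 stays on the old version until it communicates with someone; nothing is broadcast to it.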
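The following architecture is recommended

Besides the management network, this Ceph deployment has two networks: one for client read and write traffic, and another for data synchronization and heartbeats between OSD nodes. This spreads the I/O load across network cards; otherwise client reads and writes become extremely slow during data scrubbing.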
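Ceph MDS
MDS stands for Ceph MetaData Server, the metadata server of a Ceph cluster. It is usually not required, because it is needed only when CephFS is used; object storage and block devices do not use this service, and those two are currently the more widely used in cloud computing.
Although the MDS is a metadata server, it does not store metadata itself. Metadata is also split into objects and stored on the OSD nodes, as shown in the figure below:

When creating CephFS, at least two pools must be created: one for data and one for metadata. The MDS only accepts metadata queries from users, fetches the data from the OSDs, and maps it into its own memory for clients to access. The MDS is therefore effectively a proxy cache server that relieves the OSDs of user access load, as shown in the figure below:
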
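Ceph and cloud platforms
Ceph has become the standard back-end storage for OpenStack. As an IaaS system, OpenStack's storage-related parts are mainly the block storage service, the object storage service, the image service, and the compute service, which correspond to the Cinder, Swift, Glance, and Nova projects.

Ceph RBD block storage attaches to the OpenStack Cinder module as independent volumes, mainly used as data disks. This is done through the Cinder driver, and the volume remains after the virtual machine is deleted. When Nova connects to Ceph, the RBD volume is bound to the virtual machine, so the volume is deleted along with the virtual machine; such volumes are generally used as boot disks. Ceph can also be connected to Glance to store images. Keystone, the authentication module for OpenStack Swift object storage, supports authenticating Ceph through the RADOSGW gateway, so Ceph can provide Swift storage services to OpenStack.
The Ceph community has extended RBD mirroring support to Docker. There, RBD mirroring asynchronously replicates RBD images from one Ceph cluster to another, for disaster protection and recovery of Docker images.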
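Ceph's internal data storage view
In a Ceph storage system, the basic service architecture consists mainly of the Object Storage Device (OSD), the Monitor, and the MDS.
On top of these, Ceph provides the librados native object library, the librbd block storage library, the librgw object library compatible with S3 and Swift, and the libceph file system library.
Setting up a Ceph system requires at least 1 Ceph Monitor and 2 Ceph OSDs.
A cluster can be logically divided into multiple pools.
A pool consists of a number of logical PGs (placement groups), and the number of replicas in a pool is also configurable.

Ceph is an object system underneath, so a file is split into multiple objects. Each object is mapped to one PG, and each PG is mapped to a set of OSDs, of which the first is the primary and the rest are replicas. OSDs monitor one another's liveness through heartbeats. With PGs in place, OSDs deal only with PGs, which simplifies data storage on the OSDs and makes the object-to-OSD mapping dynamic: adding an OSD or losing one does not affect how objects map to PGs.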
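How Ceph stores data
In a Ceph storage system, storing data involves three mapping steps.
First, the file the user operates on is mapped to objects that RADOS can handle. The file is simply split according to the object size, similar to striping in RAID.
Next, objects are mapped to PGs. After a file has been mapped to one or more objects, each object is mapped independently to a PG.
The third mapping takes the PG, the logical unit that organizes objects, to the OSDs, the units that actually store the data.

When a file is stored, it is first split into RADOS-level objects, typically 2 MB or 4 MB each (the size is configurable). Each object is mapped to exactly one PG by a hash algorithm. Each PG is mapped to actual storage units (OSDs) by the CRUSH algorithm, and the relationship between PGs and OSDs is many-to-many. OSDs can be physically assigned to multiple failure domains, which can span racks and servers, and policy configuration places the different replicas of a PG in different failure domains.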
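The three mappings (file to objects, object to PG, PG to OSDs) can be chained in a short sketch. The assumptions are significant: `zlib.crc32` stands in for Ceph's real hash, the object naming scheme is invented, and `pg_to_osds` is only a deterministic placeholder for CRUSH that ignores weights and failure domains.

```python
import zlib

OBJECT_SIZE = 4 * 1024 * 1024  # 4 MB objects, as in the text


def file_to_objects(file_id, file_size, object_size=OBJECT_SIZE):
    """Step 1: cut a file into fixed-size objects and name them."""
    count = max(1, -(-file_size // object_size))  # ceiling division
    return [f"{file_id}.{index:08x}" for index in range(count)]


def object_to_pg(pool_id, object_name, pg_num):
    """Step 2: hash the object name, take it modulo pg_num."""
    pg = zlib.crc32(object_name.encode()) % pg_num
    return f"{pool_id}.{pg:x}"


def pg_to_osds(pg_id, osd_ids, replicas):
    """Step 3 (placeholder for CRUSH): rank OSDs by a per-PG hash."""
    ranked = sorted(
        osd_ids,
        key=lambda osd: zlib.crc32(f"{pg_id}:{osd}".encode()),
        reverse=True,
    )
    return ranked[:replicas]  # first entry plays the primary


objects = file_to_objects("inode1", 10 * 1024 * 1024)
placement = {
    name: pg_to_osds(object_to_pg(1, name, 64), range(8), 3)
    for name in objects
}
```

A 10 MB file becomes three objects, and each object independently lands on its own ordered set of three OSDs. Because every step is a pure function of its inputs, any client holding the same parameters computes the same placement without asking a central server.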
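Ceph's data distribution algorithm
A key concern in distributed storage systems is how to distribute data evenly. Common data distribution algorithms include consistent hashing and Ceph's CRUSH algorithm. CRUSH is a pseudo-random algorithm that controls data distribution and replication. Ceph is designed for large-scale distributed storage, so its distribution algorithm must still compute storage locations quickly and accurately in very large clusters, and must keep data migration as small as possible when hardware fails or is added. CRUSH was carefully designed for exactly these properties, and it is one of the cores of Ceph.
Ceph serves clients through its own client implementation, available in Linux user space (FUSE) and in the kernel (VFS). The client also performs data slicing, locates objects with the CRUSH algorithm, and reads and writes the data. In testing, however, Ceph is often configured on the server side as an NFS export, and directories are exported through the standard NFS interface.
CephFS supports POSIX, HDFS, NFS, and CIFS service interfaces; NFS and CIFS are provided through external gateways (exported via the client).
Mapping a PG to the OSDs that actually store the data requires the CRUSH Map, the CRUSH Rules, and the CRUSH algorithm to work together.

The Cluster Map is the data structure that records the global system state. It consists of the CRUSH Map and the OSD Map. The CRUSH Map contains the current hierarchy of disks, servers, and racks; the OSD Map contains the state of all pools and all OSDs.
CRUSH Rules are the data mapping policy. They determine how many replicas each data object has and how those replicas are stored.
The CRUSH algorithm is a pseudo-random algorithm that decides data placement (cross-datacenter, rack-aware, and so on) by weight, usually based on capacity. CRUSH supports both replication and EC for data redundancy, and provides four bucket types (Uniform, List, Tree, Straw); Straw is used in most cases.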
Before explaining the basic principle of the CRUSH algorithm, let us introduce a few concepts and how they relate.
Stored data and objects: when a user stores data in a Ceph cluster, the data is split into multiple objects. Each object has an object ID, and the object size is configurable, 4 MB by default. An object can be seen as the smallest storage unit in Ceph.
Objects and PGs: because there are so many objects, Ceph introduces the PG to manage them. Each object is ultimately mapped to a PG by a hash calculation, and one PG can contain many objects.
PGs and OSDs: a PG in turn is mapped by CRUSH to the OSDs that store it. With two replicas, each PG maps to two OSDs, for example [osd.1,osd.2], where osd.1 holds the primary copy of the PG and osd.2 holds the replica, which provides data redundancy.
PG and PGP: PGs hold objects, while PGP describes the OSD combinations on which PGs are placed. For example, take 3 OSDs (osd.1, osd.2, and osd.3) and a replica count of 2. If the PGP number is 1, PGs can be placed on only one OSD combination, perhaps [osd.1,osd.2], so the primary and replica of every PG go to osd.1 and osd.2. If PGP is set to 2, there can be two OSD combinations, perhaps [osd.1,osd.2] and [osd.1,osd.3]. This resembles the permutations and combinations of high-school mathematics, and that is what PGP stands for. In general, the numbers of PGs and PGPs should be set equal. An experiment makes this clearer:
First create a pool named testpool with 6 PGs and 6 PGPs:
ceph osd pool create testpool 6 6
After writing some data, check the PG distribution with the following command:
ceph pg dump pgs | grep ^1 | awk '{print $1,$2,$15}'
dumped pgs in format plain
1.1 75 [3,6,0]
1.0 83 [7,0,6]
1.3 144 [4,1,2]
1.2 146 [7,4,1]
1.5 86 [4,6,3]
1.4 80 [3,0,4]
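Column 1 is the PG ID, column 2 is the number of objects stored in the PG, and column 3 is the set of OSDs holding the PG.
Now increase the number of PGs and look again:
ceph osd pool set testpool pg_num 12
Query the distribution again with the command above: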
1.1 37 [3,6,0]
1.9 38 [3,6,0]
1.0 41 [7,0,6]
1.8 42 [7,0,6]
1.3 48 [4,1,2]
1.b 48 [4,1,2]
1.7 48 [4,1,2]
1.2 48 [7,4,1]
1.6 49 [7,4,1]
1.a 49 [7,4,1]
1.5 86 [4,6,3]
1.4 80 [3,0,4]
The number of PGs has grown to 12. PG 1.1 originally held 75 objects and now holds 37; the rest went to the new PG 1.9, which holds 38, and together they add up to 75. PG 1.1 and PG 1.9 are also on the same OSDs.
The OSD combinations are still the same 6 as before.
Now increase the number of PGPs with the command:
ceph osd pool set testpool pgp_num 12
Look again:
1.a 49 [1,2,6]
1.b 48 [1,6,2]
1.1 37 [3,6,0]
1.0 41 [7,0,6]
1.3 48 [4,1,2]
1.2 48 [7,4,1]
1.5 86 [4,6,3]
1.4 80 [3,0,4]
1.7 48 [1,6,0]
1.6 49 [3,6,7]
1.9 38 [1,4,2]
1.8 42 [1,2,3]
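Looking at PG 1.1 and PG 1.9 again, PG 1.9 is no longer on [3,6,0] but on [1,4,2], a newly added combination. Increasing pgp_num therefore increases the number of OSD combinations.
Summary of the experiment:
(1) PG is the number of directories in which a pool stores objects; PGP is the number of OSD placement combinations for the pool's PGs.
(2) Increasing PG splits the data inside PGs, and the split-off data goes into new PGs created on the same OSDs.
(3) Increasing PGP changes the placement of some PGs but does not change which objects are in each PG.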
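The grouping seen in the experiment can be reproduced with a short sketch. It assumes that the placement input for a PG is its number folded into the range of pgp_num by a "stable modulo"; the helper below is written from memory of how Ceph's stable-mod helper behaves and should be read as an assumption, not as quoted source. The names `pgp_mask`, `stable_mod`, and `placement_seed` are illustrative.

```python
def pgp_mask(n):
    """Smallest value of the form 2**k - 1 that is >= n - 1."""
    return (1 << (n - 1).bit_length()) - 1


def stable_mod(x, b, bmask):
    """Fold x into [0, b) so that growing b relocates only a few values."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)


def placement_seed(pg, pgp_num):
    """The value assumed to be fed to CRUSH for a given PG."""
    return stable_mod(pg, pgp_num, pgp_mask(pgp_num))


# pg_num = 12 in both cases; only pgp_num differs.
before = {f"1.{pg:x}": placement_seed(pg, 6) for pg in range(12)}
after = {f"1.{pg:x}": placement_seed(pg, 12) for pg in range(12)}
```

With pgp_num still 6, PGs 1.9, 1.8, 1.b, 1.7, 1.6, and 1.a share seeds with 1.1, 1.0, 1.3, 1.3, 1.2, and 1.2 respectively, which matches the identical OSD sets in the second listing. With pgp_num raised to 12 every PG has its own seed, so the split-off PGs can move to new OSD combinations, as in the third listing.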
PGs and pools: a pool is also a logical storage concept. When creating a pool, we must specify the numbers of PGs and PGPs. Logically a PG belongs to a pool, much as an object belongs to a PG.
The figure below shows the relationships among stored data, objects, PGs, pools, OSDs, and storage disks.

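In essence, CRUSH computes the distribution of data objects from the weights of the storage devices. Weights can be set according to a disk's capacity and read/write speed; by capacity, for example, a 1 TB disk could get weight 1 and a 2 TB disk weight 2. During the calculation, CRUSH determines the final storage location of the data from the Cluster Map, the data distribution policy, and a random number.
The Cluster Map describes the storage resources available in the cluster and their spatial hierarchy: for example, how many racks the cluster has, how many servers each rack holds, and how many disks in each server are used as OSDs.
The data distribution policy lets the Ceph administrator specify characteristics of data placement through configuration. For example, if the administrator configures the failure domain as Host, data must survive the loss of one host, which CRUSH achieves by storing the primary and replica copies of each PG on OSDs in different hosts. Failure domains are not limited to hosts; racks and other levels can be specified too. Besides the failure domain, the policy also selects the data redundancy method, such as the number of replicas or erasure coding.
The following expression summarizes the CRUSH calculation:
CRUSH(X) -> (osd.1, osd.2, ..., osd.n)
X in the expression is a pseudo-random value.
The following example of computing a PG ID walks through the calculation:
(1) The client inputs the pool ID and the object ID;
(2) CRUSH takes the object ID and hashes it;
(3) CRUSH takes the hash modulo the number of PGs to obtain the PG ID, for example 0x48;
(4) CRUSH obtains the ID of the pool, for example 1;
(5) CRUSH prepends the pool ID to the PG ID, for example 1.48.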
Object addressing
Finding where an object is stored in the cluster takes two steps.
Step one, object to PG: hash the object ID, then take the hash value modulo the total number of PGs to get the PG ID:
pg_id = hash( object_id ) % pg_num
Step two, PG to OSD list: the CRUSH algorithm computes which OSD disks the objects in the PG are placed on.
The CRUSH algorithm is the essence of Ceph.
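Goals of CRUSH
First, the goals the CRUSH algorithm aims to achieve:
Distribute data evenly across the cluster;
Take the differing weights of OSDs into account (weights are set according to differences in read/write performance, disk capacity, and so on);
When an OSD fails and data must migrate, move as little data as possible.
The CRUSH algorithm process
Briefly, the CRUSH algorithm works as follows.
Step one: take the PG ID, the list of candidate OSD IDs, and a constant as input, and use a pseudo-random algorithm to obtain a random number. The pseudo-random algorithm guarantees that the same key always yields the same number, so the computed storage location does not change between calculations:
CRUSH_HASH( PG_ID, OSD_ID, r ) = draw
Step two: multiply the random number by each OSD's weight, then pick the OSD with the largest product:
( draw & 0xffff ) * osd_weight = osd_straw
Once the sample is large enough, the random number no longer dominates the outcome and the OSD weight becomes decisive: the larger an OSD's weight, the more likely it is to be picked.
With that, here is how CRUSH achieves its goals:
The random algorithm spreads data evenly, and multiplying by the weight makes the selection weight-aware. If an OSD fails, only the data on that OSD needs to be recovered; data not on that node does not need to move.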
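The two-step selection above (hash to get a draw, multiply by the weight, keep the largest straw) can be tried out in a few lines. This is a sketch of the idea only: SHA-256 is an assumed stand-in for CRUSH's real hash, the weights are made up, and `draw` and `choose_osd` are hypothetical names.

```python
import hashlib


def draw(pg_id, osd_id, r):
    """Deterministic pseudo-random draw for a (PG, OSD, r) triple."""
    digest = hashlib.sha256(f"{pg_id}:{osd_id}:{r}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


def choose_osd(pg_id, weights, r=0):
    """Pick the OSD whose straw, (draw & 0xffff) * weight, is longest."""
    return max(
        weights,
        key=lambda osd: (draw(pg_id, osd, r) & 0xFFFF) * weights[osd],
    )


weights = {0: 1.0, 1: 1.0, 2: 2.0}  # osd id -> weight (osd 2 is twice as big)
counts = {osd: 0 for osd in weights}
for pg in range(3000):
    counts[choose_osd(pg, weights)] += 1

# Simulate losing osd 1: which PGs that were NOT on osd 1 change their choice?
survivors = {osd: w for osd, w in weights.items() if osd != 1}
moved = [
    pg
    for pg in range(3000)
    if choose_osd(pg, weights) != 1
    and choose_osd(pg, survivors) != choose_osd(pg, weights)
]
```

The heavier OSD wins clearly more often than the others, and `moved` is empty: removing an OSD that did not hold the longest straw cannot change the winner, so only PGs that were on the failed OSD need a new home.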
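Advantages and disadvantages of CRUSH
At this point the advantages and disadvantages of CRUSH are clear.
Advantages:
It needs little input metadata (cluster map, placement rule), so it can handle large clusters.
It can cope with cluster expansion and shrinkage.
It relies on probability-based statistical balance, which achieves balanced data in large clusters.
Disadvantages:
In small clusters some data imbalance occurs (the weights have little influence and the pseudo-random algorithm dominates).
With the addressing process clear, it is easy to see why the number of PGs should not be changed lightly. The PG count is the modulus in step one of addressing, so changing it changes the PG IDs of objects and triggers data migration across the cluster.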
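Ceph is the product of Sage Weil's own doctoral research. His dissertation was condensed into three short papers, one of which is about CRUSH.
The CRUSH paper is titled "CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data" and describes the design and implementation details of CRUSH.
(PS: the other two are on RADOS and CephFS, covering Ceph's server-side implementation and the implementation details of the Ceph file system respectively.)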
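Error detection and recovery
Error detection
Use heartbeats
Report to the monitor
Update the map
Error recovery
The primary OSD leads the recovery work
If the primary OSD goes down, one of the secondary OSDs is chosen to take over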

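Data striping
The throughput limits of a single storage device affect performance and scalability.
Striping stores consecutive blocks of information across multiple storage devices to improve throughput and performance.
Ceph striping is similar to RAID 0.
Note: Ceph striping is done on the client side and is outside the scope of RADOS.

Note: striping is independent of object replication. Because CRUSH replicates objects across OSDs, stripes are replicated automatically.
Striping parameters
Object Size:
Large enough to hold a stripe unit; it must hold one or more stripe units (for example 2 MB or 4 MB).
Stripe Width:
The size of one stripe unit; all stripe units except the last must be the same size (for example 64 KB).
Stripe Count:
The number of objects in the series that is written to consecutively (for example 4).
Note:
Once set, these parameters cannot be changed, so run performance tests beforehand.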
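To make the three parameters concrete, the sketch below maps a byte offset in a file to an object number and an offset inside that object. It follows the usual RAID-0-style layout as I understand Ceph's client-side striping; treat the arithmetic as an illustrative assumption. `locate` is a hypothetical helper, and `stripe_unit` is what the text calls Stripe Width.

```python
def locate(offset, stripe_unit, stripe_count, object_size):
    """Return (object_no, offset_in_object) for a byte offset in the file."""
    stripes_per_object = object_size // stripe_unit
    block_no = offset // stripe_unit      # which stripe unit overall
    stripe_no = block_no // stripe_count  # which row of stripe units
    stripe_pos = block_no % stripe_count  # which object within the set
    object_set_no = stripe_no // stripes_per_object
    object_no = object_set_no * stripe_count + stripe_pos
    offset_in_object = (
        (stripe_no % stripes_per_object) * stripe_unit + offset % stripe_unit
    )
    return object_no, offset_in_object


KB, MB = 1024, 1024 * 1024
params = dict(stripe_unit=64 * KB, stripe_count=4, object_size=4 * MB)

first = locate(0, **params)               # start of object 0
second_unit = locate(64 * KB, **params)   # next stripe unit goes to object 1
wrap = locate(256 * KB, **params)         # fifth unit wraps back to object 0
next_set = locate(16 * MB, **params)      # first set is full, start object 4
```

With these example values, consecutive 64 KB units rotate across four objects, and only after all four objects are full (16 MB in total) does writing move on to a new set of objects.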

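Advanced features of Ceph
Ceph supports a rich set of storage features: from the basics of a distributed system such as horizontal scaling, dynamic scaling, redundancy and disaster tolerance, and load balancing; through production features such as rolling upgrades, multiple pools, and delayed deletion; to more advanced ones such as Ceph clusters, snapshots, EC erasure coding, and cross-pool caching. A few key features are introduced briefly below.

Ceph is designed as a unified storage system and supports three interfaces. The file system supports POSIX, HDFS, NFS, and CIFS service interfaces; the block service supports thin provisioning, COW snapshots, and clones; the object service supports the native object API and is also compatible with the Swift and S3 APIs.
Ceph Storage 2 provides a global object storage cluster with a single namespace and data synchronization between clusters running in multiple regions. This includes synchronization from the master zone to secondary zones within a region (both data and metadata can be synchronized) and synchronization between regions (metadata only, including gateway users and bucket information but not the objects inside buckets).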
Which companies use Ceph
Red Hat
FICO, the US predictive analytics company
Monash University in Australia, 500 PB
LeTV, Yidian Zixun, Toutiao, DiDi, QingCloud, and others
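Ceph is more than the standard back-end storage for OpenStack. Many storage vendors and large enterprises now develop or build storage systems on Ceph technology. Let us first look at products from a few storage vendors, such as Hope Bay and SanDisk.
Hope Bay Technologies focuses on cloud platforms and offers the ArkEase Pro storage service platform, the ArkFlex data storage platform, the Ark Express storage gateway, and the ArkVoice enterprise cloud voice recording platform. In ArkFlex, Hope Bay improved the Ceph file system and built CIFS, NFS, and iSCSI on top of Ceph RBD.
After acquiring Fusion-io, SanDisk released the ioControl hybrid storage array and the InfiniFlash flash series in succession. Having spun off the related business into the newly founded NextGen company, SanDisk focuses on the flash market with the InfiniFlash series. One model, the InfiniFlash System IF500, uses Ceph technology (IF100 hardware plus the InfiniFlash OS Ceph scale-out software) and provides both object and block storage services. SanDisk's storage strategy is relatively open: the low-end IF100 (hardware only) integrates Nexenta's open-source, ZFS-based NexentaStor software (supporting NAS and iSCSI), while the high-end IF700 uses the ION Accelerator database acceleration software from the Fusion-io era.
In addition, many large enterprises use Ceph to build cloud platforms and distributed storage solutions. It is precisely the deep integration between Ceph and OpenStack that leads internet companies to pair the two when building cloud platforms.
LeTV built its LeTV cloud platform on OpenStack and Ceph (RBD block storage and RADOSGW object storage); PowerLeader Cloud is also built on OpenStack, Ceph (RBD block storage and CephFS), and Docker. The e-commerce company eBay uses Ceph and OpenStack for its private cloud as well; each Ceph cluster holds several petabytes, and these clusters mainly serve OpenStack. Meanwhile, the eBay team is investing more and more in moving NAS to the cloud, and CephFS may well become the natural choice for that.
Ctrip built petabyte-scale cloud object storage on Ceph; the Inspur AS13000 storage series is developed on Ceph; Cisco UCS streaming media storage is based on Ceph object storage; and Yahoo built cloud object storage on Ceph. China Unicom Research Institute, CERN, United Stack, and others have also built development environments on Ceph.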
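Ceph is cloud ready: as cloud computing developed, Ceph first boarded the big ship that is OpenStack, signaling that it was fully cloud ready. Ceph then gained support from Intel, SanDisk, Cisco, Yahoo, and other companies, and in particular Red Hat paid a large sum to acquire Inktank and made Ceph a main direction of its development. Over the years Red Hat has also clarified the respective focus of Ceph and Gluster: Gluster concentrates on files, Ceph on block and object.
Community support: many vendors have joined the Ceph community, from giants such as Intel, Cisco, and SanDisk, to startups such as United Stack, to telecom operators, universities, and research institutes outside the storage field. The body of Ceph contributors keeps growing.
Continuous improvement: Ceph's performance keeps improving and its storage features keep growing, to the point where it can rival traditional professional storage. Complete storage services at low investment cost lead more and more enterprises and organizations to choose Ceph.
SDS and distributed architecture: Ceph software is fully decoupled from the hardware platform, so the barrier to building a Ceph storage system keeps falling for enterprises, and deployment is simple on Linux (Ubuntu) and standard x86 platforms. Successful deployments at storage vendors such as SanDisk and PowerLeader, and at cloud and internet companies such as United Stack, Ctrip, and LeTV, also provide a reference base for Ceph's wider adoption.
For more detailed solutions, see:
Specific Ceph solutions at other companies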
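Practical experience
Set the PG count in advance and do not change it
The number of PGs in a pool is set in advance. It should not be chosen arbitrarily but determined from the number of OSDs and the replication policy:
Total PGs = ((Total_number_of_OSD * 100) / max_replication_count) / pool_count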
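As a worked example of the formula, the sketch below evaluates it for an assumed cluster of 100 OSDs, 3 replicas, and 2 pools. Rounding the result up to a power of two is a commonly recommended convention that the formula itself does not state, so it is shown as a separate, optional step; both function names are hypothetical.

```python
def pgs_per_pool(osd_count, replicas, pool_count, pgs_per_osd=100):
    """((OSDs * 100) / max replication count) / pool count."""
    return osd_count * pgs_per_osd / replicas / pool_count


def round_up_to_power_of_two(value):
    """Optional convention: pick the next power of two at or above value."""
    power = 1
    while power < value:
        power *= 2
    return power


raw = pgs_per_pool(osd_count=100, replicas=3, pool_count=2)
suggested = round_up_to_power_of_two(raw)
```

Here the formula gives roughly 1667 PGs per pool, which the power-of-two convention would round up to 2048. Since the text advises against changing the PG count later, the OSD count used should reflect the planned size of the cluster.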
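Avoid changing the number of PGs on a production cluster if at all possible. A change sets the whole cluster in motion (OSDs copy data among themselves), and read/write performance drops severely while large amounts of data are rebalanced.
Good engineering practice (lessons learned the hard way):
Plan the size of each pool in advance and set its PG count; once set, do not change it. When more capacity is needed later, expand at the pool level by adding new pools (pools achieve failure-domain isolation through the crushmap).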
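Dividing failure domains
When first working with Ceph, people usually ignore the crushmap, because the cluster works normally even without configuring it.
Once the cluster grows large, running without it is dangerous.
Without failure domains, the whole cluster is one unisolated resource pool.
An object that is stored may land on any three of, say, 500 OSD disks.
If one disk fails, the impact can be global (replica copying: the other copies of the PGs lost with that disk may be spread over disks across the whole cluster).
Use the crushmap to divide the cluster's OSDs into failure domains, much like splitting one cluster into several small clusters by business line. Each pool then uses only specific OSDs, so when an OSD fails only one pool of one business is affected, and the fault is confined to a small scope.
Good engineering practice:
Use the crushmap to divide failure domains and restrict each pool to a specific list of OSDs. An OSD failure then triggers rebalancing only within that pool and has no global impact.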
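Object storage on the server side
The object is the basic unit of data storage, 4 MB by default (this refers to objects in RADOS's underlying storage, not objects at the RGW interface).
An object consists of 3 parts: key, value, and metadata.
Metadata can be stored directly in the file's extended attributes (it must be standard file attributes) or in LevelDB.
The value is the object data, stored as one file in the local file system.
For large files, the client interfaces provided by Ceph split the file (striping) before storing it on the server. Striping is performed by the client interface program; the objects stored inside the Ceph cluster are not themselves striped. Data that a client writes to Ceph directly through librados is not split.
Good engineering practice:
For object storage, use only the RGW interface provided by Ceph and not the native librados interface. RGW not only splits data but also scales more easily (RGW is stateless and scales horizontally). Storing many large objects directly in Ceph hurts its stability (once storage utilization reaches 60%).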
Areas to optimize in secondary development on Ceph
Encryption and security of intranet traffic
Optimizing Ceph's use of LevelDB iterators