Äú¿ÉÒÔ¾èÖú£¬Ö§³ÖÎÒÃǵĹ«ÒæÊÂÒµ¡£

1Ôª 10Ôª 50Ôª





ÈÏÖ¤Â룺  ÑéÖ¤Âë,¿´²»Çå³þ?Çëµã»÷Ë¢ÐÂÑéÖ¤Âë ±ØÌî



  ÇóÖª ÎÄÕ ÎÄ¿â Lib ÊÓÆµ iPerson ¿Î³Ì ÈÏÖ¤ ×Éѯ ¹¤¾ß ½²×ù Model Center   Code  
»áÔ±   
   
 
     
   
 ¶©ÔÄ
  ¾èÖú
Ò»Îĸ㶮RaftËã·¨
 
×÷Õߣºxybaby
  1933  次浏览      27
2020-6-1 
 
±à¼­ÍƼö:
±¾ÎÄÖ÷Òª½éÉÜÁËraftËã·¨¸ÅÀÀ¡¢leader election¡¢log replication¡¢safety¡¢corner caseµÄµÈÏà¹ØÄÚÈÝ¡£
±¾ÎÄÀ´×ÔÓÚ²©¿ÍÔ°£¬ÓÉ»ðÁú¹ûÈí¼þAnna±à¼­ÍƼö¡£

ÕýÎÄ

raftÊǹ¤³ÌÉÏʹÓýÏΪ¹ã·ºµÄǿһÖÂÐÔ¡¢È¥ÖÐÐÄ»¯¡¢¸ß¿ÉÓõķֲ¼Ê½Ð­Òé¡£ÔÚÕâÀïÇ¿µ÷ÁËÊÇÔÚ¹¤³ÌÉÏ£¬ÒòΪÔÚѧÊõÀíÂ۽磬×îÒ«Ñ۵ϹÊÇ´óÃû¶¦¶¦µÄPaxos¡£µ«PaxosÊÇ£ºÉÙÊýÕæÕýÀí½âµÄÈ˾õµÃ¼òµ¥£¬ÉÐδÀí½âµÄÈ˾õµÃºÜÄÑ£¬´ó¶àÊýÈ˶¼ÊÇÒ»Öª°ë½â¡£±¾ÈËÒ²»¨Á˺ܶàʱ¼ä¡¢¿´Á˺ܶà²ÄÁÏҲûÓÐÕæÕýÀí½â¡£Ö±µ½¿´µ½raftµÄÂÛÎÄ£¬Á½Î»Ñо¿ÕßÒ²Ìáµ½£¬ËûÃÇÒ²»¨Á˺ܳ¤µÄʱ¼äÀ´Àí½âPaxos£¬ËûÃÇÒ²¾õµÃºÜÄÑÀí½â£¬ÓÚÊÇÑо¿³öÁËraftËã·¨¡£

raftÊÇÒ»¸ö¹²Ê¶Ëã·¨£¨consensus algorithm£©£¬Ëùν¹²Ê¶£¬¾ÍÊǶà¸ö½Úµã¶Ôij¸öÊÂÇé´ï³ÉÒ»ÖµĿ´·¨£¬¼´Ê¹ÊÇÔÚ²¿·Ö½Úµã¹ÊÕÏ¡¢ÍøÂçÑÓʱ¡¢ÍøÂç·Ö¸îµÄÇé¿öÏ¡£ÕâЩÄê×îΪ»ðÈȵļÓÃÜ»õ±Ò£¨±ÈÌØ±Ò¡¢Çø¿éÁ´£©¾ÍÐèÒª¹²Ê¶Ëã·¨£¬¶øÔÚ·Ö²¼Ê½ÏµÍ³ÖУ¬¹²Ê¶Ëã·¨¸ü¶àÓÃÓÚÌá¸ßϵͳµÄÈÝ´íÐÔ£¬±ÈÈç·Ö²¼Ê½´æ´¢Öеĸ´ÖƼ¯£¨replication£©£¬ÔÚ´ø×ÅÎÊÌâѧϰ·Ö²¼Ê½ÏµÍ³Ö®ÖÐÐÄ»¯¸´ÖƼ¯Ò»ÎÄÖнéÉÜÁËÖÐÐÄ»¯¸´ÖƼ¯µÄÏà¹ØÖªÊ¶¡£raftЭÒé¾ÍÊÇÒ»ÖÖleader-basedµÄ¹²Ê¶Ëã·¨£¬ÓëÖ®ÏàÓ¦µÄÊÇleaderlessµÄ¹²Ê¶Ëã·¨¡£

±¾ÎÄ»ùÓÚÂÛÎÄAIn Search of an Understandable Consensus Algorithm¶ÔraftЭÒé½øÐзÖÎö£¬µ±È»£¬»¹Êǽ¨Òé¶ÁÕßÖ±½Ó¿´ÂÛÎÄ¡£

raftËã·¨¸ÅÀÀ

RaftËã·¨µÄÍ·ºÅÄ¿±ê¾ÍÊÇÈÝÒ×Àí½â£¨UnderStandable£©£¬Õâ´ÓÂÛÎĵıêÌâ¾Í¿ÉÒÔ¿´³öÀ´¡£µ±È»£¬RaftÔöÇ¿ÁË¿ÉÀí½âÐÔ£¬ÔÚÐÔÄÜ¡¢¿É¿¿ÐÔ¡¢¿ÉÓÃÐÔ·½ÃæÊDz»ÊäÓÚPaxosµÄ¡£

Raft more understandable than Paxos and also provides a better foundation for building practical systems

ΪÁË´ïµ½Ò×ÓÚÀí½âµÄÄ¿±ê£¬raft×öÁ˺ܶàŬÁ¦£¬ÆäÖÐ×îÖ÷ÒªÊÇÁ½¼þÊÂÇ飺

ÎÊÌâ·Ö½â

״̬¼ò»¯

ÎÊÌâ·Ö½âÊǽ«"¸´ÖƼ¯ÖнڵãÒ»ÖÂÐÔ"Õâ¸ö¸´ÔÓµÄÎÊÌâ»®·ÖΪÊý¸ö¿ÉÒÔ±»¶ÀÁ¢½âÊÍ¡¢Àí½â¡¢½â¾öµÄ×ÓÎÊÌâ¡£ÔÚraft£¬×ÓÎÊÌâ°üÀ¨£¬leader election£¬ log replication£¬safety£¬membership changes¡£¶ø×´Ì¬¼ò»¯¸üºÃÀí½â£¬¾ÍÊǶÔËã·¨×ö³öһЩÏÞÖÆ£¬¼õÉÙÐèÒª¿¼ÂǵÄ״̬Êý£¬Ê¹µÃËã·¨¸ü¼ÓÇåÎú£¬¸üÉٵIJ»È·¶¨ÐÔ£¨±ÈÈ磬±£Ö¤ÐÂÑ¡¾Ù³öÀ´µÄleader»á°üº¬ËùÓÐcommited log entry£©

Raft implements consensus by first electing a distinguished leader, then giving the leader complete responsibility for managing the replicated log. The leader accepts log entries from clients, replicates them on other servers, and tells servers when it is safe to apply log entries to their state machines. A leader can fail or become disconnected from the other servers, in which case a new leader is elected.

ÉÏÃæµÄÒýÎĶÔraftЭÒéµÄ¹¤×÷Ô­Àí½øÐÐÁ˸߶ȵĸÅÀ¨£ºraft»áÏÈÑ¡¾Ù³öleader£¬leaderÍêÈ«¸ºÔðreplicated logµÄ¹ÜÀí¡£leader¸ºÔð½ÓÊÜËùÓпͻ§¶Ë¸üÐÂÇëÇó£¬È»ºó¸´ÖƵ½follower½Úµã£¬²¢ÔÚ¡°°²È«¡±µÄʱºòÖ´ÐÐÕâЩÇëÇó¡£Èç¹ûleader¹ÊÕÏ£¬followes»áÖØÐÂÑ¡¾Ù³öеÄleader¡£

Õâ¾ÍÉæ¼°µ½raft×îеÄÁ½¸ö×ÓÎÊÌ⣺ leader electionºÍlog replication

leader election

raftЭÒéÖУ¬Ò»¸ö½ÚµãÈÎһʱ¿Ì´¦ÓÚÒÔÏÂÈý¸ö״̬֮һ£º

leader

follower

candidate

¸ø³ö×´Ì¬×ªÒÆÍ¼ÄܺÜÖ±¹ÛµÄÖ±µ½ÕâÈý¸ö״̬µÄÇø±ð

¿ÉÒÔ¿´³öËùÓнڵãÆô¶¯Ê±¶¼ÊÇfollower״̬£»ÔÚÒ»¶Îʱ¼äÄÚÈç¹ûûÓÐÊÕµ½À´×ÔleaderµÄÐÄÌø£¬´ÓfollowerÇл»µ½candidate£¬·¢ÆðÑ¡¾Ù£»Èç¹ûÊÕµ½majorityµÄÔì³ÉƱ£¨º¬×Ô¼ºµÄһƱ£©ÔòÇл»µ½leader״̬£»Èç¹û·¢ÏÖÆäËû½Úµã±È×Ô¼º¸üУ¬ÔòÖ÷¶¯Çл»µ½follower¡£

×ÜÖ®£¬ÏµÍ³ÖÐ×î¶àÖ»ÓÐÒ»¸öleader£¬Èç¹ûÔÚÒ»¶Îʱ¼äÀï·¢ÏÖûÓÐleader£¬Ôò´ó¼Òͨ¹ýÑ¡¾Ù-ͶƱѡ³öleader¡£leader»á²»Í£µÄ¸øfollower·¢ÐÄÌøÏûÏ¢£¬±íÃ÷×Ô¼ºµÄ´æ»î״̬¡£Èç¹ûleader¹ÊÕÏ£¬ÄÇôfollower»áת»»³Écandidate£¬ÖØÐÂÑ¡³öleader¡£

term

´ÓÉÏÃæ¿ÉÒÔ¿´³ö£¬Äĸö½Úµã×öleaderÊÇ´ó¼ÒͶƱѡ¾Ù³öÀ´µÄ£¬Ã¿¸öleader¹¤×÷Ò»¶Îʱ¼ä£¬È»ºóÑ¡³öеÄleader¼ÌÐø¸ºÔð¡£Õâ¸ùÃñÖ÷Éç»áµÄÑ¡¾ÙºÜÏñ£¬Ã¿Ò»½ìеÄÂÄÖ°ÆÚ³ÆÖ®ÎªÒ»½ìÈÎÆÚ£¬ÔÚraftЭÒéÖУ¬Ò²ÊÇÕâÑùµÄ£¬¶ÔÓ¦µÄÊõÓï½Ðterm¡£

term£¨ÈÎÆÚ£©ÒÔÑ¡¾Ù£¨election£©¿ªÊ¼£¬È»ºó¾ÍÊÇÒ»¶Î»ò³¤»ò¶ÌµÄÎȶ¨¹¤×÷ÆÚ£¨normal Operation£©¡£´ÓÉÏͼ¿ÉÒÔ¿´µ½£¬ÈÎÆÚÊǵÝÔöµÄ£¬Õâ¾Í³äµ±ÁËÂß¼­Ê±ÖÓµÄ×÷Óã»ÁíÍ⣬term 3չʾÁËÒ»ÖÖÇé¿ö£¬¾ÍÊÇ˵ûÓÐÑ¡¾Ù³öleader¾Í½áÊøÁË£¬È»ºó»á·¢ÆðеÄÑ¡¾Ù£¬ºóÃæ»á½âÊÍÕâÖÖsplit voteµÄÇé¿ö¡£

Ñ¡¾Ù¹ý³ÌÏê½â

ÉÏÃæÒѾ­Ëµ¹ý£¬Èç¹ûfollowerÔÚelection timeoutÄÚûÓÐÊÕµ½À´×ÔleaderµÄÐÄÌø£¬£¨Ò²Ðí´Ëʱ»¹Ã»ÓÐÑ¡³öleader£¬´ó¼Ò¶¼Ôڵȣ»Ò²Ðíleader¹ÒÁË£»Ò²ÐíÖ»ÊÇleaderÓë¸ÃfollowerÖ®¼äÍøÂç¹ÊÕÏ£©£¬Ôò»áÖ÷¶¯·¢ÆðÑ¡¾Ù¡£²½ÖèÈçÏ£º

Ôö¼Ó½Úµã±¾µØµÄ current term £¬Çл»µ½candidate״̬

Ͷ×Ô¼ºÒ»Æ±

²¢ÐиøÆäËû½Úµã·¢ËÍ RequestVote RPCs

µÈ´ýÆäËû½ÚµãµÄ»Ø¸´

ÔÚÕâ¸ö¹ý³ÌÖУ¬¸ù¾ÝÀ´×ÔÆäËû½ÚµãµÄÏûÏ¢£¬¿ÉÄܳöÏÖÈýÖÖ½á¹û

ÊÕµ½majorityµÄͶƱ£¨º¬×Ô¼ºµÄһƱ£©£¬ÔòÓ®µÃÑ¡¾Ù£¬³ÉΪleader

±»¸æÖª±ðÈËÒѵ±Ñ¡£¬ÄÇô×ÔÐÐÇл»µ½follower

Ò»¶Îʱ¼äÄÚûÓÐÊÕµ½majorityͶƱ£¬Ôò±£³Öcandidate״̬£¬ÖØÐ·¢³öÑ¡¾Ù

µÚÒ»ÖÖÇé¿ö£¬Ó®µÃÁËÑ¡¾ÙÖ®ºó£¬ÐµÄleader»áÁ¢¿Ì¸øËùÓнڵ㷢ÏûÏ¢£¬¹ã¶ø¸æÖ®£¬±ÜÃâÆäÓà½Úµã´¥·¢ÐµÄÑ¡¾Ù¡£ÔÚÕâÀÏȻص½Í¶Æ±ÕßµÄÊӽǣ¬Í¶Æ±ÕßÈçºÎ¾ö¶¨ÊÇ·ñ¸øÒ»¸öÑ¡¾ÙÇëÇóÍ¶Æ±ÄØ£¬ÓÐÒÔÏÂÔ¼Êø£º

ÔÚÈÎÒ»ÈÎÆÚÄÚ£¬µ¥¸ö½Úµã×î¶àÖ»ÄÜͶһƱ

ºòÑ¡ÈËÖªµÀµÄÐÅÏ¢²»ÄܱÈ×Ô¼ºµÄÉÙ£¨ÕâÒ»²¿·Ö£¬ºóÃæ½éÉÜlog replicationºÍsafetyµÄʱºò»áÏêϸ½éÉÜ£©

first-come-first-served ÏÈÀ´ÏȵÃ

µÚ¶þÖÖÇé¿ö£¬±ÈÈçÓÐÈý¸ö½ÚµãA B C¡£A Bͬʱ·¢ÆðÑ¡¾Ù£¬¶øAµÄÑ¡¾ÙÏûÏ¢Ïȵ½´ïC£¬C¸øAͶÁËһƱ£¬µ±BµÄÏûÏ¢µ½´ïCʱ£¬ÒѾ­²»ÄÜÂú×ãÉÏÃæÌáµ½µÄµÚÒ»¸öÔ¼Êø£¬¼´C²»»á¸øBͶƱ£¬¶øAºÍBÏÔÈ»¶¼²»»á¸ø¶Ô·½Í¶Æ±¡£Aʤ³öÖ®ºó£¬»á¸øB,C·¢ÐÄÌøÏûÏ¢£¬½ÚµãB·¢ÏÖ½ÚµãAµÄterm²»µÍÓÚ×Ô¼ºµÄterm£¬ÖªµÀÓÐÒѾ­ÓÐLeaderÁË£¬ÓÚÊÇת»»³Éfollower¡£

µÚÈýÖÖÇé¿ö£¬Ã»ÓÐÈκνڵã»ñµÃmajorityͶƱ£¬±ÈÈçÏÂͼÕâÖÖÇé¿ö£º

×ܹ²ÓÐËĸö½Úµã£¬Node C¡¢Node Dͬʱ³ÉΪÁËcandidate£¬½øÈëÁËterm 4£¬µ«Node AͶÁËNodeDһƱ£¬NodeBͶÁËNode CһƱ£¬Õâ¾Í³öÏÖÁËÆ½Æ± split voteµÄÇé¿ö¡£Õâ¸öʱºò´ó¼Ò¶¼ÔڵȰ¡µÈ£¬Ö±µ½³¬Ê±ºóÖØÐ·¢ÆðÑ¡¾Ù¡£Èç¹û³öÏÖÆ½Æ±µÄÇé¿ö£¬ÄÇô¾ÍÑÓ³¤ÁËϵͳ²»¿ÉÓõÄʱ¼ä£¨Ã»ÓÐleaderÊDz»ÄÜ´¦Àí¿Í»§¶ËдÇëÇóµÄ£©£¬Òò´ËraftÒýÈëÁËrandomized election timeoutsÀ´¾¡Á¿±ÜÃâÆ½Æ±Çé¿ö¡£Í¬Ê±£¬leader-based ¹²Ê¶Ëã·¨ÖУ¬½ÚµãµÄÊýÄ¿¶¼ÊÇÆæÊý¸ö£¬¾¡Á¿±£Ö¤majorityµÄ³öÏÖ¡£

log replication

µ±ÓÐÁËleader£¬ÏµÍ³Ó¦¸Ã½øÈë¶ÔÍ⹤×÷ÆÚÁË¡£¿Í»§¶ËµÄÒ»ÇÐÇëÇóÀ´·¢Ë͵½leader£¬leaderÀ´µ÷¶ÈÕâЩ²¢·¢ÇëÇóµÄ˳Ðò£¬²¢ÇÒ±£Ö¤leaderÓëfollowers״̬µÄÒ»ÖÂÐÔ¡£raftÖеÄ×ö·¨ÊÇ£¬½«ÕâЩÇëÇóÒÔ¼°Ö´ÐÐ˳Ðò¸æÖªfollowers¡£leaderºÍfollowersÒÔÏàͬµÄ˳ÐòÀ´Ö´ÐÐÕâЩÇëÇ󣬱£Ö¤×´Ì¬Ò»Ö¡£

Replicated state machines

¹²Ê¶Ëã·¨µÄʵÏÖÒ»°ãÊÇ»ùÓÚ¸´ÖÆ×´Ì¬»ú£¨Replicated state machines£©£¬ºÎΪ¸´ÖÆ×´Ì¬»ú£º

If two identical, deterministic processes begin in the same state and get the same inputs in the same order, they will produce the same output and end in the same state.

¼òµ¥À´Ëµ£ºÏàͬµÄ³õʶ״̬ + ÏàͬµÄÊäÈë = ÏàͬµÄ½áÊø×´Ì¬¡£ÒýÎÄÖÐÓÐÒ»¸öºÜÖØÒªµÄ´Êdeterministic£¬¾ÍÊÇ˵²»Í¬½ÚµãÒªÒÔÏàͬÇÒÈ·¶¨ÐԵĺ¯ÊýÀ´´¦ÀíÊäÈ룬¶ø²»ÒªÒýÈëһϲ»È·¶¨µÄÖµ£¬±ÈÈç±¾µØÊ±¼äµÈ¡£ÈçºÎ±£Ö¤ËùÓнڵã get the same inputs in the same order£¬Ê¹ÓÃreplicated logÊÇÒ»¸öºÜ²»´íµÄ×¢Ò⣬log¾ßÓг־û¯¡¢±£ÐòµÄÌØµã£¬ÊÇ´ó¶àÊý·Ö²¼Ê½ÏµÍ³µÄ»ùʯ¡£

Òò´Ë£¬¿ÉÒÔÕâô˵£¬ÔÚraftÖУ¬leader½«¿Í»§¶ËÇëÇó£¨command£©·â×°µ½Ò»¸ö¸ölog entry£¬½«ÕâЩlog entries¸´ÖÆ£¨replicate£©µ½ËùÓÐfollower½Úµã£¬È»ºó´ó¼Ò°´Ïàͬ˳ÐòÓ¦Óã¨apply£©log entryÖеÄcommand£¬Ôò״̬¿Ï¶¨ÊÇÒ»Öµġ£

ÏÂͼÐÎÏóչʾÁËÕâÖÖlog-based replicated state machine

ÇëÇóÍêÕûÁ÷³Ì

µ±ÏµÍ³£¨leader£©ÊÕµ½Ò»¸öÀ´×Ô¿Í»§¶ËµÄдÇëÇ󣬵½·µ»Ø¸ø¿Í»§¶Ë£¬Õû¸ö¹ý³Ì´ÓleaderµÄÊÓ½ÇÀ´¿´»á¾­ÀúÒÔϲ½Ö裺

leader append log entry

leader issue AppendEntries RPC in parallel

leader wait for majority response

leader apply entry to state machine

leader reply to client

leader notify follower apply log

¿ÉÒÔ¿´µ½ÈÕÖ¾µÄÌá½»¹ý³ÌÓеãÀàËÆÁ½½×¶ÎÌá½»(2PC)£¬²»¹ýÓë2PCµÄÇø±ðÔÚÓÚ£¬leaderÖ»ÐèÒª´ó¶àÊý£¨majority£©½ÚµãµÄ»Ø¸´¼´¿É£¬ÕâÑùÖ»Òª³¬¹ýÒ»°ë½Úµã´¦ÓÚ¹¤×÷״̬Ôòϵͳ¾ÍÊÇ¿ÉÓõġ£

ÄÇôÈÕÖ¾ÔÚÿ¸ö½ÚµãÉÏÊÇʲôÑù×ÓµÄÄØ

²»ÄÑ¿´µ½£¬logsÓÉ˳Ðò±àºÅµÄlog entry×é³É £¬Ã¿¸ölog entry³ýÁ˰üº¬command£¬»¹°üº¬²úÉú¸Ãlog entryʱµÄleader term¡£´ÓÉÏͼ¿ÉÒÔ¿´µ½£¬Îå¸ö½ÚµãµÄÈÕÖ¾²¢²»ÍêȫһÖ£¬raftË㷨ΪÁ˱£Ö¤¸ß¿ÉÓ㬲¢²»ÊÇǿһÖÂÐÔ£¬¶øÊÇ×îÖÕÒ»ÖÂÐÔ£¬leader»á²»¶Ï³¢ÊÔ¸øfollower·¢log entries£¬Ö±µ½ËùÓнڵãµÄlog entries¶¼Ïàͬ¡£

ÔÚÉÏÃæµÄÁ÷³ÌÖУ¬leaderÖ»ÐèÒªÈÕÖ¾±»¸´ÖƵ½´ó¶àÊý½Úµã¼´¿ÉÏò¿Í»§¶Ë·µ»Ø£¬Ò»µ©Ïò¿Í»§¶Ë·µ»Ø³É¹¦ÏûÏ¢£¬ÄÇôϵͳ¾Í±ØÐë±£Ö¤log£¨ÆäʵÊÇlogËù°üº¬µÄcommand£©ÔÚÈκÎÒì³£µÄÇé¿ö϶¼²»»á·¢Éú»Ø¹ö¡£ÕâÀïÓÐÁ½¸ö´Ê£ºcommit£¨committed£©£¬apply(applied)£¬Ç°ÕßÊÇÖ¸ÈÕÖ¾±»¸´ÖƵ½ÁË´ó¶àÊý½ÚµãºóÈÕÖ¾µÄ״̬£»¶øºóÕßÔòÊǽڵ㽫ÈÕÖ¾Ó¦Óõ½×´Ì¬»ú£¬ÕæÕýÓ°Ïìµ½½Úµã״̬¡£

The leader decides when it is safe to apply a log entry to the state machines; such an entry is called committed. Raft guarantees that committed entries are durable and will eventually be executed by all of the available state machines. A log entry is committed once the leader that created the entry has replicated it on a majority of the servers

safety

ÔÚÉÏÃæÌáµ½Ö»ÒªÈÕÖ¾±»¸´ÖƵ½majority½Úµã£¬¾ÍÄܱ£Ö¤²»»á±»»Ø¹ö£¬¼´Ê¹ÔÚ¸÷ÖÖÒì³£Çé¿öÏ£¬Õâ¸ùleader electionÌáµ½µÄÑ¡¾ÙÔ¼ÊøÓйء£ÔÚÕâÒ»²¿·Ö£¬Ö÷ÒªÌÖÂÛraftЭÒéÔÚ¸÷ÖÖ¸÷ÑùµÄÒì³£Çé¿öÏÂÈçºÎ¹¤×÷µÄ¡£

ºâÁ¿Ò»¸ö·Ö²¼Ê½Ëã·¨£¬ÓÐÐí¶àÊôÐÔ£¬Èç

safety£ºnothing bad happens,

liveness£º something good eventually happens.

ÔÚÈκÎϵͳģÐÍÏ£¬¶¼ÐèÒªÂú×ãsafetyÊôÐÔ£¬¼´ÔÚÈκÎÇé¿öÏ£¬ÏµÍ³¶¼²»ÄܳöÏÖ²»¿ÉÄæµÄ´íÎó£¬Ò²²»ÄÜÏò¿Í»§¶Ë·µ»Ø´íÎóµÄÄÚÈÝ¡£±ÈÈ磬raft±£Ö¤±»¸´ÖƵ½´ó¶àÊý½ÚµãµÄÈÕÖ¾²»»á±»»Ø¹ö£¬ÄÇô¾ÍÊÇsafetyÊôÐÔ¡£¶øraft×îÖÕ»áÈÃËùÓнڵã״̬һÖ£¬ÕâÊôÓÚlivenessÊôÐÔ¡£

raftЭÒé»á±£Ö¤ÒÔÏÂÊôÐÔ

Election safety

Ñ¡¾Ù°²È«ÐÔ£¬¼´ÈÎÒ»ÈÎÆÚÄÚ×î¶àÒ»¸öleader±»Ñ¡³ö¡£ÕâÒ»µã·Ç³£ÖØÒª£¬ÔÚÒ»¸ö¸´ÖƼ¯ÖÐÈκÎʱ¿ÌÖ»ÄÜÓÐÒ»¸öleader¡£ÏµÍ³ÖÐͬʱÓжàÓàÒ»¸öleader£¬±»³ÆÖ®ÎªÄÔÁÑ£¨brain split£©£¬ÕâÊǷdz£ÑÏÖØµÄÎÊÌ⣬»áµ¼ÖÂÊý¾ÝµÄ¸²¸Ç¶ªÊ§¡£ÔÚraftÖУ¬Á½µã±£Ö¤ÁËÕâ¸öÊôÐÔ£º

Ò»¸ö½ÚµãijһÈÎÆÚÄÚ×î¶àÖ»ÄÜͶһƱ£»

Ö»ÓлñµÃmajorityͶƱµÄ½Úµã²Å»á³ÉΪleader¡£

Òò´Ë£¬Ä³Ò»ÈÎÆÚÄÚÒ»¶¨Ö»ÓÐÒ»¸öleader¡£

log matching

ºÜÓÐÒâ˼£¬logÆ¥ÅäÌØÐÔ£¬ ¾ÍÊÇ˵Èç¹ûÁ½¸ö½ÚµãÉϵÄij¸ölog entryµÄlog indexÏàͬÇÒtermÏàͬ£¬ÄÇôÔÚ¸Ãindex֮ǰµÄËùÓÐlog entryÓ¦¸Ã¶¼ÊÇÏàͬµÄ¡£ÈçºÎ×öµ½µÄ£¿ÒÀÀµÓÚÒÔÏÂÁ½µã

If two entries in different logs have the same index and term, then they store the same command.

If two entries in different logs have the same index and term, then the logs are identical in all preceding entries.

Ê×ÏÈ£¬leaderÔÚijһtermµÄÈÎһλÖÃÖ»»á´´½¨Ò»¸ölog entry£¬ÇÒlog entryÊÇappend-only¡£Æä´Î£¬consistency check¡£leaderÔÚAppendEntriesÖаüº¬×îÐÂlog entry֮ǰµÄÒ»¸ölog µÄtermºÍindex£¬Èç¹ûfollowerÔÚ¶ÔÓ¦µÄterm indexÕÒ²»µ½ÈÕÖ¾£¬ÄÇô¾Í»á¸æÖªleader²»Ò»Ö¡£

ÔÚûÓÐÒì³£µÄÇé¿öÏ£¬log matchingÊǺÜÈÝÒ×Âú×ãµÄ£¬µ«Èç¹û³öÏÖÁËnode crash£¬Çé¿ö¾Í»á±äµÃ¸ºÔð¡£±ÈÈçÏÂͼ

×¢Ò⣺ÉÏͼµÄa-f²»ÊÇ6¸öfollower£¬¶øÊÇij¸öfollower¿ÉÄÜ´æÔÚµÄÁù¸ö״̬

leader¡¢follower¶¼¿ÉÄÜcrash£¬ÄÇôfollowerά»¤µÄÈÕÖ¾ÓëleaderÏà±È¿ÉÄܳöÏÖÒÔÏÂÇé¿ö

±ÈleaderÈÕÖ¾ÉÙ£¬ÈçÉÏͼÖеÄab

±ÈleaderÈÕÖ¾¶à£¬ÈçÉÏͼÖеÄcd

ijЩλÖñÈleader¶à£¬Ä³Ð©ÈÕÖ¾±ÈleaderÉÙ£¬Èçef£¨¶àÉÙÊÇÕë¶ÔijһÈÎÆÚ¶øÑÔ£©

µ±³öÏÖÁËleaderÓëfollower²»Ò»ÖµÄÇé¿ö£¬leaderÇ¿ÖÆfollower¸´ÖÆ×Ô¼ºµÄlog

To bring a follower¡¯s log into consistency with its own, the leader must find the latest log entry where the two logs agree, delete any entries in the follower¡¯s log after that point, and send the follower all of the leader¡¯s entries after that point.

leader»áά»¤Ò»¸önextIndex[]Êý×飬¼Ç¼ÁËleader¿ÉÒÔ·¢ËÍÿһ¸öfollowerµÄlog index£¬³õʼ»¯Îªeader×îºóÒ»¸ölog index¼Ó1£¬ Ç°ÃæÒ²Ìáµ½£¬leaderÑ¡¾Ù³É¹¦Ö®ºó»áÁ¢¼´¸øËùÓÐfollower·¢ËÍAppendEntries RPC£¨²»°üº¬ÈκÎlog entry£¬ Ò²³äµ±ÐÄÌøÏûÏ¢£©,ÄÇôÁ÷³Ì×ܽáΪ£º

s1 leader ³õʼ»¯nextIndex[x]Ϊ leader×îºóÒ»¸ölog index + 1

s2 AppendEntriesÀïprevLogTerm prevLogIndexÀ´×Ô logs[nextIndex[x] - 1]

s3 Èç¹ûfollowerÅжÏprevLogIndexλÖõÄlog term²»µÈÓÚprevLogTerm£¬ÄÇô·µ»Ø False£¬·ñÔò·µ»ØTrue

s4 leaderÊÕµ½followerµÄ»Ø¸´£¬Èç¹û·µ»ØÖµÊÇFalse£¬ÔònextIndex[x] -= 1, Ìø×ªµ½s2. ·ñÔò

s5 ͬ²½nextIndex[x]ºóµÄËùÓÐlog entries

leader completeness vs elcetion restriction

leaderÍêÕûÐÔ£ºÈç¹ûÒ»¸ölog entryÔÚij¸öÈÎÆÚ±»Ìá½»£¨committed£©£¬ÄÇôÕâÌõÈÕÖ¾Ò»¶¨»á³öÏÖÔÚËùÓиü¸ßtermµÄleaderµÄÈÕÖ¾ÀïÃæ¡£Õâ¸ö¸úleader election¡¢log replication¶¼Óйء£

Ò»¸öÈÕÖ¾±»¸´ÖƵ½majority½Úµã²ÅËãcommitted

Ò»¸ö½ÚµãµÃµ½majorityµÄͶƱ²ÅÄܳÉΪleader£¬¶ø½ÚµãA¸ø½ÚµãBͶƱµÄÆäÖÐÒ»¸öǰÌáÊÇ£¬BµÄÈÕÖ¾²»ÄܱÈAµÄÈÕÖ¾¾É¡£ÏÂÃæµÄÒýÎÄÖ¸´¦ÁËÈçºÎÅжÏÈÕÖ¾µÄоÉ

voter denies its vote if its own log is more up-to-date than that of the candidate.

If the logs have last entries with different terms, then the log with the later term is more up-to-date. If the logs end with the same term, then whichever log is longer is more up-to-date.

ÉÏÃæÁ½µã¶¼Ìáµ½ÁËmajority£ºcommit majority and vote majority£¬¸ù¾ÝQuorum£¬ÕâÁ½¸ömajorityÒ»¶¨ÊÇÓÐÖØºÏµÄ£¬Òò´Ë±»Ñ¡¾Ù³öµÄleaderÒ»¶¨°üº¬ÁË×îеÄcommittedµÄÈÕÖ¾¡£

raftÓëÆäËûЭÒ飨Viewstamped Replication¡¢mongodb£©²»Í¬£¬raftʼÖÕ±£Ö¤leade°üº¬×îеÄÒÑÌá½»µÄÈÕÖ¾£¬Òò´Ëleader²»»á´Ófollower catchupÈÕÖ¾£¬ÕâÒ²´ó´ó¼ò»¯ÁËϵͳµÄ¸´ÔÓ¶È¡£

corner case

stale leader

raft±£Ö¤Election safety£¬¼´Ò»¸öÈÎÆÚÄÚ×î¶àÖ»ÓÐÒ»¸öleader£¬µ«ÔÚÍøÂç·Ö¸î£¨network partition£©µÄÇé¿öÏ£¬¿ÉÄÜ»á³öÏÖÁ½¸öleader£¬µ«Á½¸öleaderËù´¦µÄÈÎÆÚÊDz»Í¬µÄ¡£ÈçÏÂͼËùʾ

ϵͳÓÐ5¸ö½ÚµãABCDE×é³É£¬ÔÚterm1£¬Node BÊÇleader£¬µ«Node A¡¢BºÍNode C¡¢D¡¢EÖ®¼ä³öÏÖÁËÍøÂç·Ö¸î£¬Òò´ËNode C¡¢D¡¢EÎÞ·¨ÊÕµ½À´×Ôleader£¨Node B£©µÄÏûÏ¢£¬ÔÚelection timeÖ®ºó£¬Node C¡¢D¡¢E»á·ÖÆÚÑ¡¾Ù£¬ÓÉÓÚÂú×ãmajorityÌõ¼þ£¬Node E³ÉΪÁËterm 2µÄleader¡£Òò´Ë£¬ÔÚϵͳÖÐÃ²ËÆ³öÏÖÁËÁ½¸öleader£ºterm 1µÄNode B£¬ term 2µÄNode E, Node BµÄterm¸ü¾É£¬µ«ÓÉÓÚÎÞ·¨ÓëMajority½ÚµãͨÐÅ£¬NodeBÈÔÈ»»áÈÏΪ×Ô¼ºÊÇleader¡£

ÔÚÕâÑùµÄÇé¿öÏ£¬ÎÒÃÇÀ´¿¼ÂǶÁд¡£

Ê×ÏÈ£¬Èç¹û¿Í»§¶Ë½«ÇëÇó·¢Ë͵½ÁËNodeB£¬NodeBÎÞ·¨½«log entry ¸´ÖƵ½majority½Úµã£¬Òò´Ë²»»á¸æË߿ͻ§¶ËдÈë³É¹¦£¬Õâ¾Í²»»áÓÐÎÊÌâ¡£

¶ÔÓÚ¶ÁÇëÇó£¬stale leader¿ÉÄÜ·µ»Østale data£¬±ÈÈçÔÚread-after-writeµÄÒ»ÖÂÐÔÒªÇóÏ£¬¿Í»§¶ËдÈëµ½ÁËterm2ÈÎÆÚµÄleader Node E£¬µ«¶ÁÇëÇó·¢Ë͵½ÁËNode B¡£Èç¹ûÒª±£Ö¤²»·µ»Østale data£¬leaderÐèÒªcheck×Ô¼ºÊÇ·ñ¹ýʱÁË£¬°ì·¨¾ÍÊÇÓë´ó¶àÊý½ÚµãͨÐÅÒ»´Î£¬Õâ¸ö¿ÉÄÜ»á³öÏÖЧÂÊÎÊÌâ¡£ÁíÒ»ÖÖ·½Ê½ÊÇʹÓÃlease£¬µ«Õâ¾Í»áÒÀÀµÎïÀíʱÖÓ¡£

´ÓraftµÄÂÛÎÄÖпÉÒÔ¿´µ½£¬leaderת»»³ÉfollowerµÄÌõ¼þÊÇÊÕµ½À´×Ô¸ü¸ßtermµÄÏûÏ¢£¬Èç¹ûÍøÂç·Ö¸îÒ»Ö±³ÖÐø£¬ÄÇôstale leader¾Í»áÒ»Ö±´æÔÚ¡£¶øÔÚraftµÄһЩʵÏÖ»òÕßraft-likeЭÒéÖУ¬leaderÈç¹ûÊÕ²»µ½majority½ÚµãµÄÏûÏ¢£¬ÄÇô¿ÉÒÔ×Ô¼ºstep down£¬×ÔÐÐת»»µ½follower״̬¡£

State Machine Safety

Ç°ÃæÔÚ½éÉÜsafetyµÄʱºòÓÐÒ»ÌõÊôÐÔûÓÐÏêϸ½éÉÜ£¬ÄǾÍÊÇState Machine Safety£º

State Machine Safety: if a server has applied a log entry at a given index to its state machine, no other server will ever apply a different log entry for the same index.

Èç¹û½Úµã½«Ä³Ò»Î»ÖõÄlog entryÓ¦Óõ½ÁË״̬»ú£¬ÄÇôÆäËû½ÚµãÔÚͬһλÖò»ÄÜÓ¦Óò»Í¬µÄÈÕÖ¾¡£¼òµ¥µãÀ´Ëµ£¬ËùÓнڵãÔÚͬһλÖã¨index in log entries£©Ó¦¸ÃÓ¦ÓÃͬÑùµÄÈÕÖ¾¡£µ«ÊÇËÆºõÓÐijЩÇé¿ö»áÎ¥±³Õâ¸öÔ­Ôò£º

ÉÏͼÊÇÒ»¸ö½ÏΪ¸´ÔÓµÄÇé¿ö¡£ÔÚʱ¿Ì(a), s1ÊÇleader£¬ÔÚterm2Ìá½»µÄÈÕÖ¾Ö»¸³Öµµ½ÁËs1 s2Á½¸ö½Úµã¾ÍcrashÁË¡£ÔÚʱ¿Ì£¨b), s5³ÉΪÁËterm 3µÄleader£¬ÈÕÖ¾Ö»¸³Öµµ½ÁËs5£¬È»ºócrash¡£È»ºóÔÚ(c)ʱ¿Ì£¬s1ÓÖ³ÉΪÁËterm 4µÄleader£¬¿ªÊ¼¸³ÖµÈÕÖ¾£¬ÓÚÊǰÑterm2µÄÈÕÖ¾¸´ÖƵ½ÁËs3£¬´Ë¿Ì£¬¿ÉÒÔ¿´³öterm2¶ÔÓ¦µÄÈÕÖ¾ÒѾ­±»¸´ÖƵ½ÁËmajority£¬Òò´ËÊÇcommitted£¬¿ÉÒÔ±»×´Ì¬»úÓ¦Óᣲ»ÐÒµÄÊÇ£¬½ÓÏÂÀ´£¨d£©Ê±¿Ì£¬s1ÓÖcrashÁË£¬s5ÖØÐµ±Ñ¡£¬È»ºó½«term3µÄÈÕÖ¾¸´ÖƵ½ËùÓнڵ㣬Õâ¾Í³öÏÖÁËÒ»ÖÖÆæ¹ÖµÄÏÖÏ󣺱»¸´ÖƵ½´ó¶àÊý½Úµã£¨»òÕß˵¿ÉÄÜÒѾ­Ó¦Ó㩵ÄÈÕÖ¾±»»Ø¹ö¡£

¾¿Æä¸ù±¾£¬ÊÇÒòΪterm4ʱµÄleader s1ÔÚ£¨C£©Ê±¿ÌÌá½»ÁË֮ǰterm2ÈÎÆÚµÄÈÕÖ¾¡£ÎªÁ˶žøÕâÖÖÇé¿öµÄ·¢Éú£º

Raft never commits log entries from previous terms by counting replicas.

Only log entries from the leader¡¯s current term are committed by counting replicas; once an entry from the current term has been committed in this way, then all prior entries are committed indirectly because of the Log Matching Property.

Ò²¾ÍÊÇ˵£¬Ä³¸öleaderÑ¡¾Ù³É¹¦Ö®ºó£¬²»»áÖ±½ÓÌύǰÈÎleaderʱÆÚµÄÈÕÖ¾£¬¶øÊÇͨ¹ýÌá½»µ±Ç°ÈÎÆÚµÄÈÕÖ¾µÄʱºò¡°Ë³ÊÖ¡±°Ñ֮ǰµÄÈÕÖ¾Ò²Ìá½»ÁË£¬¾ßÌåÔõôʵÏÖÁË£¬ÔÚlog matching²¿·ÖÓÐÏêϸ½éÉÜ¡£ÄÇôÎÊÌâÀ´ÁË£¬Èç¹ûleader±»Ñ¡¾ÙºóûÓÐÊÕµ½¿Í»§¶ËµÄÇëÇóÄØ£¬ÂÛÎÄÖÐÓÐÌáµ½£¬ÔÚÈÎÆÚ¿ªÊ¼µÄʱºò·¢Á¢¼´³¢ÊÔ¸´ÖÆ¡¢Ìá½»Ò»Ìõ¿ÕµÄlog

Raft handles this by having each leader commit a blank no-op entry into the log at the start of its term.

Òò´Ë£¬ÔÚÉÏͼÖУ¬²»»á³öÏÖ£¨C£©Ê±¿ÌµÄÇé¿ö£¬¼´term4ÈÎÆÚµÄleader s1²»»á¸´ÖÆterm2µÄÈÕÖ¾µ½s3¡£¶øÊÇÈçͬ(e)ÃèÊöµÄÇé¿ö£¬Í¨¹ý¸´ÖÆ-Ìá½» term4µÄÈÕ־˳±ãÌá½»term2µÄÈÕÖ¾¡£Èç¹ûterm4µÄÈÕÖ¾Ìá½»³É¹¦£¬ÄÇôterm2µÄÈÕÖ¾Ò²Ò»¶¨Ìá½»³É¹¦£¬´Ëʱ¼´Ê¹s1crash£¬s5Ò²²»»áÖØÐµ±Ñ¡¡£

leader crash

followerµÄcrash´¦Àí·½Ê½Ïà¶Ô¼òµ¥£¬leaderÖ»Òª²»Í£µÄ¸øfollower·¢ÏûÏ¢¼´¿É¡£µ±leader crashµÄʱºò£¬ÊÂÇé¾Í»á±äµÃ¸´ÔÓ¡£ÔÚÕâÆªÎÄÕÂÖУ¬×÷Õ߾͸ø³öÁËÒ»¸ö¸üÐÂÇëÇóµÄÁ÷³Ìͼ¡£

Àý×Ó

ÎÒÃÇ¿ÉÒÔ·ÖÎöleaderÔÚÈÎÒâʱ¿ÌcrashµÄÇé¿ö£¬ÓÐÖúÓÚÀí½âraftËã·¨µÄÈÝ´íÐÔ¡£

×ܽá

raft½«¹²Ê¶ÎÊÌâ·Ö½â³ÉÁ½¸öÏà¶Ô¶ÀÁ¢µÄÎÊÌ⣬leader election£¬log replication¡£Á÷³ÌÊÇÏÈÑ¡¾Ù³öleader£¬È»ºóleader¸ºÔð¸´ÖÆ¡¢Ìá½»log£¨logÖаüº¬command£©

ΪÁËÔÚÈκÎÒì³£Çé¿öÏÂϵͳ²»³ö´í£¬¼´Âú×ãsafetyÊôÐÔ£¬¶Ôleader election£¬log replicationÁ½¸ö×ÓÎÊÌâÓÐÖî¶àÔ¼Êø

leader electionÔ¼Êø£º

ͬһÈÎÆÚÄÚ×î¶àÖ»ÄÜͶһƱ£¬ÏÈÀ´ÏȵÃ

Ñ¡¾ÙÈ˱ØÐë±È×Ô¼ºÖªµÀµÄ¸ü¶à£¨±È½Ïterm£¬log index£©

log replicationÔ¼Êø£º

Ò»¸ölog±»¸´ÖƵ½´ó¶àÊý½Úµã£¬¾ÍÊÇcommitted£¬±£Ö¤²»»á»Ø¹ö

leaderÒ»¶¨°üº¬×îеÄcommitted log£¬Òò´ËleaderÖ»»á×·¼ÓÈÕÖ¾£¬²»»áɾ³ý¸²¸ÇÈÕÖ¾

²»Í¬½Úµã£¬Ä³¸öλÖÃÉÏÈÕÖ¾Ïàͬ£¬ÄÇôÕâ¸öλÖÃ֮ǰµÄËùÓÐÈÕÖ¾Ò»¶¨ÊÇÏàͬµÄ

Raft never commits log entries from previous terms by counting replicas.

±¾ÎÄÊÇÔÚ¿´ÍêraftÂÛÎĺó×Ô¼ºµÄ×ܽᣬ²»Ò»¶¨È«Ãæ¡£¸öÈ˾õµÃ£¬Èç¹ûÖ»ÊÇÏà¶ÔraftЭÒéÓÐÒ»¸ö¼òµ¥Á˽⣬¿´Õâ¸ö¶¯»­ÑÝʾ¾Í×ã¹»ÁË£¬Èç¹ûÏëÉîÈëÁ˽⣬»¹ÊÇÒª¿´ÂÛÎÄ£¬ÂÛÎÄÖÐFigure 2¶ÔraftËã·¨½øÐÐÁ˸ÅÀ¨¡£×îºó£¬»¹ÊÇÕÒÒ»¸öʵÏÖÁËraftËã·¨µÄϵͳÀ´¿´¿´¸üºÃ¡£

   
1933 ´Îä¯ÀÀ       27
????

HTTP????
nginx??????
SD-WAN???
5G?????
 
????

??????????
IPv6???????
??????????
???????
????

????????
????????
???????????????
??????????
×îл¼Æ»®
DeepSeekÔÚÈí¼þ²âÊÔÓ¦ÓÃʵ¼ù 4-12[ÔÚÏß]
DeepSeek´óÄ£ÐÍÓ¦Óÿª·¢Êµ¼ù 4-19[ÔÚÏß]
UAF¼Ü¹¹ÌåϵÓëʵ¼ù 4-11[±±¾©]
AIÖÇÄÜ»¯Èí¼þ²âÊÔ·½·¨Óëʵ¼ù 5-23[ÉϺ£]
»ùÓÚ UML ºÍEA½øÐзÖÎöÉè¼Æ 4-26[±±¾©]
ÒµÎñ¼Ü¹¹Éè¼ÆÓ뽨ģ 4-18[±±¾©]
 
×îÐÂÎÄÕÂ
ÔÆÔ­Éú¼Ü¹¹¸ÅÊö
K8S¸ß¿ÉÓü¯Èº¼Ü¹¹ÊµÏÖ
ÈÝÆ÷ÔÆ¹ÜÀíÖ®K8S¼¯Èº¸ÅÊö
k8s-ÕûÌå¸ÅÊöºÍ¼Ü¹¹
Ê®·ÖÖÓѧ»áÓÃdocker²¿Êð΢·þÎñ
×îпγÌ
ÔÆ¼ÆË㡢΢·þÎñÓë·Ö²¼Ê½¼Ü¹¹
Æóҵ˽ÓÐÔÆÔ­ÀíÓë¹¹½¨
»ùÓÚKubernetesµÄDevOpsʵ¼ù
ÔÆÆ½Ì¨¼Ü¹¹ÓëÓ¦Ó㨰¢ÀïÔÆ£©
Docker²¿Êð±»²âϵͳÓë×Ô¶¯»¯¿ò¼Üʵ¼ù
³É¹¦°¸Àý
±±¾© ÔÆÆ½Ì¨Óë΢·þÎñ¼Ü¹¹Éè¼Æ
ͨÓù«Ë¾GE DockerÔ­ÀíÓëʵ¼ùÅàѵ
ij¾ü¹¤Ñо¿µ¥Î» MDA£¨Ä£ÐÍÇý¶¯¼Ü¹¹£©
ÖªÃûÏû·Ñ½ðÈÚ¹«Ë¾ ÁìÓòÇý¶¯Éè¼Æ
ÉîÛÚijÆû³µÆóÒµ Ä£ÐÍÇý¶¯µÄ·ÖÎöÉè¼Æ