Äú¿ÉÒÔ¾èÖú£¬Ö§³ÖÎÒÃǵĹ«ÒæÊÂÒµ¡£

1Ôª 10Ôª 50Ôª





ÈÏÖ¤Â룺  ÑéÖ¤Âë,¿´²»Çå³þ?Çëµã»÷Ë¢ÐÂÑéÖ¤Âë ±ØÌî



  ÇóÖª ÎÄÕ ÎÄ¿â Lib ÊÓÆµ iPerson ¿Î³Ì ÈÏÖ¤ ×Éѯ ¹¤¾ß ½²×ù Modeler   Code  
»áÔ±   
 
   
 
 
     
   
 ¶©ÔÄ
  ¾èÖú
ÂÛSparkStreamingµÄÊý¾Ý¿É¿¿ÐÔºÍÒ»ÖÂÐÔ
 
×÷Õß Ò¶ç÷  À´Ô´:³ÌÐòÔ±ÔÓÖ¾  »ðÁú¹ûÈí¼þ  ·¢²¼ÓÚ 2015-6-26
  2145  次浏览      29
 

ÕªÒª£ºÑÛÏ´óÊý¾ÝÁìÓò×îÈÈÃŵĴʻãÖ®Ò»±ãÊÇÁ÷¼ÆËãÁË£¬¶øÆäÖÐ×îÒ«ÑÛµÄÎÞÒÉÊÇÀ´×ÔSparkÉçÇøµÄSparkStreamingÏîÄ¿¡£ ¶ÔÓÚÁ÷¼ÆËã¶øÑÔ£¬×îºËÐĵÄÌØµãºÁÎÞÒÉÎʾÍÊÇËü¶ÔµÍʱµÄÐèÇ󣬵«ÕâÒ²´øÀ´ÁËÏà¹ØµÄÊý¾Ý¿É¿¿ÐÔÎÊÌâ¡£

2Driver HA

ÓÉÓÚÁ÷¼ÆËãϵͳÊdz¤ÆÚÔËÐС¢ÇÒ²»¶ÏÓÐÊý¾ÝÁ÷È룬Òò´ËÆäSparkÊØ»¤½ø³Ì£¨Driver£©µÄ¿É¿¿ÐÔÖÁ¹ØÖØÒª£¬Ëü¾ö¶¨ÁËStreaming³ÌÐòÄÜ·ñÒ»Ö±ÕýÈ·µØÔËÐÐÏÂÈ¥¡£

DriverʵÏÖHAµÄ½â¾ö·½°¸¾ÍÊǽ«ÔªÊý¾Ý³Ö¾Ã»¯£¬ÒÔ±ãÖØÆôºóµÄ״̬»Ö¸´¡£ÈçͼһËùʾ£¬Driver³Ö¾Ã»¯µÄÔªÊý¾Ý°üÀ¨£º

BlockÔªÊý¾Ý£¨Í¼1ÖеÄÂÌÉ«¼ýÍ·£©£ºReceiver´ÓÍøÂçÉϽÓÊÕµ½µÄÊý¾Ý£¬×é×°³ÉBlockºó²úÉúµÄBlockÔªÊý¾Ý£»

CheckpointÊý¾Ý£¨Í¼1ÖеijÈÉ«¼ýÍ·£©£º°üÀ¨ÅäÖÃÏî¡¢DStream²Ù×÷¡¢Î´Íê³ÉµÄBatch״̬¡¢ºÍÉú³ÉµÄRDDÊý¾ÝµÈ£»

Driverʧ°ÜÖØÆôºó£º

»Ö¸´¼ÆË㣨ͼ2ÖеijÈÉ«¼ýÍ·£©£ºÊ¹ÓÃCheckpointÊý¾ÝÖØÆôdriver£¬ÖØÐ¹¹ÔìÉÏÏÂÎIJ¢ÖØÆô½ÓÊÕÆ÷¡£

»Ö¸´ÔªÊý¾Ý¿é£¨Í¼2ÖеÄÂÌÉ«¼ýÍ·£©£º»Ö¸´BlockÔªÊý¾Ý¡£

»Ö¸´Î´Íê³ÉµÄ×÷Òµ£¨Í¼2ÖеĺìÉ«¼ýÍ·£©£ºÊ¹Óûָ´³öÀ´µÄÔªÊý¾Ý£¬ÔٴβúÉúRDDºÍ¶ÔÓ¦µÄjob£¬È»ºóÌá½»µ½Spark¼¯ÈºÖ´ÐС£

ͨ¹ýÈçÉϵÄÊý¾Ý±¸·ÝºÍ»Ö¸´»úÖÆ£¬DriverʵÏÖÁ˹ÊÕϺóÖØÆô¡¢ÒÀÈ»Äָܻ´StreamingÈÎÎñ¶ø²»¶ªÊ§Êý¾Ý£¬Òò´ËÌṩÁËϵͳ¼¶µÄÊý¾Ý¸ß¿É¿¿¡£

¿É¿¿µÄÉÏÏÂÓÎIOϵͳ

Á÷¼ÆËãÖ÷Ҫͨ¹ýÍøÂçsocketͨÐÅÀ´ÊµÏÖÓëÍⲿIOϵͳµÄÊý¾Ý½»»¥¡£ÓÉÓÚÍøÂçͨÐŵIJ»¿É¿¿Ìص㣬·¢ËͶËÓë½ÓÊÕ¶ËÐèҪͨ¹ýÒ»¶¨µÄЭÒéÀ´±£Ö¤Êý¾Ý°üµÄ½ÓÊÕÈ·ÈϺÍʧ°ÜÖØ·¢»úÖÆ¡£

²»ÊÇËùÓеÄIOϵͳ¶¼Ö§³ÖÖØ·¢£¬ÕâÖÁÉÙÐèҪʵÏÖÊý¾ÝÁ÷µÄ³Ö¾Ã»¯£¬Í¬Ê±»¹ÒªÊµÏÖ¸ßÍÌͺ͵ÍʱÑÓ¡£ÔÚSparkStreaming¹Ù·½Ö§³ÖµÄdata sourceÀïÃæ£¬ÄÜͬʱÂú×ãÕâЩҪÇóµÄÖ»ÓÐKafka£¬Òò´ËÔÚ×î½üµÄSparkStreaming releaseÀïÃæ£¬Ò²ÊǰÑKafkaµ±³ÉÍÆ¼öµÄÍⲿÊý¾Ýϵͳ¡£

³ýÁ˰ÑKafkaµ±³ÉÊäÈëÊý¾ÝÔ´£¨inbound data source£©Ö®Í⣬ͨ³£Ò²½«Æä×÷ΪÊä³öÊý¾ÝÔ´£¨outbound data source£©¡£ËùÓеÄʵʱϵͳ¶¼Í¨¹ýKafkaÕâ¸öMQÀ´×öÊý¾ÝµÄ¶©Ôĺͷַ¢£¬´Ó¶øÊµÏÖÁ÷Êý¾ÝÉú²úÕߺÍÏû·ÑÕߵĽâñî¡£

Ò»¸öµäÐÍµÄÆóÒµ´óÊý¾ÝÖÐÐÄÊý¾ÝÁ÷ÏòÊÓͼÈçͼ3Ëùʾ£º

³ýÁË´ÓÔ´Í·±£Ö¤Êý¾Ý¿ÉÖØ·¢Ö®Í⣬Kafka¸üÊÇÁ÷Êý¾ÝExact OnceÓïÒåµÄÖØÒª±£ÕÏ¡£KafkaÌṩÁËÒ»Ì׵ͼ¶API£¬Ê¹µÃclient¿ÉÒÔ·ÃÎÊtopicÊý¾ÝÁ÷µÄͬʱҲÄÜ·ÃÎÊÆäÔªÊý¾Ý¡£SparkStreamingÿ¸ö½ÓÊÕµÄÈÎÎñ¶¼¿ÉÒÔ´ÓÖ¸¶¨µÄKafka topic¡¢partitionºÍoffsetÈ¥»ñÈ¡Êý¾ÝÁ÷£¬¸÷¸öÈÎÎñµÄÊý¾Ý±ß½çºÜÇåÎú£¬ÈÎÎñʧ°Üºó¿ÉÒÔÖØÐÂÈ¥½ÓÊÕÕⲿ·ÖÊý¾Ý¶ø²»»á²úÉú¡°ÖصþµÄ¡±Êý¾Ý£¬Òò¶ø±£Ö¤ÁËÁ÷Êý¾Ý¡°ÓÐÇÒ½ö´¦ÀíÒ»´Î¡±¡£

¿É¿¿µÄ½ÓÊÕÆ÷

ÔÚSpark 1.3°æ±¾Ö®Ç°£¬SparkStreamingÊÇͨ¹ýÆô¶¯×¨ÓõÄReceiverÈÎÎñÀ´Íê³É´ÓKafka¼¯ÈºµÄÊý¾ÝÁ÷À­È¡¡£

ReceiverÈÎÎñÆô¶¯ºó£¬»áʹÓÃKafkaµÄ¸ß¼¶APIÀ´´´½¨topicMessageStreams¶ÔÏ󣬲¢ÖðÌõ¶ÁÈ¡Êý¾ÝÁ÷»º´æ£¬Ã¿¸öbatchInervalʱ¿Ìµ½À´Ê±ÓÉJobGeneratorÌá½»Éú³ÉÒ»¸öspark¼ÆËãÈÎÎñ¡£

ÓÉÓÚReceiverÈÎÎñ´æÔÚå´»ú·çÏÕ£¬Òò´ËSparkÌṩÁËÒ»¸ö¸ß¼¶µÄ¿É¿¿½ÓÊÕÆ÷-ReliableKafkaReceiverÀàÐÍÀ´ÊµÏÖ¿É¿¿µÄÊý¾ÝÊÕÈ¡£¬ËüÀûÓÃÁËSpark 1.2ÌṩµÄWAL£¨Write Ahead Log£©¹¦ÄÜ£¬°Ñ½ÓÊÕµ½µÄÿһÅúÊý¾Ý³Ö¾Ã»¯µ½´ÅÅ̺󣬸üÐÂtopic-partitionµÄoffsetÐÅÏ¢£¬ÔÙÈ¥½ÓÊÕÏÂÒ»ÅúKafkaÊý¾Ý¡£ÍòÒ»Receiverʧ°Ü£¬ÖØÆôºó»¹ÄÜ´ÓWALÀïÃæ»Ö¸´³öÒѽÓÊÕµÄÊý¾Ý£¬´Ó¶ø±ÜÃâÁËReceiver½Úµãå´»úÔì³ÉµÄÊý¾Ý¶ªÊ§£¨ÒÔÏ´úÂëɾ³ýÁËϸ֦ĩ½ÚµÄÂß¼­£©£º

ÆôÓÃWALºó£¬ËäÈ»ReceiverµÄÊý¾Ý¿É¿¿ÐÔ·çÏÕ½µµÍÁË£¬µ«È´ÓÉÓÚ´ÅÅ̳־û¯´øÀ´µÄ¿ªÏú£¬ÏµÍ³ÕûÌåÍÌÍÂÂÊ»áÃ÷ÏÔϽµ¡£Òò´Ë£¬×îз¢²¼µÄSpark 1.3°æ±¾£¬SparkStreamingÔö¼ÓÁËʹÓÃDirect APIµÄ·½Ê½À´ÊµÏÖKafkaÊý¾ÝÔ´µÄ·ÃÎÊ¡£

ÒýÈëÁËDirect APIºó£¬SparkStreaming²»ÔÙÆô¶¯³£×¤µÄReceiver½ÓÊÕÈÎÎñ£¬¶øÊÇÖ±½Ó·ÖÅä¸øÃ¿¸öBatch¼°RDD×îеÄtopic partition offset¡£jobÆô¶¯ÔËÐкóExecutorʹÓÃKafkaµÄsimple consumer APIÈ¥»ñÈ¡ÄÇÒ»¶ÎoffsetµÄÊý¾Ý¡£

ÕâÑù×öµÄºÃ´¦²»½ö±ÜÃâÁËReceiverå´»ú´øÀ´Êý¾Ý¿É¿¿ÐԵķçÏÕ£¬Ò²ÓÉÓÚ±ÜÃâʹÓÃZooKeeper×öoffset¸ú×Ù£¬¶øÊµÏÖÁËÊý¾ÝµÄ¾«È·Ò»´ÎÐÔ£¨ÒÔÏ´úÂëɾ³ýÁËϸ֦ĩ½ÚµÄÂß¼­£©£º

ԤдÈÕÖ¾ Write Ahead Log

Spark 1.2¿ªÊ¼ÌṩԤдÈÕÖ¾ÄÜÁ¦£¬ÓÃÓÚReceiverÊý¾Ý¼°DriverÔªÊý¾ÝµÄ³Ö¾Ã»¯ºÍ¹ÊÕϻָ´¡£WALÖ®ËùÒÔÄÜÌṩ³Ö¾Ã»¯ÄÜÁ¦£¬ÊÇÒòΪËüÀûÓÃÁ˿ɿ¿µÄHDFS×öÊý¾Ý´æ´¢¡£

SparkStreamingԤдÈÕÖ¾»úÖÆµÄºËÐÄAPI°üÀ¨£º

¹ÜÀíWALÎļþµÄWriteAheadLogManager

¶Á/дWALµÄWriteAheadLogWriterºÍWriteAheadLogReader

»ùÓÚWALµÄRDD£ºWriteAheadLogBackedBlockRDD

»ùÓÚWALµÄPartition£ºWriteAheadLogBackedBlockRDDPartition

ÒÔÉϺËÐÄAPIÔÚÊý¾Ý½ÓÊպͻָ´½×¶ÎµÄ½»»¥Ê¾ÒâͼÈçͼ4Ëùʾ¡£

´ÓWriteAheadLogWriterµÄÔ´ÂëÀï¿ÉÒÔÇå³þ¿´µ½£¬Ã¿´ÎдÈëÒ»¿éÊý¾Ýbufferµ½HDFSºó¶¼»áµ÷ÓÃflush·½·¨È¥Ç¿ÖÆË¢Èë´ÅÅÌ£¬È»ºó²ÅȥȡÏÂÒ»¿éÊý¾Ý¡£Òò´Ëreceiver½ÓÊÕµÄÊý¾ÝÊÇ¿ÉÒÔ±£Ö¤³Ö¾Ã»¯µ½´ÅÅÌÁË£¬Òò¶ø×öµ½½ÏºÃµÄÊý¾Ý¿É¿¿ÐÔ¡£

½áÊøÓï

µÃÒæÓÚKafkaÕâÀà¿É¿¿µÄdata sourceÒÔ¼°×ÔÉíµÄcheckpoint/WALµÈ»úÖÆ£¬SparkStreamingµÄÊý¾Ý¿É¿¿ÐԵõ½Á˺ܺõı£Ö¤£¬Êý¾ÝÄܱ£Ö¤¡°ÖÁÉÙÒ»´Î¡±£¨at least once£©±»´¦Àí¡£µ«ÓÉÓÚÆäoutbound¶ËµÄÒ»ÖÂÐÔʵÏÖ»¹Î´ÍêÉÆ£¬Òò´ËExact onceÓïÒåÈÔÈ»²»Äܶ˵½¶Ë±£Ö¤¡£SparkStreamingÉçÇøÒѾ­ÔÚ¸ú½øÕâ¸öÌØÐÔµÄʵÏÖ£¨SPARK-4122£©£¬Ô¤¼ÆºÜ¿ì½«ºÏÈëtrunk·¢²¼¡£

×÷Õß¼ò½é£ºÒ¶ç÷£¬»ªÎªÈí¼þ¹«Ë¾Universe²úÆ·²¿¸ß¼¶¼Ü¹¹Ê¦£¬×¨×¢ÓÚ´óÊý¾Ýµ×²ã·Ö²¼Ê½´æ´¢ºÍ¼ÆËã»ù´¡ÉèÊ©£¬ÊÇ»ªÎªÈí¼þ¹«Ë¾Hadoop·¢ÐаæµÄÖ÷Òª¼Ü¹¹Ê¦£¬Ä¿Ç°ÐËȤµãÔÚÁ÷¼ÆËãÓëSpark¡£

   
2145 ´Îä¯ÀÀ       29
Ïà¹ØÎÄÕÂ

»ùÓÚEAµÄÊý¾Ý¿â½¨Ä£
Êý¾ÝÁ÷½¨Ä££¨EAÖ¸ÄÏ£©
¡°Êý¾Ýºþ¡±£º¸ÅÄî¡¢ÌØÕ÷¡¢¼Ü¹¹Óë°¸Àý
ÔÚÏßÉ̳ÇÊý¾Ý¿âϵͳÉè¼Æ ˼·+Ч¹û
 
Ïà¹ØÎĵµ

GreenplumÊý¾Ý¿â»ù´¡Åàѵ
MySQL5.1ÐÔÄÜÓÅ»¯·½°¸
ijµçÉÌÊý¾ÝÖÐ̨¼Ü¹¹Êµ¼ù
MySQL¸ßÀ©Õ¹¼Ü¹¹Éè¼Æ
Ïà¹Ø¿Î³Ì

Êý¾ÝÖÎÀí¡¢Êý¾Ý¼Ü¹¹¼°Êý¾Ý±ê×¼
MongoDBʵս¿Î³Ì
²¢·¢¡¢´óÈÝÁ¿¡¢¸ßÐÔÄÜÊý¾Ý¿âÉè¼ÆÓëÓÅ»¯
PostgreSQLÊý¾Ý¿âʵսÅàѵ
×îл¼Æ»®
DeepSeekÔÚÈí¼þ²âÊÔÓ¦ÓÃʵ¼ù 4-12[ÔÚÏß]
DeepSeek´óÄ£ÐÍÓ¦Óÿª·¢Êµ¼ù 4-19[ÔÚÏß]
UAF¼Ü¹¹ÌåϵÓëʵ¼ù 4-11[±±¾©]
AIÖÇÄÜ»¯Èí¼þ²âÊÔ·½·¨Óëʵ¼ù 5-23[ÉϺ£]
»ùÓÚ UML ºÍEA½øÐзÖÎöÉè¼Æ 4-26[±±¾©]
ÒµÎñ¼Ü¹¹Éè¼ÆÓ뽨ģ 4-18[±±¾©]

MySQLË÷Òý±³ºóµÄÊý¾Ý½á¹¹
MySQLÐÔÄܵ÷ÓÅÓë¼Ü¹¹Éè¼Æ
SQL ServerÊý¾Ý¿â±¸·ÝÓë»Ö¸´
ÈÃÊý¾Ý¿â·ÉÆðÀ´ 10´óDB2ÓÅ»¯
oracleµÄÁÙʱ±í¿Õ¼äдÂú´ÅÅÌ
Êý¾Ý¿âµÄ¿çƽ̨Éè¼Æ


²¢·¢¡¢´óÈÝÁ¿¡¢¸ßÐÔÄÜÊý¾Ý¿â
¸ß¼¶Êý¾Ý¿â¼Ü¹¹Éè¼ÆÊ¦
HadoopÔ­ÀíÓëʵ¼ù
Oracle Êý¾Ý²Ö¿â
Êý¾Ý²Ö¿âºÍÊý¾ÝÍÚ¾ò
OracleÊý¾Ý¿â¿ª·¢Óë¹ÜÀí


GE Çø¿éÁ´¼¼ÊõÓëʵÏÖÅàѵ
º½Ìì¿Æ¹¤Ä³×Ó¹«Ë¾ Nodejs¸ß¼¶Ó¦Óÿª·¢
ÖÐÊ¢Òæ»ª ׿Խ¹ÜÀíÕß±ØÐë¾ß±¸µÄÎåÏîÄÜÁ¦
ijÐÅÏ¢¼¼Êõ¹«Ë¾ PythonÅàѵ
ij²©²ÊITϵͳ³§ÉÌ Ò×ÓÃÐÔ²âÊÔÓëÆÀ¹À
ÖйúÓÊ´¢ÒøÐÐ ²âÊÔ³ÉÊì¶ÈÄ£Ðͼ¯³É(TMMI)
ÖÐÎïÔº ²úÆ·¾­ÀíÓë²úÆ·¹ÜÀí