Spark Streaming Applications and Hands-On Practice: A Complete Guide (Part II)

Source: CSDN. Published: 2017-7-19
 

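The Spark Streaming Applications and Practice series covers the following six topics:

1. Background and architecture redesign

2. Implementing the details in code and running the project

3. An introduction to Streaming monitoring and solving real problems

4. Load testing the project and related optimization

5. Continued Streaming optimization: HBase

6. Managing Streaming tasks

Part I was published as a separate article. This one is Part II and covers continued Streaming optimization with HBase and the management of Streaming tasks.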

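V. Continued Streaming Optimization: HBase

5.1 Configuring the WAL (WALog)

With the WAL disabled, writes could reach 200,000, but throughput was still not particularly stable and some batches still took a long time. It turned out that HBase was running a Compaction during those periods.

[Figure: Streaming statistics showing unstable processing time]

[Figure: statistics from the HBase UI]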

HBase follows a Log-Structured Merge Tree architecture. User data is written first to the WAL and then to an in-memory cache. Once certain conditions are met, the cached data is flushed to disk, producing a data file called an HFile. As more data is written, more flushes occur and the number of HFiles keeps growing. Too many data files increase the number of I/O operations a query needs, so HBase continually tries to merge these files. This merge process is called Compaction.

A Compaction selects some HFiles from one store of one region and merges them. The principle is simple: read the KeyValues from the files to be merged, sort them in ascending order, and write them into a new file. The newly generated file then replaces all of the merged files and serves requests in their place.

HBase divides Compaction into two types by the scale of the merge: Minor Compaction and Major Compaction.

1. Minor Compaction selects some small, adjacent StoreFiles and merges them into one larger StoreFile. Cells that are already Deleted or Expired are not processed during this step. The result of a Minor Compaction is fewer, larger StoreFiles.

2. Major Compaction merges all StoreFiles into a single StoreFile. It also cleans up three kinds of useless data: deleted data, data whose TTL has expired, and data whose version count exceeds the configured maximum. A Major Compaction usually runs for a long time and consumes a large amount of system resources, which has a significant impact on the applications above it. Production deployments therefore usually disable automatic Major Compaction and trigger it manually during off-peak hours.
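To make the difference between the two types concrete, here is a minimal Python sketch of the merge idea. It is only an illustration under simplifying assumptions, not how HBase is implemented: each store file is modeled as a list of `(row_key, timestamp, value)` tuples already sorted by key and then by newest timestamp first, `TOMBSTONE` is a hypothetical delete marker that masks all older versions of its key, and TTL expiry is left out.

```python
import heapq

TOMBSTONE = object()  # hypothetical delete marker, for illustration only


def sort_key(cell):
    """Order cells by row key, then newest timestamp first."""
    row_key, timestamp, _value = cell
    return (row_key, -timestamp)


def minor_compact(store_files):
    """Merge sorted store files into one sorted file, keeping every cell."""
    return list(heapq.merge(*store_files, key=sort_key))


def major_compact(store_files, max_versions=1):
    """Merge all store files and drop deleted and surplus-version cells."""
    compacted = []
    current_key, kept, deleted = None, 0, False
    for row_key, timestamp, value in minor_compact(store_files):
        if row_key != current_key:
            current_key, kept, deleted = row_key, 0, False
        if deleted:
            continue  # older than a delete marker for this key
        if value is TOMBSTONE:
            deleted = True  # masks every older version of this key
            continue
        if kept < max_versions:
            compacted.append((row_key, timestamp, value))
            kept += 1
    return compacted


file_1 = [("a", 1, "x"), ("c", 1, "z")]
file_2 = [("a", 2, "y"), ("b", 1, TOMBSTONE)]
merged = minor_compact([file_1, file_2])
cleaned = major_compact([file_1, file_2])
```

In this model `merged` still contains the old version of `a` and the delete marker for `b`, while `cleaned` keeps only the newest live cell per key, which mirrors why a Major Compaction does more work.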

5.2 Tuning Compaction

Production environments usually disable automatic major_compact (set hbase.hregion.majorcompaction to 0 in the configuration file) and run major_compact manually in a time window at night when there are few users.

Manual trigger: major_compact 'testtable'

If HBase is not updated very frequently, you can run major_compact on all tables once a week. One way to decide when: after a major_compact completes, note the total number of StoreFiles. When the count has grown to nearly twice that figure, run major_compact on all tables again. The operation takes a long time, so avoid peak hours.

[Figure: statistics view]
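The StoreFile rule of thumb above can be written as a small check. This is only a sketch: the function name is hypothetical, the counts are assumed to be read by hand from the HBase UI, and the default factor of 2.0 is my reading of "nearly twice", so lower it if you want to act earlier.

```python
def needs_major_compact(baseline_storefiles, current_storefiles, growth_factor=2.0):
    """Return True once the StoreFile count has grown by growth_factor.

    baseline_storefiles: count noted right after the last major_compact.
    current_storefiles:  count observed now.
    """
    if baseline_storefiles <= 0:
        raise ValueError("baseline_storefiles must be positive")
    return current_storefiles >= baseline_storefiles * growth_factor


# Example: 40 StoreFiles after the last major_compact, 85 now.
due = needs_major_compact(40, 85)
```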

Compaction is triggered in three ways:

1. After a memstore flush

2. By a client, through the shell or the API

3. Periodically, by the background thread CompactionChecker

[Figures: statistics views]

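The checker runs with a period of hbase.server.thread.wakefrequency × hbase.server.compactchecker.interval.multiplier and may trigger a compaction each time. Some further conditions apply as well, and they can be read in the source code.

The condition it verifies is whether the time falls in this range: mcTime = 7 − 7 × 0.5 days to 7 + 7 × 0.5 days, that is, 3.5 to 10.5 days.

For the detailed logic, including the check for whether any files have been modified, see the RatioBasedCompactionPolicy#isMajorCompaction method.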
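The two calculations above can be worked through in a few lines of Python. The 7-day period and the 0.5 jitter come from the text. The wake frequency of 10,000 ms and the multiplier of 1,000 are the HBase defaults as I understand them, so treat them as assumptions and check your own hbase-site.xml. The function names are made up for this example.

```python
from datetime import timedelta


def checker_period(wake_frequency_ms, multiplier):
    """Period of the CompactionChecker: wakefrequency x multiplier."""
    return timedelta(milliseconds=wake_frequency_ms * multiplier)


def major_compaction_window(period_days, jitter):
    """Range in which the next major compaction time (mcTime) can fall."""
    delta = period_days * jitter
    return (period_days - delta, period_days + delta)


# Assumed defaults: wakefrequency = 10,000 ms, multiplier = 1,000.
print(checker_period(10_000, 1_000))    # 2:46:40
# Values from the text: 7-day period, jitter 0.5.
print(major_compaction_window(7, 0.5))  # (3.5, 10.5)
```

With these values the checker wakes up roughly every 2 hours 47 minutes, and an automatic Major Compaction lands somewhere between 3.5 and 10.5 days after the previous one.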

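5.3 Split

The screenshot above shows that the table has only one region, so all writes land on a single server. That falls far short of what the HBase cluster can deliver, so split the region manually.

[Figure: splitting the region through the HBase UI]

After the split:

[Figure: the table after the region split]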

VI. Managing Streaming Tasks

This is the final part of the Spark Streaming blog series. It mainly describes how I divide up Spark Streaming tasks, and it also covers an email-based monitor for Spark Streaming tasks.

6.1 Dividing Streaming Tasks

Once a Spark Streaming job has been developed and tested, it goes live. How the Spark Streaming tasks are divided, and how large the time window should be, are both decided by the business workload.

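Each Kafka topic corresponds to one table in HBase.

Each Kafka topic has between 3 and 5 partitions.

Which topics should a given Streaming consumer handle? Why divide them this way, and what does the division gain?

Because each Kafka topic maps to a specific business table in HBase, the insert traffic of each HBase table can be monitored to see how that table is being written.

By insert pattern, the HBase tables fall into five categories: very high insert volume; many rows, each of them small; infrequent inserts that are large each time; fairly uniform; and few, infrequent inserts.

Very high insert volume: a table that alone accounts for 10% or 20% of total inserts is separated out, with one streaming consumer dedicated to that one table.

Many rows, each of them small: tables with frequent inserts can be grouped together, and timeWindow can be reduced.

Infrequent inserts that are large each time: the inserts arrive 1,000 or 2,000 rows at a time with gaps in between, so increase the timeWindow interval and set maxRatePerPartition somewhat higher.

Fairly uniform: this is the easy case, and the parameters are simple to choose.

Few, infrequent inserts: increase timeWindow to several seconds, and keep increasing it if the inserts are very sparse.

The benefits should be clear by now: resources are used sensibly, and streaming is tuned per table, with timeWindow and maxRatePerPartition matched to each table, which both raises and controls concurrency.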
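The division described above can be sketched as a small planning function. This is a hypothetical illustration rather than the code used in the project: the `TableLoad` fields, the pattern labels, and the 10% cutoff for a dedicated consumer are assumptions chosen to match the text, and the tuning hints only record the direction of each adjustment, not concrete values.

```python
from dataclasses import dataclass

# Direction of tuning per insert pattern, following the categories above.
TUNING = {
    "frequent_small": "decrease timeWindow",
    "bursty": "increase timeWindow and raise maxRatePerPartition",
    "uniform": "no special tuning, choose parameters directly",
    "sparse": "increase timeWindow to several seconds or more",
}


@dataclass
class TableLoad:
    topic: str               # Kafka topic, one per HBase table
    share_of_inserts: float  # fraction of all inserted rows, 0.0 to 1.0
    pattern: str             # one of the keys in TUNING


def plan_consumers(tables, dedicated_share=0.10):
    """Group topics into streaming consumers.

    A table whose share of inserts reaches dedicated_share gets its own
    consumer; the rest share one consumer per insert pattern.
    """
    plan = {}
    for table in tables:
        if table.share_of_inserts >= dedicated_share:
            group = f"dedicated:{table.topic}"
        else:
            group = f"shared:{table.pattern}"
        plan.setdefault(group, []).append(table.topic)
    return plan


tables = [
    TableLoad("orders", 0.20, "uniform"),
    TableLoad("clicks", 0.03, "frequent_small"),
    TableLoad("views", 0.02, "frequent_small"),
    TableLoad("audit", 0.01, "sparse"),
]
plan = plan_consumers(tables)
```

Here `orders` gets a consumer of its own, `clicks` and `views` share one, and `audit` sits in a low-frequency consumer whose window can be widened.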

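6.2 Monitoring Streaming Tasks

For monitoring Spark Streaming jobs, the built-in Streaming UI shows details such as throughput and timing, but it has no notification mechanism, so I built a simple one. I considered quite a few approaches, such as monitoring the pid, monitoring through shell scripts, or calling methods in the Spark source code directly. I tried them all. Some did not achieve the expected result, and others were hard to maintain and costly to develop.

In the end I chose a fairly simple approach that still works reasonably well: a Python crawler fetches the details from the original Streaming UI pages, and an email is sent when a threshold is reached. The overall steps are:

1. Use the job name to find the ApplicationMaster URL on the YARN page at port 8088, /cluster/apps/RUNNING.

2. Through that URL, open the Streaming page and monitor the job's Scheduling Delay and Processing Time values.

[Figure: the YARN page at port 8088, /cluster/apps/RUNNING]

The code:

[Figure: Python monitoring crawler with email notification]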
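As an illustration of the steps above, here is a minimal Python sketch of such a monitor. It is not the author's script. Instead of scraping the HTML page, it reads the same list of running applications from the YARN ResourceManager REST endpoint `/ws/v1/cluster/apps?states=RUNNING`, which is a substitution on my part. How the Scheduling Delay and Processing Time values are extracted from the Streaming page depends on the Spark version and is left out, so they are passed in as plain numbers. All function names, thresholds, addresses, and the SMTP host are hypothetical.

```python
import json
import smtplib
import urllib.request
from email.message import EmailMessage


def fetch_running_apps(rm_url):
    """Return the running applications reported by the YARN ResourceManager."""
    url = f"{rm_url}/ws/v1/cluster/apps?states=RUNNING"
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    # "apps" may be null when nothing is running.
    return (payload.get("apps") or {}).get("app", [])


def find_tracking_url(apps, job_name):
    """Pick the tracking (ApplicationMaster) URL of the app with this name."""
    for app in apps:
        if app.get("name") == job_name:
            return app.get("trackingUrl")
    return None


def check_thresholds(scheduling_delay_ms, processing_time_ms,
                     max_delay_ms, max_processing_ms):
    """Return a list of human-readable threshold violations."""
    problems = []
    if scheduling_delay_ms > max_delay_ms:
        problems.append(
            f"Scheduling Delay {scheduling_delay_ms} ms > {max_delay_ms} ms")
    if processing_time_ms > max_processing_ms:
        problems.append(
            f"Processing Time {processing_time_ms} ms > {max_processing_ms} ms")
    return problems


def build_alert(job_name, problems, sender, recipient):
    """Build the notification email for a job that crossed a threshold."""
    msg = EmailMessage()
    msg["Subject"] = f"[Streaming alert] {job_name}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(problems))
    return msg


def send_alert(msg, smtp_host):
    """Send the alert through a plain SMTP relay (assumed to need no login)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

A scheduler such as cron could call `fetch_running_apps`, look up the job with `find_tracking_url`, read the two metrics from the Streaming page behind that URL, and call `send_alert` only when `check_thresholds` returns a non-empty list. A job name that is missing from the running list is itself worth an alert, since it usually means the job has died.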

   