Äú¿ÉÒÔ¾èÖú£¬Ö§³ÖÎÒÃǵĹ«ÒæÊÂÒµ¡£

1Ôª 10Ôª 50Ôª





ÈÏÖ¤Â룺  ÑéÖ¤Âë,¿´²»Çå³þ?Çëµã»÷Ë¢ÐÂÑéÖ¤Âë ±ØÌî



  ÇóÖª ÎÄÕ ÎÄ¿â Lib ÊÓÆµ iPerson ¿Î³Ì ÈÏÖ¤ ×Éѯ ¹¤¾ß ½²×ù Modeler   Code  
»áÔ±   
 
   
 
 
     
   
 ¶©ÔÄ
  ¾èÖú
»ùÓÚELK StackºÍSpark StreamingµÄÈÕÖ¾´¦ÀíÆ½Ì¨Éè¼Æ
 
×÷ÕߣºÕŲ¨,ÕÅÒã,Àî¶° À´Ô´£ºIBM ·¢²¼ÓÚ 2016-1-20
  4595  次浏览      27
 

±¾ÎÄͨ¹ý¶Ô ELK Stack¡¢Kafka¡¢Spark Streaming ÕûºÏ·½°¸µÄ½éÉÜÃèÊöÁËϵͳƽ̨ÈÕÖ¾´¦ÀíÁ÷³Ì£¬Ï£Íû¶ÔϵͳÔËά¹¤³Ìʦ¡¢Êý¾Ý¿â¹¤³ÌʦÔÚʵÏÖÊý¾Ýƽ̨¼¯ÖÐʽÔËά¹¤×÷ÖÐÓÐËù°ïÖú£¬¶ÁÕß»¹¿ÉÒԲο¼ÎÄÕÂĩβ¸ø³öµÄ»¥ÁªÍø¹«Ë¾ÔÚϵͳÔËά¡¢ÈÕÖ¾´¦Àí¹¤×÷ÖеÄʵ¼ù¾­Ñ飬½áºÏ×ÔÉíÇé¿öÉè¼Æ×Ô¼ºµÄ·½°¸¡£

¸ÅÊö

´óÊý¾Ýʱ´ú£¬Ëæ×ÅÊý¾ÝÁ¿²»¶ÏÔö³¤£¬´æ´¢Óë¼ÆË㼯ȺµÄ¹æÄ£Ò²Öð½¥À©´ó£¬¼¸°ÙÉÏǧ̨µÄÔÆ¼ÆËã»·¾³ÒѲ»Ïʼû¡£ÏÖÔڵļ¯ÈºËùÐèÒª½â¾öµÄÎÊÌâ²»½ö½öÊǸßÐÔÄÜ¡¢¸ß¿É¿¿ÐÔ¡¢¸ß¿ÉÀ©Õ¹ÐÔ£¬»¹ÐèÒªÃæ¶ÔÒ×ά»¤ÐÔÒÔ¼°Êý¾Ýƽ̨ÄÚ²¿µÄÊý¾Ý¹²ÏíÐÔµÈÖî¶àÌôÕ½¡£ÓÅÐãµÄϵͳÔËάƽ̨¼ÈÄÜʵÏÖÊý¾Ýƽ̨¸÷×é¼þµÄ¼¯ÖÐʽ¹ÜÀí¡¢·½±ãϵͳÔËάÈËÔ±ÈÕ³£¼à²â¡¢ÌáÉýÔËάЧÂÊ£¬ÓÖÄÜ·´À¡ÏµÍ³ÔËÐÐ״̬¸øÏµÍ³¿ª·¢ÈËÔ±¡£ÀýÈç²É¼¯Êý¾Ý²Ö¿âµÄÈÕÖ¾¿ÉÒÔ°´ÕÕʱ¼äÐòÁв鿴¸÷Êý¾Ý¿âʵÀý¸÷ÖÖ¼¶±ðµÄÈÕÖ¾ÊýÁ¿ÓëÕ¼±È£¬²É¼¯ DB2 ±í¿Õ¼äÊý¾Ý·ÖÎö¿ÉµÃµ½Êý¾Ý¿â¼¯Èº½¡¿µ×´Ì¬£¬·ÖÎöÓ¦Ó÷þÎñÆ÷µÄÈÕÖ¾¿ÉÒԲ鿴³ö´í×î¶àµÄÄ£¿é¡¢ÏÂÔØ×î¶àµÄÎļþ¡¢Ê¹ÓÃ×î¶àµÄ¹¦Äܵȡ£´óÊý¾Ýʱ´úµÄÒµÎñÓëÔËά½«½ôÃܵĽáºÏÔÚÒ»Æð¡£

ÈÕÖ¾

1. ʲôÊÇÈÕÖ¾

ÈÕÖ¾ÊÇ´øÊ±¼ä´ÁµÄ»ùÓÚʱ¼äÐòÁеĻúÆ÷Êý¾Ý£¬°üÀ¨ IT ϵͳÐÅÏ¢£¨·þÎñÆ÷¡¢ÍøÂçÉ豸¡¢²Ù×÷ϵͳ¡¢Ó¦ÓÃÈí¼þ£©¡¢ÎïÁªÍø¸÷ÖÖ´«¸ÐÆ÷ÐÅÏ¢¡£ÈÕÖ¾¿ÉÒÔ·´Ó³Óû§Êµ¼ÊÐÐΪ£¬ÊÇÕæÊµµÄÊý¾Ý¡£

2. ÈÕÖ¾´¦Àí·½°¸Ñݽø

ͼ 1. ÈÕÖ¾´¦Àí·½°¸¾­ÀúµÄ°æ±¾µü´ú

ÈÕÖ¾´¦Àí v1.0£ºÈÕ־ûÓм¯ÖÐʽ´¦Àí£»Ö»×öʺó×·²é£¬ºÚ¿ÍÈëÇÖºóɾ³ýÈÕÖ¾ÎÞ·¨²ì¾õ£»Ê¹ÓÃÊý¾Ý¿â´æ´¢ÈÕÖ¾£¬ÎÞ·¨Ê¤Èθ´ÔÓÊÂÎñ´¦Àí¡£

ÈÕÖ¾´¦Àí v2.0£ºÊ¹Óà Hadoop ƽ̨ʵÏÖÈÕÖ¾ÀëÏßÅú´¦Àí£¬È±µãÊÇʵʱÐԲʹÓà Storm Á÷´¦Àí¿ò¼Ü¡¢Spark ÄÚ´æ¼ÆËã¿ò¼Ü´¦ÀíÈÕÖ¾£¬µ« Hadoop/Storm/Spark ¶¼ÊDZà³Ì¿ò¼Ü£¬²¢²»ÊÇÄÃÀ´¼´ÓÃµÄÆ½Ì¨¡£

ÈÕÖ¾´¦Àí v3.0£ºÊ¹ÓÃÈÕ־ʵʱËÑË÷ÒýÇæ·ÖÎöÈÕÖ¾£¬Ìص㣺µÚÒ»Êǿ죬ÈÕÖ¾´Ó²úÉúµ½ËÑË÷·ÖÎö³ö½á¹ûÖ»ÓÐÊýÃëÑÓʱ£»µÚ¶þÊÇ´ó£¬Ã¿Ìì´¦Àí TB ÈÕÖ¾Á¿£»µÚÈýÊÇÁé»î£¬¿ÉËÑË÷·ÖÎöÈκÎÈÕÖ¾¡£×÷Ϊ´ú±íµÄ½â¾ö·½°¸ÓÐ Splunk¡¢ELK¡¢SILK¡£

ͼ 2. Éî¶ÈÕûºÏ ELK¡¢Spark¡¢Hadoop ¹¹½¨ÈÕÖ¾·ÖÎöϵͳ

ELK Stack

ELK Stack ÊÇ¿ªÔ´ÈÕÖ¾´¦ÀíÆ½Ì¨½â¾ö·½°¸£¬±³ºóµÄÉÌÒµ¹«Ë¾ÊÇ Elastic(https://www.elastic.co/)¡£ËüÓÉÈÕÖ¾²É¼¯½âÎö¹¤¾ß Logstash¡¢»ùÓÚ Lucene µÄÈ«ÎÄËÑË÷ÒýÇæ Elasticsearch¡¢·ÖÎö¿ÉÊÓ»¯Æ½Ì¨ Kibana ×é³É¡£Ä¿Ç° ELK µÄÓû§ÓÐ Adobe¡¢Microsoft¡¢Mozilla¡¢Facebook¡¢Stackoverflow¡¢Cisco¡¢ebay¡¢Uber µÈÖî¶àÖªÃû³§ÉÌ¡£

1. Logstash

Logstash ÊÇÒ»ÖÖ¹¦ÄÜÇ¿´óµÄÐÅÏ¢²É¼¯¹¤¾ß£¬ÀàËÆÓÚ Hadoop Éú̬ȦÀïµÄ Flume¡£Í¨³£ÔÚÆäÅäÖÃÎļþ¹æ¶¨ Logstash ÈçºÎ´¦Àí¸÷ÖÖÀàÐ͵ÄʼþÁ÷£¬Ò»°ã°üº¬ input¡¢filter¡¢output Èý¸ö²¿·Ö¡£Logstash Ϊ¸÷¸ö²¿·ÖÌṩÏàÓ¦µÄ²å¼þ£¬Òò¶øÓÐ input¡¢filter¡¢output ÈýÀà²å¼þÍê³É¸÷ÖÖ´¦ÀíºÍת»»£»ÁíÍâ codec ÀàµÄ²å¼þ¿ÉÒÔ·ÅÔÚ input ºÍ output ²¿·Öͨ¹ý¼òµ¥±àÂëÀ´¼ò»¯´¦Àí¹ý³Ì¡£ÏÂÃæÒÔ DB2 µÄÒ»ÌõÈÕ־ΪÀý¡£

ͼ 3.DB2 Êý¾Ý¿â²úÉúµÄ°ë½á¹¹»¯ÈÕÖ¾ÑùÀý

ÕâÊÇÒ»ÖÖ¶àÐеÄÈÕÖ¾£¬Ã¿Ò»ÌõÈÕÖ¾ÒÔ£º¡°2014-10-19-12.19.46.033037-300¡±¸ñʽµÄʱ¼ä´ÁΪÆðʼ±êÖ¾¡£¿ÉÒÔÔÚ input ²¿·ÖÒýÈë codec ²å¼þ multiline£¬À´½«Ò»ÌõÈÕÖ¾µÄ¶àÐÐÎı¾·â×°µ½Ò»ÌõÏûÏ¢£¨message£©ÖС£

input {
file {
path => "path/to/filename"
codec => multiline {
pattern => "^\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6}[\+-]\d{3}"
negate => true
what => previous
}
}
}

ʹÓà file ²å¼þµ¼ÈëÎļþÐÎʽµÄÈÕÖ¾£¬¶øÇ¶ÈëµÄ codec ²å¼þ multiline µÄ²ÎÊý pattern ¾Í¹æ¶¨ÁË·Ö¸îÈÕÖ¾ÌõÄ¿µÄʱ¼ä´Á¸ñʽ¡£ÔÚ DataStage ¶àÐÐÈÕÖ¾µÄʵ¼ÊÓ¦ÓÃÖУ¬ÓÐʱһÌõÈÕÖ¾»á³¬¹ý 500 ÐУ¬Õⳬ³öÁË multiline ×é¼þĬÈϵÄʼþ·â×°µÄ×î´óÐÐÊý£¬ÕâÐèÒªÎÒÃÇÔÚ multiline ÖÐÉèÖà max_lines ÊôÐÔ¡£

¾­¹ý input ²¿·Ö¶ÁÈëÔ¤´¦ÀíºóµÄÊý¾ÝÁ÷Èë filter ²¿·Ö£¬ÆäʹÓà grok¡¢mutate µÈ²å¼þÀ´¹ýÂËÎı¾ºÍÆ¥Åä×ֶΣ¬²¢ÇÒÎÒÃÇ×Ô¼º¿ÉÒÔΪʼþÁ÷Ìí¼Ó¶îÍâµÄ×Ö¶ÎÐÅÏ¢£º

filter {
mutate{
gsub => ['message', "\n", " "]
}
grok {
match => { "message" =>
"(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY}-%{HOUR}\.%{MINUTE}\.%{SECOND}) %{INT:timezone}(?:%{SPACE}%{WORD:recordid}%{SPACE})(?:LEVEL%{SPACE}:%{SPACE} %{DATA:level}%{SPACE})(?:PID%{SPACE}:%{SPACE}%{INT:processid}%{SPACE})(?:TID%{SPACE}:%{SPACE} %{INT:threadid}%{SPACE})(?:PROC%{SPACE}:%{SPACE}%{DATA:process}%{SPACE})?(?:INSTANCE%{SPACE}: %{SPACE}%{WORD:instance}%{SPACE})?(?:NODE%{SPACE}:%{SPACE}%{WORD:node}%{SPACE})?(?:DB%{SPACE} :%{SPACE}%{WORD:dbname}%{SPACE})?(?:APPHDL%{SPACE}:%{SPACE} %{NOTSPACE:apphdl}%{SPACE})?(?:APPID%{SPACE}:%{SPACE}%{NOTSPACE:appid} %{SPACE})?(?:AUTHID%{SPACE}:%{SPACE}%{WORD:authid}%{SPACE})?(?:HOSTNAME%{SPACE}: %{SPACE}%{HOSTNAME:hostname}%{SPACE})?(?:EDUID%{SPACE}:%{SPACE}%{INT:eduid} %{SPACE})?(?:EDUNAME%{SPACE}:%{SPACE}%{DATA:eduname}%{SPACE})?(?:FUNCTION%{SPACE}: %{SPACE}%{DATA:function}%{SPACE})(?:probe:%{SPACE}%{INT:probe}%{SPACE})%{GREEDYDATA:functionlog}"
}
}
date {
match => [ "timestamp", "YYYY-MM-dd-HH.mm.ss.SSSSSS" ]
}
}

Ç°Ãæ input ²¿·ÖµÄ multiline ²å¼þ½«Ò»Ìõ¶àÐÐÈÕÖ¾Ïîת»¯ÎªÒ»ÐУ¬²¢ÒÔ¡°\n¡±Ìæ´úʵ¼ÊµÄ»»Ðзû¡£ÎªÁ˱ãÓÚºóÃæ´¦Àí£¬ÕâÀïµÄ mutate ²å¼þ¾ÍÊǽ«ÕâЩ¡°\n¡±Ì滻Ϊ¿Õ¸ñ¡£¶ø grok ²å¼þÓÃÓÚÆ¥ÅäÌáÈ¡ÈÕÖ¾ÏîÖÐÓÐÒâÒåµÄ×Ö¶ÎÐÅÏ¢¡£×îºóµÄ date ²å¼þÔòÊǰ´¸ñʽ¡°YYYY-MM-dd-HH.mm.ss.SSSSSS¡±½âÎöÌáÈ¡µÄʱ¼ä´Á×ֶΣ¬²¢¸³¸øÏµÍ³Ä¬ÈϵÄʱ¼ä´Á×ֶΡ°@timestamp¡±¡£Output ²å¼þÓÃÓÚÖ¸¶¨Ê¼þÁ÷µÄÈ¥Ïò£¬¿ÉÒÔÊÇÏûÏ¢¶ÓÁС¢È«ÎÄËÑË÷ÒýÇæ¡¢TCP Socket¡¢Email µÈ¼¸Ê®ÖÖÄ¿±ê¶Ë¡£

2. Elasticsearch

Elasticsearch ÊÇ»ùÓÚ Lucene µÄ½üʵʱËÑË÷ƽ̨£¬ËüÄÜÔÚÒ»ÃëÄÚ·µ»ØÄãÒª²éÕÒµÄÇÒÒѾ­ÔÚ Elasticsearch ×öÁËË÷ÒýµÄÎĵµ¡£ËüĬÈÏ»ùÓÚ Gossip ·ÓÉËã·¨µÄ×Ô¶¯·¢ÏÖ»úÖÆ¹¹½¨ÅäÖÃÓÐÏàͬ cluster name µÄ¼¯Èº£¬µ«ÊÇÓеÄʱºòÕâÖÖ»úÖÆ²¢²»¿É¿¿£¬»á·¢ÉúÄÔÁÑÏÖÏó¡£¼øÓÚÖ÷¶¯·¢ÏÖ»úÖÆµÄ²»Îȶ¨ÐÔ£¬Óû§¿ÉÒÔÑ¡ÔñÔÚÿһ¸ö½ÚµãÉÏÅäÖü¯ÈºÆäËû½ÚµãµÄÖ÷»úÃû£¬ÔÚÆô¶¯¼¯ÈºÊ±½øÐб»¶¯·¢ÏÖ¡£

Elasticsearch ÖÐµÄ Index ÊÇÒ»×é¾ßÓÐÏàËÆÌØÕ÷µÄÎĵµ¼¯ºÏ£¬ÀàËÆÓÚ¹ØÏµÊý¾Ý¿âÄ£ÐÍÖеÄÊý¾Ý¿âʵÀý£¬Index ÖпÉÒÔÖ¸¶¨ Type Çø·Ö²»Í¬µÄÎĵµ£¬ÀàËÆÓÚÊý¾Ý¿âʵÀýÖеĹØÏµ±í£¬Document ÊÇ´æ´¢µÄ»ù±¾µ¥Î»£¬¶¼ÊÇ JSON ¸ñʽ£¬ÀàËÆÓÚ¹ØÏµ±íÖÐÐм¶¶ÔÏó¡£ÎÒÃÇ´¦ÀíºóµÄ JSON Îĵµ¸ñʽµÄÈÕÖ¾¶¼ÒªÔÚ Elasticsearch ÖÐ×öË÷Òý£¬ÏàÓ¦µÄ Logstash ÓÐ Elasticsearch output ²å¼þ£¬¶ÔÓÚÓû§ÊÇ͸Ã÷µÄ¡£

Hadoop Éú̬ȦΪ´ó¹æÄ£Êý¾Ý¼¯µÄ´¦ÀíÌṩ¶àÖÖ·ÖÎö¹¦ÄÜ£¬µ«ÊµÊ±ËÑË÷Ò»Ö±ÊÇ Hadoop µÄÈíÀß¡£Èç½ñ£¬Elasticsearch for Apache Hadoop£¨ES-Hadoop£©ÃÖ²¹ÁËÕâһȱÏÝ£¬ÎªÓû§ÕûºÏÁË Hadoop µÄ´óÊý¾Ý·ÖÎöÄÜÁ¦ÒÔ¼° Elasticsearch µÄʵʱËÑË÷ÄÜÁ¦.

ͼ 4. Ó¦Óà es-hadoop ÕûºÏ Hadoop Ecosystem Óë Elasticsearch ¼Ü¹¹Í¼£¨https://www.elastic.co/products/hadoop£©

3. Kibana

Kibana ÊÇרÃÅÉè¼ÆÓÃÀ´Óë Elasticsearch Э×÷µÄ£¬¿ÉÒÔ×Ô¶¨Òå¶àÖÖ±í¸ñ¡¢Öù״ͼ¡¢±ý״ͼ¡¢ÕÛÏßͼ¶Ô´æ´¢ÔÚ Elasticsearch ÖеÄÊý¾Ý½øÐÐÉîÈëÍÚ¾ò·ÖÎöÓë¿ÉÊÓ»¯¡£ÏÂͼ¶¨ÖƵÄÒDZíÅÌ¿ÉÒÔ¶¯Ì¬¼à²âÊý¾Ý¿â¼¯ÈºÖÐÿ¸öÊý¾Ý¿âʵÀý²úÉúµÄ¸÷ÖÖ¼¶±ðµÄÈÕÖ¾¡£

ͼ 5. ʵʱ¼à²â DB2 ʵÀýÔËÐÐ״̬µÄ¶¯Ì¬ÒDZíÅÌ

Kafka

Kafka ÊÇ LinkedIn ¿ªÔ´µÄ·Ö²¼Ê½ÏûÏ¢¶ÓÁУ¬Ëü²ÉÓÃÁ˶ÀÌØµÄÏû·ÑÕß-Éú²úÕ߼ܹ¹ÊµÏÖÊý¾Ýƽ̨¸÷×é¼þ¼äµÄÊý¾Ý¹²Ïí¡£¼¯Èº¸ÅÄîÖÐµÄ server ÔÚ Kafka ÖгÆÖ®Îª broker£¬ËüʹÓÃÖ÷Ìâ¹ÜÀí²»Í¬Àà±ðµÄÊý¾Ý£¬±ÈÈç DB2 ÈÕÖ¾¹éΪһ¸öÖ÷Ì⣬tomcat ÈÕÖ¾¹éΪһ¸öÖ÷Ìâ¡£ÎÒÃÇʹÓà Logstash ×÷Ϊ Kafka ÏûÏ¢µÄÉú²úÕßʱ£¬output ²å¼þ¾ÍÐèÒªÅäÖúà Kafka broker µÄÁÐ±í£¬Ò²¾ÍÊÇ Kafka ¼¯ÈºÖ÷»úµÄÁÐ±í£»ÏàÓ¦µÄ£¬ÓÃ×÷ Kafka Ïû·ÑÕß½ÇÉ«µÄ Logstash µÄ input ²å¼þ¾ÍÒªÅäÖúÃÐèÒª¶©ÔÄµÄ Kafka ÖеÄÖ÷ÌâÃû³ÆºÍ ZooKeeper Ö÷»úÁÐ±í¡£Kafka ͨ¹ý½«Êý¾Ý³Ö¾Ã»¯µ½Ó²ÅÌµÄ Write Ahead Log£¨WAL£©±£Ö¤Êý¾Ý¿É¿¿ÐÔÓë˳ÐòÐÔ£¬µ«Õâ²¢²»»áÓ°ÏìʵʱÊý¾ÝµÄ´«ÊäËÙ¶È£¬ÊµÊ±Êý¾ÝÈÔÊÇͨ¹ýÄÚ´æ´«ÊäµÄ¡£Kafka ÊÇÒÀÀµÓÚ ZooKeeper µÄ£¬Ëü½«Ã¿×éÏû·ÑÕßÏû·ÑµÄÏàÓ¦ topic µÄÆ«ÒÆÁ¿±£´æÔÚ ZooKeeper ÖС£¾Ý³Æ LinkedIn ÄÚ²¿µÄ Kafka ¼¯ÈºÃ¿ÌìÒÑÄÜ´¦Àí³¬¹ý 1 ÍòÒÚÌõÏûÏ¢¡£

ͼ 6. »ùÓÚÏûÏ¢¶©ÔÄ»úÖÆµÄ Kafka ¼Ü¹¹

³ýÁ˿ɿ¿ÐԺͶÀÌØµÄ push&pull ¼Ü¹¹Í⣬Ïà½ÏÓÚÆäËûÏûÏ¢¶ÓÁУ¬Kafka »¹ÓµÓиü´óµÄÍÌÍÂÁ¿£º

ͼ 7. »ùÓÚÏûÏ¢³Ö¾Ã»¯»úÖÆµÄÏûÏ¢¶ÓÁÐÍÌÍÂÁ¿±È½Ï

Spark Streaming

Spark ÓɼÓÖÝ´óѧ²®¿ËÀû·ÖУ AMP ʵÑéÊÒ (Algorithms, Machines, and People Lab) ¿ª·¢£¬¿ÉÓÃÀ´¹¹½¨´óÐ͵ġ¢µÍÑÓ³ÙµÄÊý¾Ý·ÖÎöÓ¦ÓóÌÐò¡£Ëü½«Åú´¦Àí¡¢Á÷´¦Àí¡¢¼´Ï¯²éѯÈÚΪһÌå¡£Spark ÉçÇøÒ²ÊÇÏ൱»ð±¬£¬Æ½¾ùÿÈý¸öÔµü´úÒ»´Î°æ±¾¸üÊÇÌåÏÖÁËËüÔÚ´óÊý¾Ý´¦ÀíÁìÓòµÄµØÎ»¡£

Spark Streaming ²»Í¬ÓÚ Storm£¬Storm ÊÇ»ùÓÚʼþ¼¶±ðµÄÁ÷´¦Àí£¬Spark Streaming ÊÇ mini-batch ÐÎʽµÄ½üËÆÁ÷´¦ÀíµÄ΢ÐÍÅú´¦Àí¡£Spark Streaming ÌṩÁËÁ½ÖÖ´Ó Kafka ÖлñÈ¡ÏûÏ¢µÄ·½Ê½£º

µÚÒ»ÖÖÊÇÀûÓà Kafka Ïû·ÑÕ߸߼¶ API ÔÚ Spark µÄ¹¤×÷½ÚµãÉÏ´´½¨Ïû·ÑÕßỊ̈߳¬¶©ÔÄ Kafka ÖеÄÏûÏ¢£¬Êý¾Ý»á´«Êäµ½ Spark ¹¤×÷½ÚµãµÄÖ´ÐÐÆ÷ÖУ¬µ«ÊÇĬÈÏÅäÖÃÏÂÕâÖÖ·½·¨ÔÚ Spark Job ³ö´íʱ»áµ¼ÖÂÊý¾Ý¶ªÊ§£¬Èç¹ûÒª±£Ö¤Êý¾Ý¿É¿¿ÐÔ£¬ÐèÒªÔÚ Spark Streaming ÖпªÆôWrite Ahead Logs£¨WAL£©£¬Ò²¾ÍÊÇÉÏÎÄÌáµ½µÄ Kafka ÓÃÀ´±£Ö¤Êý¾Ý¿É¿¿ÐÔºÍÒ»ÖÂÐÔµÄÊý¾Ý±£´æ·½Ê½¡£¿ÉÒÔÑ¡ÔñÈà Spark ³ÌÐò°Ñ WAL ±£´æÔÚ·Ö²¼Ê½Îļþϵͳ£¨±ÈÈç HDFS£©ÖС£

µÚ¶þÖÖ·½Ê½²»ÐèÒª½¨Á¢Ïû·ÑÕßỊ̈߳¬Ê¹Óà createDirectStream ½Ó¿ÚÖ±½ÓÈ¥¶ÁÈ¡ Kafka µÄ WAL£¬½« Kafka ·ÖÇøÓë RDD ·ÖÇø×öÒ»¶ÔÒ»Ó³É䣬Ïà½ÏÓÚµÚÒ»ÖÖ·½·¨£¬²»ÐèÔÙά»¤Ò»·Ý WAL Êý¾Ý£¬Ìá¸ßÁËÐÔÄÜ¡£¶ÁÈ¡Êý¾ÝµÄÆ«ÒÆÁ¿ÓÉ Spark Streaming ³ÌÐòͨ¹ý¼ì²éµã»úÖÆ×ÔÉí´¦Àí£¬±ÜÃâÔÚ³ÌÐò³ö´íµÄÇé¿öÏÂÖØÏÖµÚÒ»ÖÖ·½·¨Öظ´¶ÁÈ¡Êý¾ÝµÄÇé¿ö£¬Ïû³ýÁË Spark Streaming Óë ZooKeeper/Kafka Êý¾Ý²»Ò»ÖµķçÏÕ¡£±£Ö¤Ã¿ÌõÏûÏ¢Ö»»á±» Spark Streaming ´¦ÀíÒ»´Î¡£ÒÔÏ´úÂëÆ¬Í¨¹ýµÚ¶þÖÖ·½Ê½¶ÁÈ¡ Kafka ÖеÄÊý¾Ý£º

// Create direct kafka stream with brokers and topics
JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
jssc,
String.class,
String.class,
StringDecoder.class,
StringDecoder.class,
kafkaParams,
topicsSet);
messages.foreachRDD(new Function<JavaPairRDD<String,String>,Void>(){
public Void call(JavaPairRDD<String, String> v1)
throws Exception {
v1.foreach(new VoidFunction<Tuple2<String, String>>(){
public void call(Tuple2<String, String> tuple2) {
try{
JSONObject a = new JSONObject(tuple2._2);
...

Spark Streaming »ñÈ¡µ½ÏûÏ¢ºó±ã¿ÉÒÔͨ¹ý Tuple ¶ÔÏó×Ô¶¨Òå²Ù×÷ÏûÏ¢£¬ÈçÏÂͼÊÇÕë¶Ô DB2 Êý¾Ý¿âÈÕÖ¾µÄÓʼþ¸æ¾¯£¬Éú³É¸æ¾¯Óʼþ·¢Ë͵½ Notes ÓÊÏ䣺

ͼ 8. »ùÓÚ Spark Streaming ¶Ô DB2 Òì³£ÈÕ־ʵÏÖ Notes Óʼþ¸æ¾¯

»¥ÁªÍøÐÐÒµÈÕÖ¾´¦Àí·½°¸¾ÙÀý½éÉÜÓëÓ¦ÓÃ

1. ÐÂÀË

ÐÂÀ˲ÉÓõļ¼Êõ¼Ü¹¹Êdz£¼ûµÄ Kafka ÕûºÏ ELK Stack ·½°¸¡£Kafka ×÷ΪÏûÏ¢¶ÓÁÐÓÃÀ´»º´æÓû§ÈÕÖ¾£»Ê¹Óà Logstash ×öÈÕÖ¾½âÎö£¬Í³Ò»³É JSON ¸ñʽÊä³ö¸ø Elasticsearch£»Ê¹Óà Elasticsearch ÌṩʵʱÈÕÖ¾·ÖÎöÓëÇ¿´óµÄËÑË÷ºÍͳ¼Æ·þÎñ£»Kibana ÓÃ×÷Êý¾Ý¿ÉÊÓ»¯×é¼þ¡£¸Ã¼¼Êõ¼Ü¹¹Ä¿Ç°·þÎñµÄÓû§°üÀ¨Î¢²©¡¢Î¢ÅÌ¡¢ÔÆ´æ´¢¡¢µ¯ÐÔ¼ÆËãÆ½Ì¨µÈÊ®¶à¸ö²¿ÃŵĶà¸ö²úÆ·µÄÈÕÖ¾ËÑË÷·ÖÎöÒµÎñ£¬Ã¿Ìì´¦ÀíÔ¼ 32 ÒÚÌõ£¨2TB£©ÈÕÖ¾¡£

ÐÂÀ˵ÄÈÕÖ¾´¦ÀíÆ½Ì¨ÍÅ¶Ó¶Ô Elasticsearch ×öÁË´óÁ¿ÓÅ»¯£¨±ÈÈçµ÷Õû max open files µÈ£©£¬²¢ÇÒ¿ª·¢ÁËÒ»¸ö¶ÀÁ¢µÄ Elasticsearch Index ¹ÜÀíϵͳ£¬¸ºÔðË÷ÒýÈÕ³£Î¬»¤ÈÎÎñ£¨±ÈÈçË÷ÒýµÄ´´½¨¡¢ÓÅ»¯¡¢É¾³ý¡¢Óë·Ö²¼Ê½ÎļþϵͳµÄÊý¾Ý½»»»µÈ£©µÄµ÷¶È¼°Ö´ÐС£Îª Elasticsearch °²×°Á˹úÄÚÖÐÎķִʲå¼þ elasticsearch-analysis-ik£¬Âú×ã΢ÅÌËÑË÷¶ÔÖÐÎķִʵÄÐèÇó¡££¨¼û²Î¿¼×ÊÔ´ 2£©

2. ÌÚѶ

ÌÚѶÀ¶¾¨Êý¾Ýƽ̨¸æ¾¯ÏµÍ³µÄ¼¼Êõ¼Ü¹¹Í¬Ñù»ùÓÚ·Ö²¼Ê½ÏûÏ¢¶ÓÁкÍÈ«ÎÄËÑË÷ÒýÇæ¡£µ«ÌÚѶµÄ¸æ¾¯Æ½Ì¨²»½öÏÞÓÚ´Ë£¬ËüµÄ¸´ÔÓµÄÖ¸±êÊý¾Ýͳ¼ÆÈÎÎñͨ¹ýʹÓà Storm ×Ô¶¨ÒåÁ÷ʽ¼ÆËãÈÎÎñµÄ·½·¨ÊµÏÖ£¬Òì³£¼ì²âµÄʵÏÖÀûÓÃÁËÇúÏßµÄʱ¼äÖÜÆÚÐÔºÍÏà¹ØÇúÏßÖ®¼äµÄÏà¹ØÐÔÈ¥¶¨Ò嶯̬µÄãÐÖµ£¬²¢ÇÒ»ùÓÚ»úÆ÷ѧϰË㷨ʵÏÖÁ˸´ÔÓµÄÈÕÖ¾×Ô¶¯·ÖÀࣨ±ÈÈç summo logic£©¡£

¸æ¾¯Æ½Ì¨°Ñ²¦²â£¨¶¨Ê± curl Ò»ÏÂij¸ö url£¬ÓÐÎÊÌâ¾Í¸æ¾¯£©¡¢ÈÕÖ¾¼¯ÖмìË÷¡¢ÈÕÖ¾¸æ¾¯£¨5 ·ÖÖÓ Error ´óÓÚ X ´Î¸æ¾¯£©¡¢Ö¸±ê¸æ¾¯£¨cpu ʹÓÃÂÊ´óÓÚ X ¸æ¾¯£©ÕûºÏ½øÍ¬Ò»¸öÊý¾Ý¹ÜÏߣ¬¼ò»¯ÁËÕûÌåµÄ¼Ü¹¹¡££¨¼û²Î¿¼×ÊÔ´ 3£©

3. ÆßÅ£

ÆßÅ£²ÉÓõļ¼Êõ¼Ü¹¹Îª Flume£«Kafka£«Spark£¬»ì²¿ÔÚ 8 ̨¸ßÅä»úÆ÷¡£¸ù¾ÝÆßÅ£¼¼Êõ²©¿ÍÌṩµÄÊý¾Ý£¬¸ÃÈÕÖ¾´¦ÀíÆ½Ì¨Ã¿Ìì´¦Àí 500 ÒÚÌõÊý¾Ý£¬·åÖµ 80 Íò TPS¡£

Flume Ïà½ÏÓÚ Logstash Óиü´óµÄÍÌÍÂÁ¿£¬¶øÇÒÓë HDFS ÕûºÏµÄÐÔÄÜ±È Logstash Ç¿ºÜ¶à¡£ÆßÅ£¼¼Êõ¼Ü¹¹Ñ¡ÐÍÏÔÈ»¿¼ÂÇÁËÕâÒ»µã£¬ÆßÅ£ÔÆÆ½Ì¨µÄÈÕÖ¾Êý¾Ýµ½ Kafka ºó£¬Ò»Â·Í¬²½µ½ HDFS£¬ÓÃÓÚÀëÏßͳ¼Æ£¬Áíһ·ÓÃÓÚʹÓà Spark Streaming ½øÐÐʵʱ¼ÆË㣬¼ÆËã½á¹û±£´æÔÚ Mongodb ¼¯ÈºÖС£

Èκνâ¾ö·½°¸¶¼²»ÊÇʮȫʮÃÀµÄ£¬¾ßÌå²ÉÓÃÄÄЩ¼¼ÊõÒªÉîÈëÁ˽â×Ô¼ºµÄÓ¦Óó¡¾°¡£¾ÍĿǰÈÕÖ¾´¦ÀíÁìÓòµÄ¿ªÔ´×é¼þÀ´Ëµ£¬ÔÚÒÔϼ¸¸ö·½Ã滹±È½ÏǷȱ£º

Logstash µÄÄÚ²¿×´Ì¬»ñÈ¡²»µ½£¬Ä¿Ç°Ã»ÓкõijÉÊìµÄ¼à¿Ø·½°¸¡£

Elasticsearch ¾ßÓк£Á¿´æ´¢º£Á¿¾ÛºÏµÄÄÜÁ¦£¬µ«ÊÇͬ Mongodb Ò»Ñù£¬²¢²»ÊʺÏÓÚдÈëÊý¾Ý·Ç³£¶à£¨1 Íò TPS ÒÔÉÏ£©µÄ³¡¾°¡£

ȱ·¦ÕæÕýʵÓõÄÒì³£¼ì²â·½·¨£»ÊµÊ±Í³¼Æ·½ÃæÈ±·¦³ÉÊìµÄ½â¾ö·½°¸£¬Storm ¾ÍÊÇÒ»¸öµ×²ãµÄÖ´ÐÐÒýÇæ£¬¶ø Spark »¹È±ÉÙʱ¼ä´°¿ÚµÈ³éÏó¡£

¶ÔÓÚÈÕÖ¾×Ô¶¯·ÖÀ࣬»¹Ã»ÓпªÔ´¹¤¾ß¿ÉÒÔ×öµ½ summo logic ÄÇÑùµÄЧ¹û¡£

½áÊøÓï

´óÊý¾Ýʱ´úµÄÔËά¹ÜÀíÒâÒåÖØ´ó£¬ºÃµÄÈÕÖ¾´¦ÀíÆ½Ì¨¿ÉÒÔʰ빦±¶µÄÌáÉý¿ª·¢ÈËÔ±ºÍÔËάÈËÔ±µÄЧÂÊ¡£±¾ÎÄͨ¹ý¼òµ¥ÓÃÀý½éÉÜÁË ELK Stack¡¢Kafka ºÍ Spark Streaming ÔÚÈÕÖ¾´¦ÀíÆ½Ì¨Öи÷×ÔÔÚϵͳ¼Ü¹¹ÖеŦÄÜ¡£ÏÖʵÖÐÓ¦Óó¡¾°·±¶à¸´ÔÓ¡¢Êý¾ÝÐÎʽ¶àÖÖ¶àÑù£¬ÈÕÖ¾´¦Àí¹¤×÷²»ÊÇÒ»õí¶ø¾ÍµÄ£¬·ÖÎö´¦Àí¹ý³Ì»¹ÐèÒªÔÚʵ¼ùÖв»¶ÏÍÚ¾òºÍÓÅ»¯£¬±ÊÕßÒ²½«ÖÂÁ¦ÓÚ DB2 Êý¾Ý¿âÔËÐÐ״̬¸üϸ½ÚÊý¾ÝµÄÊÕ¼¯ºÍ¸üÈ«ÃæÏ¸ÖÂµÄ¼à¿Ø¡£

   
4595 ´Îä¯ÀÀ       27
Ïà¹ØÎÄÕÂ

»ùÓÚEAµÄÊý¾Ý¿â½¨Ä£
Êý¾ÝÁ÷½¨Ä££¨EAÖ¸ÄÏ£©
¡°Êý¾Ýºþ¡±£º¸ÅÄî¡¢ÌØÕ÷¡¢¼Ü¹¹Óë°¸Àý
ÔÚÏßÉ̳ÇÊý¾Ý¿âϵͳÉè¼Æ ˼·+Ч¹û
 
Ïà¹ØÎĵµ

GreenplumÊý¾Ý¿â»ù´¡Åàѵ
MySQL5.1ÐÔÄÜÓÅ»¯·½°¸
ijµçÉÌÊý¾ÝÖÐ̨¼Ü¹¹Êµ¼ù
MySQL¸ßÀ©Õ¹¼Ü¹¹Éè¼Æ
Ïà¹Ø¿Î³Ì

Êý¾ÝÖÎÀí¡¢Êý¾Ý¼Ü¹¹¼°Êý¾Ý±ê×¼
MongoDBʵս¿Î³Ì
²¢·¢¡¢´óÈÝÁ¿¡¢¸ßÐÔÄÜÊý¾Ý¿âÉè¼ÆÓëÓÅ»¯
PostgreSQLÊý¾Ý¿âʵսÅàѵ
×îл¼Æ»®
DeepSeekÔÚÈí¼þ²âÊÔÓ¦ÓÃʵ¼ù 4-12[ÔÚÏß]
DeepSeek´óÄ£ÐÍÓ¦Óÿª·¢Êµ¼ù 4-19[ÔÚÏß]
UAF¼Ü¹¹ÌåϵÓëʵ¼ù 4-11[±±¾©]
AIÖÇÄÜ»¯Èí¼þ²âÊÔ·½·¨Óëʵ¼ù 5-23[ÉϺ£]
»ùÓÚ UML ºÍEA½øÐзÖÎöÉè¼Æ 4-26[±±¾©]
ÒµÎñ¼Ü¹¹Éè¼ÆÓ뽨ģ 4-18[±±¾©]

MySQLË÷Òý±³ºóµÄÊý¾Ý½á¹¹
MySQLÐÔÄܵ÷ÓÅÓë¼Ü¹¹Éè¼Æ
SQL ServerÊý¾Ý¿â±¸·ÝÓë»Ö¸´
ÈÃÊý¾Ý¿â·ÉÆðÀ´ 10´óDB2ÓÅ»¯
oracleµÄÁÙʱ±í¿Õ¼äдÂú´ÅÅÌ
Êý¾Ý¿âµÄ¿çƽ̨Éè¼Æ

²¢·¢¡¢´óÈÝÁ¿¡¢¸ßÐÔÄÜÊý¾Ý¿â
¸ß¼¶Êý¾Ý¿â¼Ü¹¹Éè¼ÆÊ¦
HadoopÔ­ÀíÓëʵ¼ù
Oracle Êý¾Ý²Ö¿â
Êý¾Ý²Ö¿âºÍÊý¾ÝÍÚ¾ò
OracleÊý¾Ý¿â¿ª·¢Óë¹ÜÀí

GE Çø¿éÁ´¼¼ÊõÓëʵÏÖÅàѵ
º½Ìì¿Æ¹¤Ä³×Ó¹«Ë¾ Nodejs¸ß¼¶Ó¦Óÿª·¢
ÖÐÊ¢Òæ»ª ׿Խ¹ÜÀíÕß±ØÐë¾ß±¸µÄÎåÏîÄÜÁ¦
ijÐÅÏ¢¼¼Êõ¹«Ë¾ PythonÅàѵ
ij²©²ÊITϵͳ³§ÉÌ Ò×ÓÃÐÔ²âÊÔÓëÆÀ¹À
ÖйúÓÊ´¢ÒøÐÐ ²âÊÔ³ÉÊì¶ÈÄ£Ðͼ¯³É(TMMI)
ÖÐÎïÔº ²úÆ·¾­ÀíÓë²úÆ·¹ÜÀí