This chapter introduces several frameworks for performing complex data analysis on Mesos. We cover how to deploy Storm and Spark Streaming on Mesos to process real-time data streams, and how to run the NoSQL database Cassandra on Mesos.
Complex data and the rise of the Lambda architecture
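The explosive growth of big data shows not only in the volume of data produced, but also in the velocity and variety with which that data must be processed to yield meaningful results. The velocity of data and computation has pushed developers toward real-time stream processing frameworks, while the inherent variety and loosely structured nature of data has driven the evolution of NoSQL systems.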
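With the rise of the Internet of Things (IoT), sensors, social media, machine transactions, monitoring systems, and similar sources all produce data at high speed and on a large scale. The information in this data is very valuable, but the value is lost if the analysis is delayed or can only be run on stale data. The previous chapter showed the volumes of data that Hadoop and Spark can handle. These traditional tools are well suited to batch processing and offline analysis, but they were not designed for real-time stream analysis or for low-latency applications such as SQL-like query processing.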
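As stream processing has grown in importance, other components have appeared in the modern data architecture, which combines different components serving different needs. The Lambda architecture (http://en.wikipedia.org/wiki/Lambda_architecture) is a very popular way of designing such a data architecture.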
It consists of three main layers:
Batch layer
Speed layer
Serving layer
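Running the Lambda architecture on Mesos not only allows its components to share resources but also helps with fault tolerance. The idea behind the batch layer is to derive results by processing the data collected over time, for example to build predictive models. The previous chapter demonstrated data processing with Hadoop and Spark on Mesos. Apache Hama (https://hama.apache.org) is another framework for the batch layer. It is a general-purpose Bulk Synchronous Parallel (BSP) framework that is particularly useful for graph and matrix computations.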
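As varied new data is produced ever faster, relying only on offline models and periodic offline processing no longer meets the need. The idea behind the speed layer is to process data as it is generated. This layer mainly performs stream processing, also known as Complex Event Processing (CEP) or real-time processing. On Mesos, the speed layer can be implemented with Apache Samza, Apache Storm, or Spark Streaming. Apache Samza (http://samza.apache.org) is a stream processing framework built on top of Apache Kafka. A project is under way to integrate Samza with Mesos (https://github.com/Banno/samza-mesos). The following sections discuss Apache Storm and Spark Streaming. Note that different stream processing frameworks use different architectures and make different trade-offs between speed, supported operations, consistency semantics, and scalability.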
The serving layer stores the output of the batch and speed layers and answers queries based on that output. Mesos offers a growing number of options for building scalable layers that store and serve data:
HDFS (Hadoop Distributed File System) provides a distributed filesystem on commodity hardware and was covered in detail in Chapter 2. A project is working on running HDFS on Mesos to provide a highly available HDFS.
Tachyon (http://www.tachyonproject.org) is a memory-centric storage system. A prototype of Tachyon on Mesos is available at https://github.com/mesosphere/tachyon-mesos.
Riak (https://github.com/basho/riak) is a distributed key-value store. A project is working on making it run on Mesos (https://github.com/edpaget/riak-mesos).
Elasticsearch (https://elasticsearch.org) is a distributed full-text search engine. Elasticsearch can run on Mesos (https://github.com/mesosphere/elasticsearch-mesos).
Apache Cassandra (http://cassandra.apache.org) is a NoSQL database; running it on Mesos is described later in this chapter.
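Besides these three layers, connectors are needed between them to receive data and pass it on to the other layers. Apache Kafka (http://kafka.apache.org) is a distributed publish-subscribe (pub/sub) messaging system. Pub/sub systems are the backbone of a modern data architecture: they connect different data processing frameworks in a loosely coupled way and suit many use cases. A project is actively integrating Kafka with Mesos (https://github.com/stealthly/kafka-mesos). In addition, as more frameworks adopt the upcoming persistent storage support in Mesos (https://issues.apache.org/jira/browse/MESOS-1554), complex data processing on Mesos will become more robust. None of this means that complex data analysis requires a complex tool set; Mesos supports many other architectures to meet the varied needs of complex data processing. At the time of writing, many of these projects are at an early stage of development but have already made substantial progress.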
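As a rough illustration of how the three layers fit together, the following Python sketch answers a query by merging a precomputed batch view with a speed-layer view of recent events. It is a toy model under stated assumptions, not how any of the frameworks above is implemented; the function names and the page-view events are hypothetical.

```python
from collections import Counter

def batch_view(master_dataset):
    """Batch layer: recompute counts from all data collected so far."""
    return Counter(event["page"] for event in master_dataset)

def speed_view(recent_events):
    """Speed layer: count only events that arrived after the last batch run."""
    return Counter(event["page"] for event in recent_events)

def serve(page, batch, speed):
    """Serving layer: answer a query by merging both views."""
    return batch[page] + speed[page]

# Hypothetical page-view events.
master = [{"page": "/home"}, {"page": "/docs"}, {"page": "/home"}]
recent = [{"page": "/home"}]

batch = batch_view(master)
speed = speed_view(recent)
print(serve("/home", batch, speed))  # 3
print(serve("/docs", batch, speed))  # 1
```

In a real deployment the batch view would be rebuilt periodically and the speed view discarded once the batch layer has caught up; the sketch leaves that hand-over out.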
Storm
Apache Storm (https://storm.apache.org) is a real-time, distributed engine for processing streams of events. Storm is transactional, reliable, scalable, and fault tolerant, and it offers an easy-to-use API. Its architecture is completely different from that of MapReduce. MapReduce-style systems such as Hadoop and Spark move the code close to the data: every node holds part of the data and runs exactly the same code to produce results. In Storm, by contrast, each node performs a different kind of processing on a different stream of data.
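Storm's main abstraction is a stream of tuples flowing through nodes, each of which performs some processing. Tuples are very general and can contain a sequence of serializable objects of any type. In Storm, a processing sequence is described by a topology. A topology runs forever and processes streaming data as it arrives.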

Storm topology
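The figure above shows the basic concepts of a topology. A topology consists of data sources (spouts) and data operations (bolts). Spouts listen to a data source and emit tuples into the topology. Each circle in the figure represents a bolt, which performs some processing. The DAG (directed acyclic graph) formed by the flowing tuples and the bolts is the topology.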
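The spout and bolt roles can be sketched in plain Python with generators. This is only a conceptual model, not the Storm API: the function names are hypothetical, and unlike a real topology, which runs forever, this one stops when its input is exhausted.

```python
def sentence_spout(sentences):
    """Spout: emits one tuple per sentence from its data source."""
    for sentence in sentences:
        yield (sentence,)

def split_bolt(stream):
    """Bolt: splits each sentence tuple into one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)

def count_bolt(stream):
    """Bolt: keeps a running count and emits (word, count) tuples."""
    counts = {}
    for (word,) in stream:
        counts[word] = counts.get(word, 0) + 1
        yield (word, counts[word])

def run_topology(sentences):
    """Wire spout -> split -> count, a linear DAG, and collect the output."""
    return list(count_bolt(split_bolt(sentence_spout(sentences))))

result = run_topology(["to be", "or not to be"])
print(result)
```

Each tuple flows through the whole chain as soon as the spout emits it, which mirrors how a topology processes data on arrival rather than in batches.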
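Storm has a master-slave architecture. Nimbus is the master daemon, responsible for coordination and monitoring. Worker nodes run supervisors, each of which executes part of a topology. Nimbus and the supervisors communicate with each other through ZooKeeper or the local disk.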
Storm has many advanced features, such as fine-grained monitoring and transactional semantics through Trident, which are beyond the scope of this book. For more on Storm, see Storm Real-time Processing Cookbook by Quinton Anderson.
Storm on Mesos
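Nathan Marz developed the first version of the scheduler and executor, and the project was then developed further by the community (https://github.com/mesos/storm). Running any framework on Mesos requires a scheduler that requests resources from Mesos on behalf of the framework's tasks, and an executor that runs those tasks. The following steps install Storm on Mesos: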
1. Install Mesos.
2. Clone the storm-mesos repository and change to its root directory:
ubuntu@master:~ $ git clone https://github.com/mesos/storm
ubuntu@master:~ $ cd storm
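3. The bin directory of the repository contains the build-release.sh script. The script has many subcommands, which can be listed with the -h option. First, download an unmodified Apache Storm release. The script downloads the version given by the version property in pom.xml. The default, which suits most cases, is the latest release; if you need a specific Storm version, set the version property accordingly. At the time of writing, the default version is 0.9.3. A download mirror can also be specified by setting the MIRROR environment variable:
ubuntu@master:~/storm $ ./bin/build-release.sh downloadStormRelease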
4. The previous command downloads a file named apache-storm-VERSION.zip. Pass the downloaded archive to the script as an argument:
ubuntu@master:~/storm $ ./bin/build-release.sh apache-storm-*.zip
5. The previous command creates a storm-mesos release named storm-mesos-VERSION.tgz in the current directory.
6. Update the Storm configuration to match your cluster. Storm uses the YAML format for configuration files. Edit conf/storm.yaml and add the following settings:
mesos.master.url: "zk://master:2181/mesos"
storm.zookeeper.servers:
  - "master"
nimbus.host: "master"
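The mesos.master.url parameter is the URL of the running Mesos master, given either as a <host:port> pair or, as in the example above, as a ZooKeeper URL. storm.zookeeper.servers lists the ZooKeeper servers used by Storm. nimbus.host specifies the master node of the Storm cluster. The next section describes all the storm-mesos configuration options in detail.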
7. Start the Storm master, Nimbus, with the following command:
ubuntu@master:~/storm-mesos $ bin/storm-mesos nimbus
8. Optionally, start the Storm UI, which is served at <storm-master:port>; in this example, that is http://master:8080.
ubuntu@master:~/storm-mesos $ bin/storm ui
Storm is now running on Mesos. The Storm web UI, shown in the following figure, reports a cluster with 0 supervisors, because supervisors are created on demand when a topology runs.

In addition, Storm is listed as an active framework in the Mesos UI. We can now run various stream processing jobs. The Storm project ships a rich set of sample topologies in the examples/storm-starter directory. Let us run ExclamationTopology, which appends exclamation marks to the words it receives. ExclamationTopology is a basic topology with one spout, word, and two bolts, exclaim1 and exclaim2, chained linearly:
ubuntu@master:~/storm-mesos $ bin/storm-mesos jar examples/storm-starter/storm-starter-topologies-*.jar storm.starter.ExclamationTopology mytopology
Note: Storm requires topology names to be unique within a cluster.
The preceding command submits ExclamationTopology to the Storm cluster under the name mytopology. The following command verifies from the command line that the topology is running:
ubuntu@master:~/storm-mesos $ bin/storm list
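The Storm web UI also shows various details about the topology. The output and other logs are stored in the logs directory, where you will see the names of Storm project contributors followed by exclamation marks. The documentation at https://storm.apache.org/documentation explains more Storm concepts and commands.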
Storm-Mesos configuration
Storm-mesos uses YAML-based configuration. At a minimum, the following parameters must be set:
1. mesos.executor.uri: the URI of the executor.
2. mesos.master.url: the address of the Mesos master.
3. storm.zookeeper.servers: the ZooKeeper servers used by the Storm master.
4. nimbus.host: the hostname of the machine running Storm Nimbus.
Controlling resource allocation is very important, because it strongly affects Storm's scalability and latency. Storm-mesos provides the following settings for tuning resources:
1. topology.mesos.worker.cpu and topology.mesos.worker.mem.mb: the CPU and memory allocated to each worker. The defaults are 1 and 1000 MB respectively. The memory value must be set 25% higher than the heap size in worker.childopts to allow for the task's overhead. For example, if worker.childopts is set to -Xmx1000m, topology.mesos.worker.mem.mb must be at least 1250.
2. topology.mesos.executor.cpu and topology.mesos.executor.mem.mb: the CPU and memory allocated to each executor. The defaults are 1 and 1000 MB respectively.
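The 25% rule for worker memory is simple arithmetic, and a small helper can guard against getting it wrong. The sketch below is a hypothetical utility, not part of storm-mesos; it assumes the heap size is given with an m or g suffix.

```python
import math
import re

def min_worker_mem_mb(childopts, overhead=0.25):
    """Return the smallest topology.mesos.worker.mem.mb for a given -Xmx."""
    match = re.search(r"-Xmx(\d+)([mMgG])", childopts)
    if match is None:
        raise ValueError("no -Xmx setting with an m or g suffix found")
    heap_mb = int(match.group(1))
    if match.group(2) in "gG":
        heap_mb *= 1024
    # Round up so the allocation never falls short of the required headroom.
    return math.ceil(heap_mb * (1 + overhead))

print(min_worker_mem_mb("-Xmx1000m"))  # 1250
```

The result for -Xmx1000m matches the figure of 1250 given in the text.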
Storm-mesos also provides the optional configuration parameters listed in the following table:

Spark Streaming
We have already seen that Spark can process large volumes of data. Spark Streaming is an extension of the Spark API for processing streaming data. It supports a wide range of input sources, including Twitter, HDFS, Kafka, Flume, Akka actors, TCP sockets, and ZeroMQ. Spark Streaming splits the input stream into small batches, and Spark programs then process these discretized streams. The processed batches can be passed on for further processing or saved to HDFS, a database, and so on.
The basic abstraction in Spark Streaming is the DStream, or discretized stream (http://www.cs.berkeley.edu/~matei/papers/2012/hotcloud_spark_streaming.pdf). Internally, a DStream is a sequence of RDDs, and operations on a DStream are translated into operations on its underlying RDDs. A DStream therefore inherits the advantages of RDDs, such as consistency and checkpointing. The following figure shows how Spark carries out stream processing.

Spark Streaming architecture
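To make the idea of a discretized stream concrete, here is a minimal pure-Python sketch that groups timestamped events into fixed-length batches. It models the concept only and does not use Spark; the discretize function and the sample events are hypothetical.

```python
def discretize(events, batch_interval):
    """Group (timestamp, value) events into consecutive fixed-length batches."""
    batches = {}
    for timestamp, value in events:
        index = int(timestamp // batch_interval)
        batches.setdefault(index, []).append(value)
    if not batches:
        return []
    # Emit every interval up to the last one seen, including empty ones.
    return [batches.get(i, []) for i in range(max(batches) + 1)]

events = [(0.2, "a"), (0.7, "b"), (1.1, "c"), (3.4, "d")]
dstream = discretize(events, batch_interval=1.0)
print(dstream)  # [['a', 'b'], ['c'], [], ['d']]

# An operation on the stream becomes the same operation on every batch.
print([len(batch) for batch in dstream])  # [2, 1, 0, 1]
```

Each inner list plays the role of one RDD in the sequence, which is why a stream operation reduces to a per-batch operation.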
Spark Streaming supports many different operations, both stateless and stateful. The following table lists the operations currently supported:
Transformations supported by Spark Streaming

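Spark Streaming also supports window-based operations, that is, operations over a sliding window of data. The windowLength and slideInterval parameters control the length of the window and the interval at which the operation is performed. The following table lists the supported window-based operations:
Window-based transformations supported by Spark Streaming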
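The interplay between window length and slide interval can be sketched in a few lines of Python. This is a conceptual model rather than the Spark API: both parameters are expressed here as a number of batches, whereas Spark Streaming expresses them as durations, and the windowed function is hypothetical.

```python
def windowed(batches, window_length, slide_interval):
    """Yield the contents of the last `window_length` batches every `slide_interval` batches."""
    for end in range(slide_interval, len(batches) + 1, slide_interval):
        start = max(0, end - window_length)
        yield [item for batch in batches[start:end] for item in batch]

batches = [["a"], ["b", "c"], ["d"], ["e"]]
windows = list(windowed(batches, window_length=3, slide_interval=2))
print(windows)  # [['a', 'b', 'c'], ['b', 'c', 'd', 'e']]
```

Because the window is longer than the slide, consecutive windows overlap: the batch holding b and c is processed twice.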

As in Spark, these operations are evaluated lazily; computation is triggered only by the following output operations:
Output operations supported by Spark Streaming

Running Spark Streaming on Mesos
If Spark is already running on Mesos, you can start using Spark Streaming at any time; it needs no special configuration. The Spark distribution includes a number of samples in the examples directory, including Spark Streaming samples. Let us run one of them, NetworkWordCount, which counts how many times each word occurs in a given network stream every second.
First, we need a network stream; Netcat can be used to send text over a TCP port. Open a terminal and enter the following command to start a Netcat server on port 9999:
ubuntu@master:~ $ nc -lk 9999
Now, in another terminal, start the Spark Streaming sample:
ubuntu@master:~ $ cd spark
ubuntu@master:~/spark $ ./bin/run-example streaming.NetworkWordCount localhost 9999
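NetworkWordCount is now running and listens for input from the Netcat server started in the first terminal. Every second it prints each word together with the number of times it occurred. For example, if we type hello world in the Netcat window, the Spark Streaming window reports a count for hello and a count for world.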
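The per-batch computation behind a word count of this kind is small enough to sketch in plain Python. This is not the NetworkWordCount source, only a hypothetical model of what happens to the lines received during one batch.

```python
from collections import Counter

def count_words(lines):
    """Count word occurrences in the lines received during one batch."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return dict(counts)

print(count_words(["hello world"]))  # {'hello': 1, 'world': 1}
```

The counts are not carried over between batches, so each one-second batch starts from zero.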
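Tuning Spark Streaming
Spark Streaming works out of the box on Mesos. However, tuning the system is essential to meet the demands of real-time stream processing and to make good use of resources. The following are the aspects to consider when tuning.
Choosing the batch size
The batch size chosen for the streaming data determines whether incoming data can be processed in time. If the batch size is too small, cluster resources may be wasted; if it is too large, the streaming computation may not be able to keep up. The recommended approach is to start with a batch size that is conservative for the application and then test progressively smaller values. If each batch is processed end to end faster than new batches arrive, the system can sustain the current data rate. This time can be measured with the org.apache.spark.streaming.scheduler.StreamingListener interface provided by Spark. A continuously increasing delay is a sign that the system cannot keep up with the incoming data.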
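Why a growing delay signals trouble can be seen with a small simulation. The sketch below is a simplified model under stated assumptions (batches are processed one at a time, in order); the function name and the timings are hypothetical and it does not use Spark.

```python
def scheduling_delays(batch_interval, processing_times):
    """Return how long each batch waits before its processing starts."""
    delays = []
    free_at = 0.0  # time at which the previous batch finishes processing
    for i, processing_time in enumerate(processing_times):
        ready_at = (i + 1) * batch_interval  # batch i is complete at this time
        start = max(ready_at, free_at)
        delays.append(start - ready_at)
        free_at = start + processing_time
    return delays

# Processing is faster than the batch interval: no backlog builds up.
print(scheduling_delays(1.0, [0.5, 0.5, 0.5]))  # [0.0, 0.0, 0.0]
# Processing is slower than the batch interval: the delay grows every batch.
print(scheduling_delays(1.0, [1.5, 1.5, 1.5]))  # [0.0, 0.5, 1.0]
```

In the second run each batch takes longer to process than it takes to collect, so every batch waits longer than the one before it.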
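Garbage collection
An important configuration parameter that needs particular attention when running Spark Streaming in production is spark.cleaner.ttl. It controls how long Spark remembers metadata. By default Spark never removes metadata, and too much accumulated metadata will affect a streaming application. On the other hand, if the value is set too low, window operations may be unable to process the RDDs within the window length. The value of spark.cleaner.ttl must therefore be larger than the longest window used by the streaming application. If spark.cleaner.ttl is not set, Spark evicts RDDs in least recently used (LRU) order. In addition, setting spark.streaming.unpersist to true enables a smarter unpersisting scheme in which the system works out which RDDs can be removed from memory. Finally, the JVM's concurrent mark-and-sweep garbage collector is recommended, because it produces many short GC pauses rather than one long one, which keeps stream processing latency more stable.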
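These settings can be collected in one place and checked against the window length before a job is submitted. The helper below is a hypothetical sketch: the numeric values are placeholders, and passing the collector flag through spark.executor.extraJavaOptions is one assumed way of selecting the concurrent mark-and-sweep collector.

```python
def streaming_conf(max_window_seconds, ttl_seconds):
    """Build Spark settings for the garbage-collection advice in this section."""
    if ttl_seconds <= max_window_seconds:
        raise ValueError("spark.cleaner.ttl must exceed the longest window")
    return {
        "spark.cleaner.ttl": str(ttl_seconds),
        "spark.streaming.unpersist": "true",
        # Assumption: the CMS collector is selected through executor JVM options.
        "spark.executor.extraJavaOptions": "-XX:+UseConcMarkSweepGC",
    }

conf = streaming_conf(max_window_seconds=600, ttl_seconds=3600)
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The check at the top turns the rule about the longest window into an error raised before submission instead of a failure at run time.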
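Concurrency
It is important to parallelize processing across the available cluster resources. Pass a suitable numTasks argument to the operations; its default value is 8. The default can also be changed through spark.default.parallelism.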
Handling failures
Failures of both the driver node and the worker nodes must be planned for. All intermediate data can be recomputed from the RDD lineage. To make the driver node recoverable, checkpointing must be enabled (through ssc.checkpoint), and the application must check whether a previous checkpoint state exists. If the input source is a network connection, received data is replicated to other nodes, but a small amount of data may be lost if a worker fails before replication completes. With persistent input storage such as HDFS, a worker failure causes no data loss. The Spark Streaming documentation (http://spark.apache.org/docs/latest/streaming-programming-guide.html) discusses the fault-tolerance semantics of Spark Streaming in detail.
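The pattern of checking for an earlier checkpoint before starting from scratch can be sketched without Spark. The following pure-Python model is hypothetical: it stores a small dictionary as JSON on the local disk, whereas a real driver would checkpoint to fault-tolerant storage. Spark Streaming exposes the same pattern through StreamingContext.getOrCreate.

```python
import json
import os
import tempfile

def get_or_create(checkpoint_path, create):
    """Restore state from a checkpoint if one exists, else build fresh state."""
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            return json.load(f), True
    return create(), False

def save_checkpoint(checkpoint_path, state):
    """Write to a temporary file, then rename it so a partial file is never read."""
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, checkpoint_path)

path = os.path.join(tempfile.mkdtemp(), "driver.json")
state, restored = get_or_create(path, lambda: {"batches_done": 0})
state["batches_done"] += 3
save_checkpoint(path, state)

# Simulate a driver restart: the state comes back from the checkpoint.
state2, restored2 = get_or_create(path, lambda: {"batches_done": 0})
print(restored, restored2, state2)  # False True {'batches_done': 3}
```

On the first call no checkpoint exists, so fresh state is created; after the simulated restart the saved state is restored instead.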
Task overheads
The overhead of launching Mesos tasks can be fatal for low-latency applications such as Spark Streaming. Spark Streaming should run in coarse-grained Mesos mode to reduce task-launch overhead, as explained in Chapter 3. In addition, to reduce GC pauses, Spark Streaming persists RDDs in a serialized binary format. Because this can make serialization and deserialization costly, a fast serialization framework such as Kryo (https://github.com/EsotericSoftware/kryo) is recommended. Serializing tasks may also shorten the time needed to send them over the network, further reducing task-launch overhead.
NoSQL on Mesos
For decades, SQL has been the main tool for data analysis. With the rise of big data, many systems have tried to bring databases to large-scale, complex data analysis. These include Hive, Shark, Spark SQL, and NoSQL databases such as Cassandra and Hypertable. All of these workloads can run on Mesos and benefit from its resource sharing, fault tolerance, and so on. The next section describes the steps for installing Cassandra on Mesos.
Cassandra on Mesos
Apache Cassandra (http://cassandra.apache.org) is a popular NoSQL database. Cassandra originated at Facebook and plays an important role in many large-scale deployments. Running Cassandra on Mesos lets it benefit from Mesos's fault tolerance and scalability. Cassandra is a very good fit for Mesos because its architecture is completely decentralized.
Running Cassandra on Mesos requires a scheduler that negotiates the resources Cassandra needs with Mesos and an executor that actually runs the Cassandra daemon. The scheduler also has to copy the distribution and configuration files to all nodes. The Cassandra configuration needs a list of seed nodes, which the scheduler fills in once it has accepted offers from Mesos. The steps for running Cassandra on Mesos are as follows:
1. Install Mesos.
2. Log in to the Mesos master node and download the latest prebuilt Cassandra-mesos release from Mesosphere. At the time of writing, the latest version is 2.0.5:
ubuntu@master:~ $ wget http://downloads.mesosphere.io/cassandra/cassandra-mesos-2.0.5-1.tgz
3. Extract the archive and cd into the resulting directory:
ubuntu@master:~ $ tar xzf cassandra-mesos-*.tgz
ubuntu@master:~ $ cd cassandra-mesos-*
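4. Edit conf/mesos.yaml to reflect your cluster configuration. The default configuration targets a local Mesos cluster running together with ZooKeeper. The following table lists the configuration options:
Configuration options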

5. Start the Cassandra on Mesos scheduler:
ubuntu@master:~/cassandra-mesos $ bin/cassandra-mesos
set -o errexit -o pipefail
FRAMEWORK_HOME=`dirname $0`/..
dirname $0
export MESOS_NATIVE_LIBRARY=$(sed -e 's/:[^:\/\/]/="/g;s/$/"/g;s/ *=/=/g' "$FRAMEWORK_HOME"/conf/mesos.yaml | tr -d $'\'' | grep -v \# | grep java.library.path | sed 's/java.library.path=//g;s/"//g')
...
# Start Cassandra on Mesos
...
0 [main] INFO mesosphere.cassandra.Main$ - Starting Cassandra on Mesos.
...
114 [Thread-0] INFO mesosphere.cassandra.CassandraScheduler - Starting Cassandra cluster ${clusterName} for the first time. Allocating new ID for it.
...
I0429 19:36:36.742849 27508 sched.cpp:391] Framework registered with 20140429-193514-580538634-5050-25835-0000
175 [Thread-1] INFO mesosphere.cassandra.CassandraScheduler - Framework registered as 20140429-193514-580538634-5050-25835-0000
437 [Thread-2] INFO mesosphere.cassandra.CassandraScheduler - Got new resource offers ArrayBuffer(slave1)
455 [Thread-2] INFO mesosphere.cassandra.CassandraScheduler - resources offered: List((cpus,2.0), (mem,6489.0), (disk,7935.0), (ports,0.0))
455 [Thread-2] INFO mesosphere.cassandra.CassandraScheduler - resources required: List((cpus,0.1), (mem,2048.0), (disk,1000.0))
464 [Thread-2] INFO mesosphere.cassandra.CassandraScheduler - Accepted offer: slave1
...
The truncated output above shows that the Cassandra cluster has registered with Mesos and has received a resource offer from slave1. Cassandra is now up and running, and the framework list in the web UI should include Cassandra Test Cluster. You can interact with it through the Cassandra Query Language (CQL) shell. Pick any host running Cassandra, either from the command line or from the web UI (the Host field), and connect to a CQL session with the following command:
ubuntu@master:~/cassandra-mesos $ bin/cqlsh <cassandra-host>
You should see the Cassandra prompt (cqlsh>) and can now run Cassandra queries. Note also that Cassandra on Mesos supports many more scenarios and features, such as scaling, which are not covered in depth here.
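Returning to the scheduler log shown earlier in this section, the decision to accept the offer from slave1 comes down to comparing the offered resources with the required ones. The following Python sketch illustrates that comparison with the figures from the log; it is an assumed simplification, and the actual scheduler logic may differ.

```python
def offer_satisfies(offered, required):
    """Return True when the offer covers every required resource."""
    return all(offered.get(name, 0.0) >= amount for name, amount in required.items())

# Values taken from the scheduler log shown earlier.
offered = {"cpus": 2.0, "mem": 6489.0, "disk": 7935.0, "ports": 0.0}
required = {"cpus": 0.1, "mem": 2048.0, "disk": 1000.0}

print(offer_satisfies(offered, required))  # True

# A hypothetical offer with too little memory would be declined.
small = {"cpus": 2.0, "mem": 1024.0, "disk": 7935.0}
print(offer_satisfies(small, required))  # False
```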
Summary
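Batch processing systems are no longer the only data processing tools available to developers. New applications and analysis use cases that call for different kinds of data keep emerging, well beyond the single scenario served by traditional ETL tools and frameworks such as Hadoop. So, rather than trying to make batch processing systems faster or stretching traditional databases to handle unstructured data, it is better to use the right tool for each job, which makes such applications easier to develop and scale.
This chapter explored the options for processing real-time and streaming data with Storm and Spark Streaming running on Mesos. It also showed how Cassandra running on Mesos can be used for more exploratory data analysis. Cassandra is also an example of the more general-purpose applications that run on Mesos.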