Zeppelin, Giving Spark Wings: The Basics
 
×÷Õߣºrangerwolf À´Ô´£º51cto ·¢²¼ÓÚ;2016-8-31
 

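Introduction

Spark is an excellent computing platform: it supports multiple languages, its in-memory computation is very fast, and its open-source community is very active.

Spark still falls short on ease of use, though. Newcomers find it somewhat hard to get started and to set up an environment. And if you want to turn results into charts and share them with others, there is still a long way to go.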

ĿǰÒѾ­ÓÐһЩ½â¾ö·½°¸£º

¡¾TBD¡¿Jupyter Notebook

ʹÓúܹ㷺£¬µ«ÊÇ¿´ÆðÀ´Ö÷Òª»¹ÊÇÒÔǰipython-notebookµÄÔöÇ¿°æ¡£

Ŀǰ±ÊÕß¶ÔÆäÁ˽ⲻ¶à

Spark ĸ¹«Ë¾DataBricksÌṩµÄDataBricks Community Edition, ÀïÃæ×Ô´øSpark¼¯Èº + Notebook¡£

Ò×ÓÃÐÔ¡¢¹¦ÄÜÐÔ¶¼ºÜ²»´í¡£È±µãÊǼ¯Èº¼ÜÉèÔÚAWSÖ®ÉÏ£¬ÎÞ·¨¸ú×Ô¼º±¾µØµÄSpark ¼¯ÈºÁ¬ÔÚÒ»Æð

Apache Zeppelin

ÕâÊÇÒ»¸ö¸Õ¸Õ´ÓIncubationתÕýµÄÏîÄ¿

µ«ÊÇÒѾ­ÔÚ¸÷´ó¹«Ë¾¾ùÓвÉÓ㬱ÈÈçÃÀÍÅ¡¢Î¢ÈíµÈµÈ

±¾ÎÄÖ÷Òª¾ÍÊǽéÉÜÈçºÎÔÚ±¾µØ´î½¨Ò»¸öZeppelin ʹµÃSpark¸üÒ×Óã¬Í¬Ê±¿ÉÒԺܷ½±ãµÄ½«×Ô¼ºµÄ¹¤×÷³É¹¦Õ¹Ê¾¸ø¿Í»§

½èÓñðÈ˵ÄÒ»¸öЧ¹ûͼÕòÂ¥^_^

Notes:

Zeppelin ships with its own Spark instance, so you can learn Zeppelin without building a Spark cluster yourself.

As of August 19, 2016, the latest Zeppelin release is 0.6.1, which is only compatible with Spark 2.0+.

1) If you have a local Spark cluster running 1.6.1 + Scala 2.10, download Zeppelin 0.6.0.

2) If the official site is slow, you can download from Baidu Pan instead:

Link: http://pan.baidu.com/s/1ctBBJo Password: e68g

1. Download

If you need version 0.6.0, use the Baidu Pan link above.

If you need version 0.6.1 or later, download it directly from the official site; its mirrors are generally reasonably fast.

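2. Installation

Versions: Zeppelin 0.6.0 + a self-built Spark cluster (1.6.1)

Zeppelin still feels somewhat immature: it does not work out of the box and needs a fair amount of manual adjustment before it runs properly.

1) After unpacking, first create a new zeppelin-env.sh from the template and set SPARK_HOME. For example: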

export SPARK_HOME=/usr/lib/spark

If your Spark cluster is built on Hadoop or Mesos, additional settings are required.
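Step 1) can also be scripted. The following is only a minimal sketch: it assumes the template is named zeppelin-env.sh.template and sits in the same conf directory as the target file, and the function name create_env_file is made up for illustration.

```python
from pathlib import Path
import shutil


def create_env_file(conf_dir: Path, spark_home: str) -> Path:
    """Copy zeppelin-env.sh.template to zeppelin-env.sh and append SPARK_HOME.

    The template file name is an assumption about the unpacked layout.
    """
    template = conf_dir / "zeppelin-env.sh.template"
    target = conf_dir / "zeppelin-env.sh"
    if not target.exists():
        # Start from the shipped template so its commented defaults stay available.
        shutil.copyfile(template, target)

    line = f"export SPARK_HOME={spark_home}"
    text = target.read_text()
    if line not in text.splitlines():
        # Make sure the new export starts on its own line.
        if text and not text.endswith("\n"):
            text += "\n"
        target.write_text(text + line + "\n")
    return target
```

Calling create_env_file(Path("conf"), "/usr/lib/spark") from the unpacked Zeppelin directory would produce the same file as the manual edit. Running it a second time does not add a duplicate export line.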

2) Create a new zeppelin-site.xml from the template and change the default port 8080 to, for example, 8089, to avoid conflicts with Tomcat and similar services:

<property>
  <name>zeppelin.server.port</name>
  <value>8089</value>
  <description>Server port.</description>
</property>
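The port change can be applied programmatically as well. This sketch assumes the file's root element wraps a flat list of property entries like the one above; set_server_port is a hypothetical helper name. The standard-library parser drops XML comments, so any header comment in the template would be lost; editing by hand avoids that.

```python
import xml.etree.ElementTree as ET

PORT_PROPERTY = "zeppelin.server.port"


def set_server_port(site_xml: str, port: int) -> str:
    """Return the XML text with zeppelin.server.port set to `port`."""
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == PORT_PROPERTY:
            # Assumes the property already carries a <value> child.
            prop.find("value").text = str(port)
            break
    else:
        # Property not present yet: append a new entry.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = PORT_PROPERTY
        ET.SubElement(prop, "value").text = str(port)
    return ET.tostring(root, encoding="unicode")
```

Other child elements, such as the description, are left untouched.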

3) Replace the Jackson libraries

a) The bundled version is 2.5.*, but the version actually specified at runtime is 2.4.4.

b) 2.4.4 and 2.5.* may not be fully compatible.

c) So 2.5.* has to be replaced with 2.4.4. These three jars need replacing:

jackson-annotations-2.4.4.jar
jackson-core-2.4.4.jar
jackson-databind-2.4.4.jar

d) This one is a really nasty trap...
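The jar swap in step 3) might be scripted roughly as follows. This is a sketch under assumptions: the article does not name the directory holding the bundled jars, so it is passed in as lib_dir, and the 2.4.4 jars are assumed to have been downloaded into jars_244_dir beforehand. The function name and all three parameters are hypothetical. The old jars are moved to a backup directory instead of being deleted, so the change can be undone.

```python
from pathlib import Path
import shutil

ARTIFACTS = ("jackson-annotations", "jackson-core", "jackson-databind")


def swap_jackson(lib_dir: Path, jars_244_dir: Path, backup_dir: Path) -> list:
    """Move bundled Jackson 2.5.* jars to backup_dir and copy in the 2.4.4 jars."""
    replacements = [jars_244_dir / f"{name}-2.4.4.jar" for name in ARTIFACTS]

    # Check everything up front so a missing jar cannot leave a half-swapped state.
    missing = [str(p) for p in replacements if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing 2.4.4 jars: {missing}")

    backup_dir.mkdir(parents=True, exist_ok=True)
    for name, replacement in zip(ARTIFACTS, replacements):
        for old in lib_dir.glob(f"{name}-2.5.*.jar"):
            shutil.move(str(old), str(backup_dir / old.name))
        shutil.copy2(replacement, lib_dir / replacement.name)
    return [p.name for p in replacements]
```

Jars other than these three Jackson artifacts are not touched.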

Once the steps above are done, you can start Zeppelin.

Start/stop commands:

bin/zeppelin-daemon.sh start
bin/zeppelin-daemon.sh stop

After startup, open http://localhost:8089 to see the Zeppelin home page.
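The daemon takes a moment to come up, so a script that starts Zeppelin may want to wait until the port accepts connections before opening the page. A minimal sketch follows; the helper name wait_for_port and the polling intervals are arbitrary choices, not anything Zeppelin provides.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # Not up yet; retry shortly.
    return False
```

For the setup above, wait_for_port("localhost", 8089) would be the call to make after bin/zeppelin-daemon.sh start. It only confirms that the port is open, not that the web application has finished loading.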

3. Configuring the Spark Interpreter

Configuring the Spark interpreter is very simple; just follow the settings in the figure below:

[Figure: Spark interpreter configuration]

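4. A Few Usage Tips

Zeppelin comes with a fairly detailed tutorial, and reading the bundled notebook tutorial will probably serve you better. But I ran into quite a few pitfalls the first time I used it, so I'm recording them here for reference:

(1) Jobs do not stop automatically after submission

After Zeppelin submits a job, the Spark Master UI shows that the job does not exit on its own even once it has finished executing.

This is because Zeppelin by default behaves as if someone had manually run spark-shell --master spark://master-ip:7077: unless the shell is closed by hand, it keeps holding its resources.

The fix is to restart the Spark interpreter.

Ways to restart it manually:

1. Open the Interpreter page, find the Spark section, and click restart.

2. Recommended: restart through the RESTful API.

a. Use the Network panel in Chrome to see exactly which API is called when you click restart, as in the figure below:

[Figure: restart request captured in Chrome's Network panel]

b. This ID (2BUDQXH2R) may differ from one environment to another. The API uses the PUT method, so the following Python code can trigger the restart automatically from within the UI: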

 
%python
import requests

# PUT to the restart endpoint; replace IP and the interpreter ID with your own values.
r = requests.put("http://IP:8089/api/interpreter/setting/restart/2BUDQXH2R")
print(r.text)
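Since the ID differs between environments, it can be looked up instead of hard-coded. The sketch below uses only the standard library. It assumes that a GET on /api/interpreter/setting returns JSON whose body field is a list of settings, each with id and name fields; that is my understanding of Zeppelin's REST API, so check it against your version. The function names are made up for illustration.

```python
import json
import urllib.request


def find_setting_id(listing: dict, name: str = "spark") -> str:
    """Pick the id of the interpreter setting called `name` from a listing response."""
    for setting in listing.get("body", []):
        if setting.get("name") == name:
            return setting["id"]
    raise LookupError(f"no interpreter setting named {name!r}")


def restart_url(base_url: str, setting_id: str) -> str:
    """Build the restart endpoint used in the snippet above."""
    return f"{base_url.rstrip('/')}/api/interpreter/setting/restart/{setting_id}"


def restart_interpreter(base_url: str, name: str = "spark") -> str:
    """Look up the setting id (assumed response shape), then PUT to restart it."""
    listing_url = f"{base_url.rstrip('/')}/api/interpreter/setting"
    with urllib.request.urlopen(listing_url) as resp:
        listing = json.load(resp)
    request = urllib.request.Request(
        restart_url(base_url, find_setting_id(listing, name)), method="PUT"
    )
    with urllib.request.urlopen(request) as resp:
        return resp.read().decode("utf-8")
```

With the setup from this article the call would be restart_interpreter("http://IP:8089"), with IP replaced by the Zeppelin host.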

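(2) Exception: Cannot call methods on a stopped SparkContext

For example, if you kill the current job in the Spark Master UI and then rerun the task in Zeppelin, you will hit this exception.

The fix is simple: restart the interpreter.

(3) Do not call sc.stop() yourself

The official documentation states this explicitly: as with the Scala spark-shell, SparkContext, SQLContext, and so on are initialized automatically.

You must not call sc.stop() and then create a new SparkContext yourself.

Perhaps it is down to my own skill level, but when I tried creating a new sc myself I ran into all sorts of strange problems.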

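(4) About Python modules

The Python interpreter can use every module available to the Python installation on the machine where Zeppelin runs.

Both Python 2 and Python 3 are supported.

This is a very useful feature. For example, after finishing a computation with Spark I ended up with a fairly small CSV file. At that point I could use Pandas' powerful processing capabilities for a second pass, and then use Zeppelin's automatic charting to produce the report.

Compared with BI tools such as Tableau, Zeppelin is somewhat less capable, but each has its strengths. For programmers Zeppelin is a very convenient tool, and it greatly reduces the workload for simple day-to-day reports.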
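As a rough illustration of that second pass, the sketch below aggregates a small CSV and formats the result for Zeppelin's table display: printed output that begins with %table, followed by tab-separated columns and newline-separated rows. It uses the standard csv module so that it runs anywhere; with Pandas installed, a groupby and sum would replace the loop. The column names and the helper name csv_to_table are invented for the example.

```python
import csv
import io
from collections import defaultdict


def csv_to_table(csv_text: str, key: str, value: str) -> str:
    """Sum `value` per `key` and format the result as %table output."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[key]] += float(row[value])

    # Header row first, then one tab-separated row per key, sorted by key.
    lines = [f"%table {key}\t{value}"]
    lines += [f"{k}\t{v:g}" for k, v in sorted(totals.items())]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical data standing in for the CSV produced by the Spark job.
    demo = "city,sales\nBeijing,1\nXiamen,2.5\nBeijing,3\n"
    print(csv_to_table(demo, "city", "sales"))
```

In a Zeppelin Python paragraph, printing the returned string should render as a table that can then be switched to one of the built-in chart types.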

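(5) Scheduled runs

At the very top of a note you can set the notebook to run on a schedule. Note also that you can have the interpreter restarted automatically after each run completes; see the figure below:

[Figure: note scheduler settings]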

   