Spark Execution Modes Explained
 
Source: blog · Published: 2017-5-2
 

I. Execution Modes

The common syntax of the submit script:

./bin/spark-submit \
--class <main-class> \
--master <master-url> \
--deploy-mode <deploy-mode> \
--conf <key>=<value> \
... # other options
<application-jar> \
[application-arguments]

Script options:

(1) --class: the main class, i.e. the class containing the main function.

(2) --master: the master URL; see the detailed notes below.

(3) --deploy-mode: either client or cluster.

(4) --conf: configuration given as key=value pairs.

Each mode is described below:

1. local

Runs the application locally, generally for testing. The number of worker threads can be specified.

# Run application locally on 8 cores
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master local[8] \
/path/to/examples.jar \
100

2. Standalone client

Application execution flow in this mode:

(1) The user starts the client, which runs the user program and launches the Driver process. Inside the Driver, components such as the DAGScheduler are started or instantiated, and the client-side Driver registers with the Master.

(2) The Workers register with the Master, and the Master commands the Workers to start Executors. Each Worker creates an ExecutorRunner thread and, inside that thread, starts an ExecutorBackend process.

(3) Once started, each ExecutorBackend registers with the SchedulerBackend inside the client-side Driver process, so the Driver can locate its compute resources. The Driver's DAGScheduler parses the application's RDD DAG and generates the corresponding Stages; the TaskSet of each Stage is assigned to Executors through the TaskScheduler. Inside each Executor, a thread pool executes the Tasks in parallel.

Example script:

# Run on a Spark Standalone cluster in client deploy mode
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://207.184.161.138:7077 \
--executor-memory 20G \
--total-executor-cores 100 \
/path/to/examples.jar \
1000

3. Standalone cluster

Application execution flow in this mode:

(1) The user starts the client, and the client submits the application to the Master.

(2) The Master schedules the application and dispatches each application to one designated Worker, which starts the Driver (i.e. the SchedulerBackend). On receiving the Master's command, that Worker creates a DriverRunner thread and starts the SchedulerBackend process inside it. The Driver acts as the master control process for the whole job. The Master then designates other Workers to start Executors, i.e. ExecutorBackend processes, which provide the compute resources. The rest of the flow is very similar to client mode.

(3) Each of those Workers creates an ExecutorRunner thread, and the ExecutorRunner starts the ExecutorBackend process.

(4) Once started, each ExecutorBackend registers with the Driver's SchedulerBackend; having acquired compute resources, the Driver can schedule tasks and dispatch them to the compute nodes. The SchedulerBackend process contains the DAGScheduler, which splits the RDD DAG into Stages and generates TaskSets, then schedules and dispatches Tasks to the Executors. The TaskSet of each Stage is placed into the TaskScheduler, which distributes the tasks to the Executors, where they run in parallel on multiple threads.

Example script:

# Run on a Spark Standalone cluster in cluster deploy mode with supervise
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://207.184.161.138:7077 \
--deploy-mode cluster \
--supervise \
--executor-memory 20G \
--total-executor-cores 100 \
/path/to/examples.jar \
1000

4. yarn-client

In yarn-client mode, the Driver runs on the client and obtains resources from the ResourceManager (RM) through an ApplicationMaster. The local Driver interacts with all the executor containers and aggregates the final results. Closing the terminal is equivalent to killing the Spark application. Generally, use this mode when the results only need to be returned to the terminal.

After the client-side Driver submits the application to YARN, YARN starts the ApplicationMaster and then the executors. Both the ApplicationMaster and the executors run inside containers; a container's default memory is 1 GB, the ApplicationMaster is allocated driver-memory, and each executor is allocated executor-memory. Because the Driver runs on the client, the program's results can be displayed there; the Driver exists as a process named SparkSubmit.

Job execution flow in yarn-client mode:

1. The client generates the job information and submits it to the ResourceManager (RM).

2. The RM starts a container on the local NodeManager and assigns the Application Master (AM) to that NodeManager (NM).

3. On receiving the RM's assignment, the NM starts the Application Master and initializes the job; this NM is then referred to as the Driver.

4. The Application Master applies to the RM for resources; as resources are allocated, it notifies the other NodeManagers to start the corresponding Executors.

5. The Executors register and report to the locally started Application Master and carry out their tasks.

Example script:

# Run on a YARN client
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-client \
--executor-memory 20G \
--num-executors 50 \
/path/to/examples.jar \
1000

5. yarn-cluster

The Spark Driver first starts inside the YARN cluster as an ApplicationMaster. Each job the client submits to the ResourceManager is assigned a unique ApplicationMaster on a worker node of the cluster, and that ApplicationMaster manages the application over its whole lifecycle. Because the Driver program runs inside YARN, there is no need to start a Spark Master/Client beforehand, and the application's results cannot be displayed on the client (they can be viewed in the history server), so it is best to save results to HDFS rather than write them to stdout; the client terminal only shows the simple running status of the YARN job.

Job execution flow in yarn-cluster mode:

1. The client generates the job information and submits it to the ResourceManager (RM).

2. The RM starts a container on some NodeManager (chosen by YARN) and assigns the Application Master (AM) to that NodeManager (NM).

3. On receiving the RM's assignment, the NM starts the Application Master and initializes the job; this NM is then referred to as the Driver.

4. The Application Master applies to the RM for resources; as resources are allocated, it notifies the other NodeManagers to start the corresponding Executors.

5. The Executors register and report to the Application Master on that NM and carry out their tasks.

Example script:

# Run on a YARN cluster
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--executor-memory 20G \
--num-executors 50 \
/path/to/examples.jar \
1000

II. Notes on Execution

1. About jars

The Hadoop and Spark configuration is loaded into the SparkContext automatically, so when submitting an application you only need to submit the user code and its other dependencies. There are two ways to do this:

(1) Package the user code into a jar, then use --jars when submitting the application to add the dependency jars (recommended).

(2) Package the user code together with its dependencies into one large assembly jar (the resulting jar can easily reach hundreds of MB, making packaging and distribution slow).
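Approach (1) can be sketched as follows; the class name and all paths below are placeholders, but `--jars` (a comma-separated list added to the driver and executor classpaths) is the real spark-submit flag:

```shell
# Sketch of approach (1): submit a thin application jar and list
# its dependencies with --jars (comma-separated).
# com.example.MyApp and all paths are hypothetical.
./bin/spark-submit \
--class com.example.MyApp \
--master spark://207.184.161.138:7077 \
--jars /path/to/dep1.jar,/path/to/dep2.jar \
/path/to/my-app.jar
```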

2. Scheduling modes

Spark Standalone mode:

Only FIFO is supported.

Spark on Mesos mode: two scheduling modes are available:

1) Coarse-grained Mode

2) Fine-grained Mode

Spark on YARN mode:

Currently only Coarse-grained Mode is supported.
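On Mesos, the mode is selected with the `spark.mesos.coarse` property; a minimal sketch of a `conf/spark-defaults.conf` fragment:

```
# conf/spark-defaults.conf (fragment)
# true  -> coarse-grained mode: Spark holds long-running Mesos tasks
#          for the lifetime of the application
# false -> fine-grained mode: each Spark task runs as its own Mesos task
spark.mesos.coarse   true
```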

3. Configuration precedence

Parameters set on the SparkConf in code have the highest priority, followed by parameters passed to the spark-submit script, and finally parameters in the configuration file (conf/spark-defaults.conf).

If it is unclear where a configuration parameter comes from, use spark-submit's --verbose option to print fine-grained debugging information.
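A sketch of the precedence rules, with illustrative values: suppose `conf/spark-defaults.conf` sets `spark.executor.memory 4g`.

```shell
# The submit-time flag below (8g) overrides the 4g in spark-defaults.conf:
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-client \
--conf spark.executor.memory=8g \
--verbose \
/path/to/examples.jar
# If the application code itself calls
#   new SparkConf().set("spark.executor.memory", "16g")
# the in-code value (16g) beats both; --verbose shows where each
# effective setting was loaded from.
```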

 

   