Installing Spark on a VM-based Hadoop Cluster
 
Author: wjcquking   Source: open source project   Published: 2015-6-23
 

Cluster configuration

1 namenode, 4 datanodes

namenode: compute-n

datanode: compute-0-1, compute-0-2, compute-0-3, compute-0-4

Installed versions

Linux version

Linux compute-n 2.6.32-38-generic #83-Ubuntu SMP Wed Jan 4 11:12:07 UTC 2012 x86_64 GNU/Linux  

JDK

java version "1.8.0_40"  
Java(TM) SE Runtime Environment (build 1.8.0_40-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)

Hadoop

Hadoop 2.6.0  
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
Compiled by jenkins on 2014-11-13T21:10Z
Compiled with protoc 2.5.0
From source with checksum 18e43357c8f927c0695f1e9522859d6a
This command was run using /home/hadoop/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar

1. Download Spark and Scala

I downloaded Spark 1.3.1 (the package pre-built for Hadoop 2.6) and Scala 2.11.6.

Spark download page: http://spark.apache.org/downloads.html

Scala download page: http://www.scala-lang.org/download/

2. Extract Scala and configure its environment variables

tar -zxf scala-2.11.6.tgz

Then move it to /usr/lib/scala:

mkdir /usr/lib/scala
sudo mv scala-2.11.6 /usr/lib/scala

Copy Scala to the other machines:

sudo scp -r scala-2.11.6  hadoop@compute-0-1:/home/hadoop/Downloads/  
ssh compute-0-1
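The same copy has to reach every datanode, so a small loop saves typing the scp four times. A sketch, using the hostnames from the cluster layout above, written as a dry run that only prints the commands; remove the echo to actually copy:

```shell
# Dry run: print one scp command per datanode (remove "echo" to execute).
# Hostnames are taken from the cluster layout at the top of this article.
for host in compute-0-1 compute-0-2 compute-0-3 compute-0-4; do
  echo scp -r scala-2.11.6 "hadoop@${host}:/home/hadoop/Downloads/"
done
```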

3. Install Spark

3.1 Extract Spark and move it to the target directory

tar -zxf spark-1.3.1-bin-hadoop2.6.tgz   

Copy the Spark files to /usr/local/spark.

3.2 ÅäÖû·¾³±äÁ¿

Edit /etc/profile and add:

JAVA_HOME=/home/hadoop/jdk1.8.0_40  
HADOOP_HOME=/home/hadoop/hadoop-2.6.0
SCALA_HOME=/usr/lib/scala/scala-2.11.6/
SPARK_HOME=/usr/local/spark/spark-1.3.1-bin-hadoop2.6

CLASSPATH=.:$JAVA_HOME/lib/tools.jar
PATH=${SCALA_HOME}/bin:$JAVA_HOME/bin:${SPARK_HOME}/bin:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export SPARK_HOME SCALA_HOME JAVA_HOME CLASSPATH PATH

Save and exit, then make the configuration take effect:

source /etc/profile  

3.3 Configure Spark

Enter Spark's configuration directory conf:

cp spark-env.sh.template spark-env.sh
cp slaves.template slaves

Edit the spark-env.sh file:

export JAVA_HOME=/home/hadoop/jdk1.8.0_40  
export SCALA_HOME=/usr/lib/scala/scala-2.11.6/
export SPARK_MASTER_IP=10.119.178.200
export SPARK_WORKER_MEMORY=8G
export HADOOP_CONF_DIR=/home/hadoop/hadoop-2.6.0/etc/hadoop

Edit slaves:

compute-0-1  
compute-0-2
compute-0-3
compute-0-4

4. Copy the same environment to the other machines

Then go into /usr/local and copy Spark to the other machines. Copying directly into /usr/local is not permitted, so copy into the Downloads directory first:

sudo scp -r spark hadoop@compute-0-1:/home/hadoop/Downloads/  
sudo scp -r spark /usr/local/
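Since /usr/local is not writable over scp, the copy lands in Downloads first, and each slave then needs a local move into /usr/local. A sketch of that remote-side step (again a dry run that only prints the commands; the /home/hadoop/Downloads/spark path is the destination used by the scp above):

```shell
# Dry run: print the per-slave command that moves Spark out of Downloads
# into /usr/local (remove "echo" to execute over ssh).
for host in compute-0-1 compute-0-2 compute-0-3 compute-0-4; do
  echo ssh "${host}" "sudo cp -r /home/hadoop/Downloads/spark /usr/local/"
done
```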

5. Start Spark

5.1 Enter the sbin directory under the Spark installation directory

Run the following command:

./start-all.sh  

But it reports the following error:

mkdir: cannot create directory `/usr/local/spark/spark-1.3.1-bin-hadoop2.6/sbin/../logs': Permission denied  

This is a permissions problem. Logging into the corresponding slave shows that the spark directory is owned by root. Change its ownership from within /usr/local:

sudo chown -R -v hadoop:hadoop spark  

The -v flag prints each ownership change, confirming the result.

5.2 Restart Spark, then use jps to check the processes on the master and slave nodes

On the master node, jps shows a Master process; on each slave node it shows a Worker process.

You can then open the Spark cluster's web page at compute-n:8080.

5.3 Enter Spark's bin directory and start the spark-shell console

./spark-shell  

We can view the Spark UI in the browser at compute-n:4040.

6. Run an example to verify the installation

6.1 First, note that the shell environment provides a variable named sc. sc is an instance of SparkContext, created for us automatically when the Spark shell starts. Any Spark code we write, whether it runs locally or on a cluster, must have a SparkContext instance.
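Outside the shell, that instance has to be created by hand. A minimal sketch of a standalone Scala program (the app name "Example" is illustrative, and the master URL is an assumption for this cluster; spark://host:7077 is the standalone master's default address):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object Example {
  def main(args: Array[String]): Unit = {
    // App name and master URL are assumptions for this cluster;
    // 7077 is the standalone master's default port.
    val conf = new SparkConf()
      .setAppName("Example")
      .setMaster("spark://compute-n:7077")
    val sc = new SparkContext(conf)
    // ... use sc just as in the shell session ...
    sc.stop()
  }
}
```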

Then, from the directory /usr/local/spark/spark-1.3.1-bin-hadoop2.6, copy README.md into HDFS:

hadoop fs -copyFromLocal README.md ./  

Then read the file:

val file = sc.textFile("hdfs://compute-n:8025/user/hadoop/README.md")  

The shell echoes the resulting RDD.

Next, filter out all lines of the file that contain the word "Spark":

val sparks = file.filter(line => line.contains("Spark"))

This produces a FilteredRDD.

Then count how many lines contain "Spark":

sparks.count  
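Note that sparks.count counts matching lines, not total occurrences of the word. A quick cross-check from the shell, on the local copy of README.md, is grep -c, which counts matching lines the same way:

```shell
# grep -c counts the lines that contain "Spark" -- the same quantity
# that sparks.count returns for the HDFS copy of the file.
grep -c "Spark" README.md
```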

Then open compute-n:4040 to inspect the job in the web UI.

   