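Spark's core abstraction is the RDD, and one of the RDD's key properties is immutability, which sidesteps many of the concurrency problems that arise in a distributed environment. For data analysis this abstraction works well: it handles most distributed-computing concerns, reduces the complexity of the operators, and delivers high-performance distributed data processing.
In machine learning, however, the RDD's weakness shows quickly. The core of machine learning is iteration and parameter updates. Because RDD computation stays in memory and logically never touches disk, it handles iteration well, but RDD immutability is a poor fit for parameters that must be updated over and over. This fundamental mismatch has kept Spark's MLlib developing slowly: there has been no substantial innovation since 2015, and its performance is not good either.
For this reason, Angel gave Spark priority when designing its ecosystem. Spark on Angel was already available when V1.0.0 was released. Built on Angel, it adds parameter server (PS) capability to Spark, introducing mutability into an otherwise immutable model and making Spark considerably stronger.
Using L-BFGS as an example, we analyze the problems in Spark's implementation of machine learning algorithms and show how Spark on Angel removes the bottlenecks Spark runs into in machine learning tasks, making machine learning on Spark more powerful.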
1. The L-BFGS algorithm

2. The Spark implementation of L-BFGS


3. The Spark on Angel implementation of L-BFGS
3.1 Implementation framework
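Spark on Angel uses Angel's PS-Service to introduce a PS role into Spark, which reduces the algorithm's dependence on the driver. The two-loop recursion is handed to the PS, and the driver is responsible only for task scheduling, so the algorithm depends far less on driver performance.
Angel PS consists of a group of distributed nodes. Each vector and matrix is split into multiple partitions stored on different nodes, and operations between vectors and matrices are supported.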
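To make the partitioning idea concrete, here is a minimal single-process sketch in Python. The class PartitionedVector and its methods are hypothetical names chosen for illustration; they are not Angel's API, and real PS nodes would of course live on different machines. It only shows why vector operations can run where the data is stored: each partition contributes a partial result, and only small values need to be combined.

```python
from typing import List


class PartitionedVector:
    """Toy stand-in for a vector stored across several PS nodes.

    Hypothetical class for illustration only; this is not Angel's PSVector.
    """

    def __init__(self, values: List[float], num_partitions: int):
        if 1 > num_partitions:
            raise ValueError("num_partitions must be at least 1")
        # Contiguous slices; each slice plays the role of one node's partition.
        size = max(1, -(-len(values) // num_partitions))  # ceiling division
        self.partitions = [list(values[i:i + size])
                           for i in range(0, len(values), size)]

    def dot(self, other: "PartitionedVector") -> float:
        # Assumes both vectors use the same partitioning.
        # Each "node" computes a partial dot product on its own slice;
        # only the scalar partial sums would need to be combined.
        partials = [sum(a * b for a, b in zip(p, q))
                    for p, q in zip(self.partitions, other.partitions)]
        return sum(partials)

    def axpy(self, alpha: float, x: "PartitionedVector") -> None:
        # In place: self += alpha * x, applied partition by partition.
        for p, q in zip(self.partitions, x.partitions):
            for i, v in enumerate(q):
                p[i] += alpha * v

    def to_list(self) -> List[float]:
        return [v for p in self.partitions for v in p]


x = PartitionedVector([1.0, 2.0, 3.0, 4.0, 5.0], num_partitions=2)
y = PartitionedVector([10.0, 20.0, 30.0, 40.0, 50.0], num_partitions=2)
print(x.dot(y))   # scalar assembled from per-partition partial sums
y.axpy(2.0, x)    # y is updated in place, one partition at a time
print(y.to_list())
```

In this sketch the full vectors are never gathered in one place for dot or axpy, which is the property the PS design relies on.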
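3.2 Performance analysis
Throughout the algorithm the driver only schedules tasks. The expensive two-loop recursion runs on the PS, and gradient aggregation and model synchronization take place between the executors and the PS, so every computation is distributed. For network transfer, a high-dimensional PSVector is cut into small blocks before being sent to the target nodes. This many-to-many transfer between nodes greatly speeds up gradient aggregation and model synchronization.
In this way Spark on Angel avoids both the single-point driver bottleneck in Spark and the problem of sending high-dimensional vectors over the network.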
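The effect of block-wise, many-to-many transfer can be illustrated with a rough count of how many vector elements arrive at the busiest receiver during one gradient aggregation. The sketch below is a simplified model under stated assumptions: it ignores tree aggregation, compression, and sparsity, the function names are hypothetical, and the cluster shape in the example is made up for illustration.

```python
from typing import Dict, List, Tuple


def split_ranges(dim: int, num_blocks: int) -> List[Tuple[int, int]]:
    """Split indices [0, dim) into num_blocks contiguous, near-equal ranges."""
    base, extra = divmod(dim, num_blocks)
    ranges, start = [], 0
    for i in range(num_blocks):
        end = start + base + (1 if extra > i else 0)
        ranges.append((start, end))
        start = end
    return ranges


def values_received(dim: int, num_executors: int, num_servers: int) -> Dict[str, int]:
    """Elements arriving at the busiest receiver in one aggregation round."""
    # Driver-centred model: every executor ships its full gradient to one node.
    driver = num_executors * dim
    # PS-style model: every executor ships block k only to server k.
    largest_block = max(end - start
                        for start, end in split_ranges(dim, num_servers))
    busiest_server = num_executors * largest_block
    return {"driver": driver, "busiest_server": busiest_server}


# Illustrative numbers only (hypothetical cluster shape).
print(split_ranges(10, 3))
print(values_received(dim=1_000_000, num_executors=100, num_servers=20))
```

Under these assumptions the load on any single receiver shrinks roughly in proportion to the number of PS nodes, which is the intuition behind the speedup described above.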
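4. Spark on Angel: "light, easy, powerful, fast"
Spark on Angel is a "plugin" that Angel designed to address Spark's shortcomings in training machine learning models. It makes no "invasive" changes to Spark and is an independent framework. Its characteristics can be summed up as "light", "easy", "powerful", and "fast".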
4.1 Light: a "plugin-style" framework
Spark on Angel makes no invasive changes to Spark's RDD. It is a framework that depends on both Spark and Angel, yet its logic is independent of both.
As a result, Spark users can adopt Spark on Angel easily: only three changes to the Spark submit script are needed. For details, see the Spark on Angel Quick Start document on Angel's Github.
As the submit script below shows, a Spark on Angel job is still essentially a Spark job, and it executes in the same way as one.
source ${Angel_HOME}/bin/spark-on-angel-env.sh
$SPARK_HOME/bin/spark-submit \
--master yarn-cluster \
--conf spark.ps.jars=$SONA_ANGEL_JARS \
--conf spark.ps.instances=20 \
--conf spark.ps.cores=4 \
--conf spark.ps.memory=10g \
--jars $SONA_SPARK_JARS \
....
Spark on Angel can be this lightweight because Angel wraps PS-Service so that Spark's driver and executors can exchange data with Angel PS through PsAgent and PSClient.

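4.2 Powerful: rich functionality, with breeze support
breeze is a numerical computing library for machine learning, implemented in Scala. Most of the numerical optimization algorithms in Spark MLlib are carried out by calling breeze. Both the Spark and the Spark on Angel implementations of L-BFGS call breeze.optimize.LBFGS. They differ in the vector type they pass to it: the Spark on Angel implementation passes BreezePSVector.
BreezePSVector is a Vector on Angel PS that implements the methods under breeze's NumericOps, such as the common dot, scale, axpy, and add operations. In the two-loop recursion of LBFGS[BreezePSVector], the high-dimensional vector operations are therefore operations between BreezePSVectors, and these are all carried out in a distributed fashion on Angel PS.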
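The reason a vector type only needs dot, scale, and axpy to take part in L-BFGS can be seen from the standard two-loop recursion itself. The following Python sketch uses plain lists and hypothetical helper names; it is not the breeze or Angel code, only an illustration that the recursion touches the high-dimensional vectors through those few operations and nothing else.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def axpy(alpha: float, x: Sequence[float], y: Vector) -> None:
    """In place: y += alpha * x."""
    for i, v in enumerate(x):
        y[i] += alpha * v


def scale(alpha: float, x: Vector) -> None:
    """In place: x *= alpha."""
    for i in range(len(x)):
        x[i] *= alpha


def two_loop_direction(grad: Sequence[float],
                       history: Sequence[Tuple[Vector, Vector]]) -> Vector:
    """Return the L-BFGS search direction, i.e. -H * grad.

    history holds (s, y) pairs, oldest first, where s is a step in the
    parameters and y is the matching change in the gradient.
    """
    q = list(grad)
    alphas = []
    for s, y in reversed(history):            # first loop: newest to oldest
        alpha = dot(s, q) / dot(y, s)
        axpy(-alpha, y, q)
        alphas.append(alpha)
    if history:                               # initial inverse-Hessian scaling
        s, y = history[-1]
        scale(dot(s, y) / dot(y, y), q)
    for (s, y), alpha in zip(history, reversed(alphas)):  # oldest to newest
        beta = dot(y, q) / dot(y, s)
        axpy(alpha - beta, s, q)
    scale(-1.0, q)
    return q


# 1-D quadratic with curvature 2: the Newton direction for gradient 4 is -2.
print(two_loop_direction([4.0], [([1.0], [2.0])]))  # prints [-2.0]
```

If dot, axpy, and scale are executed where the vectors are stored, as the text describes for BreezePSVector, the loop structure above stays unchanged and only scalars such as alpha and beta need to travel.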
[Listing: Spark's L-BFGS implementation]

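4.3 Easy: a simple programming interface
Another reason Spark is so popular in big data is that its programming model is simple and easy to understand, and Spark on Angel inherits this property.
A Spark on Angel job is essentially a Spark job, so the code follows the same logic as Spark code. When an operation on a PSVector is needed, you call the corresponding interface.
The L-BFGS implementations on Spark and on Spark on Angel follow the same overall approach. The main differences are the aggregation of the gradient vector and the pull/push of the model. Converting a Spark algorithm into a Spark on Angel job therefore requires only a small amount of code change.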
L-BFGS requires the user to implement a DiffFunction. Its calculate method takes the current model parameters as input, traverses the training data, and returns the loss and the gradient.
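The contract described here (parameters in, loss and gradient out) can be sketched for logistic regression in plain Python. This is an illustration under assumptions, not the Spark or Spark on Angel code: the function names are hypothetical, the data partitions are ordinary lists, and the per-partition results are summed in a local loop, whereas a real job would combine them through Spark aggregation or on the PS.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (features, label), label is 0 or 1


def log1p_exp(t: float) -> float:
    """Numerically stable log(1 + exp(t))."""
    return t + math.log1p(math.exp(-t)) if t > 0 else math.log1p(math.exp(t))


def sigmoid(t: float) -> float:
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def partition_loss_grad(w: Sequence[float],
                        examples: Sequence[Example]) -> Tuple[float, List[float]]:
    """Sum of logistic loss and its gradient over one data partition."""
    loss, grad = 0.0, [0.0] * len(w)
    for x, label in examples:
        margin = sum(wi * xi for wi, xi in zip(w, x))
        loss += log1p_exp(-margin if label == 1 else margin)
        err = sigmoid(margin) - label
        for i, xi in enumerate(x):
            grad[i] += err * xi
    return loss, grad


def calculate(w: Sequence[float],
              partitions: Sequence[Sequence[Example]]) -> Tuple[float, List[float]]:
    """Mimics the DiffFunction contract: weights in, (loss, gradient) out."""
    total_loss, total_grad = 0.0, [0.0] * len(w)
    for part in partitions:  # stands in for the distributed aggregation step
        loss, grad = partition_loss_grad(w, part)
        total_loss += loss
        for i, g in enumerate(grad):
            total_grad[i] += g
    return total_loss, total_grad


data = [[([1.0, 2.0], 1)], [([3.0, -1.0], 0)]]  # two tiny partitions
print(calculate([0.0, 0.0], data))
```

The aggregation loop in calculate is the part that differs between the two platforms according to the text; the per-example loss and gradient arithmetic would be the same either way.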
For the complete DiffFunction code, see SparseLogistic on Github.
[Listing: Spark's DiffFunction implementation]
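4.4 Fast: strong performance
We implemented LR with three optimization methods, SGD, LBFGS, and OWLQN, and ran comparison experiments on Spark and Spark on Angel. The experiment code is SparseLRWithX.scala on Github.
Dataset: a dataset from an internal Tencent business, with 230 million samples and 50 million dimensions.
Experiment setup:
Note 1: The resource configurations of the three comparison experiments are as follows. We tried to make sure every job ran with sufficient resources, so the configured resources are somewhat more than actually needed.
Note 2: Spark jobs need a larger spark.driver.maxResultSize, whereas Spark on Angel does not need this parameter to be configured.

As the data above shows, Spark on Angel trains LR models more than 50% faster than Spark, and the more complex the model, the larger the speedup.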
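5. Conclusion
Spark on Angel offers an efficient, low-cost way to overcome the bottlenecks Spark faces in machine learning. We will continue to optimize Spark on Angel and improve its performance, and we welcome everyone to join us on Github in improving it.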