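Overview: This article introduces Baidu's Spark-based heterogeneous distributed deep learning system. Combining Spark with the deep learning platform PADDLE solves the data-path problem between PADDLE and business logic. On top of that, heterogeneous computing with GPUs and FPGAs raises the data processing capacity of each machine, and YARN allocates the heterogeneous resources with support for Multi-Tenancy, so that resources are used more efficiently.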
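Deep neural network technology has made great breakthroughs in recent years, with a qualitative leap in speech and image recognition in particular, and it has been shown to be applicable to many businesses. How to run deep learning programs in a large-scale distributed fashion, so that they better support different business lines, has become a pressing question. Over the past two years, Baidu's deep learning lab, led by Xu Wei, developed the distributed deep learning platform PADDLE (Parallel Asynchronous Distributed Deep Learning), which met many business needs well. However, because PADDLE is a standalone deep learning platform, it does not combine well with other business logic, and the data path between PADDLE and that logic became a performance bottleneck. To let more businesses use deep learning, we developed the Spark on PADDLE platform, which turns PADDLE into a functional module of Baidu's Spark ecosystem. After finishing the first version, we found that CPU computing power could no longer keep up with Baidu's enormous data volumes, so we added heterogeneous computing support on top of Spark on PADDLE, making full use of resources such as GPUs and FPGAs to accelerate jobs on PADDLE.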
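Design of the PADDLE deep learning system
PADDLE is a mature distributed deep learning platform, widely used at Baidu in image recognition, natural language understanding, speech, autonomous vehicles, and other areas. Its main characteristics are highly optimized training algorithms, support for multi-GPU/CPU training, high training efficiency, and dedicated optimizations for sparse features.
Existing deep learning platforms generally train on a single machine; the open-source Caffe platform, for example, also trains on a single machine with multiple cards. Once the data or the model grows large, however, distributed training becomes necessary to keep training efficient. There are two main approaches: data parallelism and model parallelism.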
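Data parallelism is the most widely used form of parallelism in distributed deep learning. Because the training data is very large, the data is split up and the model is distributed to N machines for training. Since a single model is ultimately being trained, while each machine only sees part of the data, the synchronization and convergence of training must be guaranteed. The classic approach is the parameter server described in "Parameter Server for Distributed Machine Learning". The idea is to synchronize parameter updates through a parameter service, with each parameter server responsible for synchronizing only part of the shared parameters. For example, suppose a model M is distributed across N machines for training, and each machine receives a portion of the data.

Let W be the set of parameters being trained. Each machine first trains locally, and all machines start from the same initial parameters.

Each machine can then compute the gradient of its cost function. Following the usual backpropagation procedure for a single-machine neural network, every layer uses its gradient to obtain a correction to its parameters, which yields the updated parameters.

Because several machines are involved, each node computes a different correction, so there is one extra step: every node pushes its parameter correction to the parameter server, which decides on a unified correction for the next training iteration. In this way the models trained on all the machines are kept consistent.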

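Figure 1. Data parallelism
Figure 1 shows the deployment architecture for data-parallel deep learning. Training generally proceeds in the following steps:
1. Preprocess the training data and split it into data shards;
2. Every machine receives the same model definition and initializes its parameters identically;
3. In each training iteration, every machine computes its own gradient and pushes its gradient correction to the parameter server; the parameter server aggregates the corrections and pushes the parameters for the next iteration back to the local training machines;
4. Repeat until the model converges.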
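The four steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not PADDLE's implementation: the model is a single-parameter fit of y = w * x, the "parameter server" simply averages the workers' gradients, and every name (`make_shards`, `local_gradient`, `train_sync`) is hypothetical.

```python
# Toy synchronous data-parallel training with one parameter server.
# Assumptions: model y = w * x, squared-error loss, server averages gradients.

def make_shards(data, n_workers):
    """Step 1: cut the training data into n_workers data shards."""
    return [data[i::n_workers] for i in range(n_workers)]

def local_gradient(w, shard):
    """Step 3a: a worker computes the mean gradient of (w*x - y)^2 on its shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def train_sync(data, n_workers=4, lr=0.001, max_iters=500, tol=1e-12):
    shards = make_shards(data, n_workers)
    w = 0.0  # Step 2: every worker starts from the same initial parameter
    for _ in range(max_iters):
        # Step 3b: each worker pushes its gradient to the parameter server
        grads = [local_gradient(w, s) for s in shards]
        # Step 3c: the server averages them and publishes the next parameter
        new_w = w - lr * sum(grads) / len(grads)
        # Step 4: stop once the parameter no longer changes noticeably
        if abs(new_w - w) < tol:
            return new_w
        w = new_w
    return w

data = [(float(x), 3.0 * x) for x in range(1, 21)]  # generated with w = 3
w_final = train_sync(data)
```

Because all workers apply the same averaged update, they hold identical parameters after every iteration, which is the property the parameter server is there to guarantee.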
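Parameter server update algorithms are further divided into synchronous and asynchronous variants. With strict synchronization, each local trainer synchronizes its parameters in every training iteration, so a single slow node slows down the whole training job. The idea behind asynchronous updates is to synchronize less often, letting each local trainer run several iterations before synchronizing. This has benefits and costs: slow nodes have less impact on training, but training cycles may be wasted, because the correction that results from synchronization can differ substantially from the corrections a local trainer made on its own. Choosing the synchronization frequency and ensuring convergence under asynchronous updates are both active research topics.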
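The effect of synchronizing less often can be illustrated with a small sketch. It is a simplification under assumptions: workers still meet at a barrier every `sync_every` local steps and the server averages their accumulated corrections, whereas a truly asynchronous scheme has no barrier at all. The function names and the one-parameter model are hypothetical.

```python
# Toy "synchronize every k local steps" training for the model y = w * x.

def local_steps(w, shard, lr, steps):
    """Run `steps` local gradient steps on one shard without contacting the server."""
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)
        w -= lr * grad
    return w

def train_periodic_sync(shards, sync_every, rounds, lr=0.001):
    w_server = 0.0
    for _ in range(rounds):
        # Each worker starts from the server copy and accumulates a local correction
        deltas = [local_steps(w_server, s, lr, sync_every) - w_server for s in shards]
        # The server merges the corrections; here it simply averages them
        w_server += sum(deltas) / len(deltas)
    return w_server

data = [(float(x), 3.0 * x) for x in range(1, 21)]  # generated with w = 3
shards = [data[i::4] for i in range(4)]
w_every_step = train_periodic_sync(shards, sync_every=1, rounds=200)
w_every_5 = train_periodic_sync(shards, sync_every=5, rounds=40)
```

Both runs perform 200 local steps per worker, but the second communicates five times less often. The data here is noise-free, so all shards agree on the optimum; on real data the local corrections drift apart as `sync_every` grows, which is the cost described above.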
Ä£ÐͲ¢Ðз½·¨Èçͼ2Ëùʾ£¬Õë¶Ô²ÎÊý¹æÄ£´ïµ½µ¥»úÎÞ·¨ÔØÈëµÄÁ¿¼¶»òÕßÄ£ÐÍ¼ä´æÔÚºÜÉÙÁ¬½ÓµÄÇø¿éµÄ³¡¾°£¬¿ÉÒÔ¿¼ÂÇ×öÄ£ÐͲ¢ÐУ¬µ«ÊÇÄ£ÐͲ¢ÐÐͨÐÅ¿ªÏúºÍͬ²½ÏûºÄ³¬¹ýÊý¾Ý²¢ÐУ¬Ð§ÂÊ¿ÉÄÜûÓÐÊý¾Ý²¢Ðиߡ£
ͼ2 Ä£ÐͲ¢ÐÐ
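As a minimal sketch of the idea, a single fully connected layer can be split by output rows so that each device holds only part of the weight matrix. The splitting scheme and all names here are assumptions chosen for illustration; the text does not say how PADDLE partitions a model.

```python
# Toy model parallelism: rows of a weight matrix are divided among devices.

def matvec(rows, x):
    """Multiply a list of weight rows by the input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in rows]

def split_rows(weights, n_devices):
    """Assign contiguous blocks of output rows to each device."""
    size = (len(weights) + n_devices - 1) // n_devices
    return [weights[i:i + size] for i in range(0, len(weights), size)]

def model_parallel_forward(weights, x, n_devices):
    blocks = split_rows(weights, n_devices)
    # Every device needs the full input x (communication) but only its own rows
    partial_outputs = [matvec(block, x) for block in blocks]
    # Gathering the slices back together is the synchronization point
    return [v for part in partial_outputs for v in part]

weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
x = [1.0, -1.0]
y_parallel = model_parallel_forward(weights, x, n_devices=2)
y_single = matvec(weights, x)
```

The result matches the single-device computation, but each layer now requires broadcasting the input and gathering the outputs, which is where the extra communication cost comes from.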
PADDLEµÄÉè¼ÆÖ÷Òª²ÉÓÃÁ˵¥»ú×öµ½Ä£ÐͲ¢ÐС¢¶à»ú×öµ½Êý¾Ý²¢Ðеķ½Ê½£¬´Ó¶ø´ïµ½ÒÚ¼¶Ä£Ð͹æÄ£ÒÔÉÏ£¬´ó¹æÄ£Êý¾ÝÁ¿µÄ·Ö²¼Ê½ÑµÁ·¡£
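Pain points in combining PADDLE with business logic
PADDLE is a standalone deep learning platform and does not handle data coming in from other platforms well. Developers typically have to wait for the previous stage to finish producing PADDLE's input data, store that data in HDFS, read it into the local memory and disks of the PADDLE cluster, and only then train the model with PADDLE. Once the model is trained, it is written back to HDFS for the next piece of business logic to read. This process is slow enough to become the bottleneck of the whole computation pipeline, and it is repetitive, tedious work. It has held back adoption of the PADDLE platform and kept many teams that need deep learning from using it.
To solve this problem, we designed the Spark on PADDLE architecture, which couples Spark with PADDLE and makes PADDLE a module of Spark. As Figure 3 shows, model training can be integrated with upstream functions: feature extraction, for example, passes its data on as RDDs, with no need to route the data through HDFS. As a result, the data path between PADDLE and business logic is no longer a performance bottleneck.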

Figure 3. General business logic based on Baidu Spark
Spark on PADDLE architecture, version 1.0
SparkÊǽü¼¸Äê¿ìËÙÐËÆðµÄ´óÊý¾Ý´¦ÀíÆ½Ì¨£¬²»½ö½öÔÚÓÚËüµÄ¼ÆËãÄ£Ðͱȴ«Í³µÄHadoop
MapReduceÒª¸ßЧºÜ¶à£¬Í¬Ê±ÔÚÓÚËüËù´øÀ´µÄÉú̬ϵͳ·Ç³£Ç¿´ó¡£»ùÓÚSpark¼ÆËãÒýÇæ¹¹½¨µÄÉϲãÓ¦ÓÃÈçSpark
SQL¡¢Spark Streaming¡¢Spark MLlibµÈ£¬¶¼ÊǺÜÓÅÐãµÄÓ¦Ó㬱ȴ«Í³Ó¦ÓÃÐÔÄܺü¸±¶£¬²¢ÇÒ¸ü¼ÓÎȶ¨¡£Í¬Ê±ÓëYarn/MesosµÄ½áºÏÈÃSpark¶Ô¼ÆËã×ÊÔ´µÄ¹ÜÀíºÍ·ÖÅä¸ü¼ÓÁé»î¡£
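Spark is already widely used inside Baidu, mainly for data processing and data analysis. A traditional data processing platform, however, inevitably needs a mechanism for training models from its data. CTR prediction in the advertising system is one example: Spark can process and clean the large volume of click and browsing logs that users generate, but Spark MLlib offers only limited support for training large-scale models, and for deep learning in particular. This is why we needed to support PADDLE on Spark.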
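In Spark, the user's application is called the driver node (Driver). It can be regarded as the master node that schedules the user's distributed program and controls its flow. The actual computation of a Spark program is distributed to Executors running on Worker Nodes. Another very important Spark concept is the RDD, a distributed, partitioned data abstraction. All of Spark's input and output data is organized around RDDs. An RDD not only describes the dependencies between datasets but also divides the data logically, and operations on an RDD are generally executed in parallel, partition by partition.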
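The two properties mentioned above, dependency tracking and logical partitioning, can be illustrated with a toy class. To be clear, `ToyRDD` is a hypothetical stand-in written for this example and is not Spark's API; it only mimics the shape of the idea.

```python
# A toy stand-in for an RDD: partitioned data plus a record of its parent.

class ToyRDD:
    def __init__(self, partitions, parent=None, fn=None):
        self._partitions = partitions  # only set for source datasets
        self.parent = parent           # dependency on the dataset it derives from
        self.fn = fn                   # per-element transformation, applied lazily

    @classmethod
    def parallelize(cls, items, n_partitions):
        """Split a local list into n_partitions logical partitions."""
        return cls([items[i::n_partitions] for i in range(n_partitions)])

    def map(self, fn):
        """Record a transformation; nothing is computed yet."""
        return ToyRDD(None, parent=self, fn=fn)

    def num_partitions(self):
        if self.parent is None:
            return len(self._partitions)
        return self.parent.num_partitions()

    def compute(self, index):
        """Compute one partition; in Spark, each such task would run in an Executor."""
        if self.parent is None:
            return list(self._partitions[index])
        return [self.fn(v) for v in self.parent.compute(index)]

    def collect(self):
        """Gather all partitions back at the driver."""
        return [v for i in range(self.num_partitions()) for v in self.compute(i)]

rdd = ToyRDD.parallelize(list(range(6)), n_partitions=3).map(lambda v: v * v)
```

Each call to `compute(i)` touches only partition `i` and its ancestors, which is what makes per-partition parallel execution possible.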

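Figure 4. Spark DNN training runtime architecture
The Spark DNN training runtime architecture is shown in Figure 4. Training generally consists of the following five steps:
1. DNN data preprocessing and training feature preparation
This is one of Spark's strengths. Whether the data is streaming or already on disk, it is processed with Spark, including data cleaning and feature preparation, and the resulting training data is output as an RDD.
2. Resource request
When a Spark training job is submitted, it first obtains node resources for the DNN training job from Yarn; for example, a training job may need 4 nodes, each a machine with 4 GPUs. Yarn manages resources as Containers, and to Yarn both CPUs and GPUs are simply virtual resources. This is described in more detail later in the article.
3. Training initialization
Based on the resources allocated by Yarn, the Driver distributes the model configuration and the model training resource library, starts the trainers and parameter servers, and initializes the model's starting parameters.
4. Model training
The training data is fed to the trainer interface as an RDD and training proceeds in a data-parallel fashion. The running trainers communicate with the parameter servers to exchange gradients and synchronize parameters. Training ends when the maximum number of iterations is reached or the model converges.
5. Model prediction
The model can be shipped to a server cluster, or loaded through Spark Streaming, and used for prediction.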
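While developing Spark on PADDLE 1.0, we verified that Spark really can combine ETL, training data preprocessing, and deep learning training. We also found that there is a great deal of demand for deep learning inside Baidu, so on top of 1.0 we needed to turn Spark on PADDLE into a platform, with Multi-Tenancy resource management, training monitoring, training fault tolerance, and so on.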
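Spark on PADDLE architecture, version 2.0
Becoming a platform is the main goal of Spark on PADDLE 2.0. It introduces more functionality, chiefly a monitoring mechanism and a fault-tolerance mechanism for the training process, plus an ML decision module for hyperparameter selection. The design of Spark on PADDLE 2.0 is analyzed below.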
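As Figures 5 and 6 show, a client can communicate directly with the Spark DNN Driver to start DNN training. The Spark DNN Driver launches a training instance (Training Instance) and passes through the training data, the training network configuration, and other information. A training instance contains all the services needed for training: a group of trainers and their corresponding parameter servers. A training master (Training Master) manages the overall training process, including the life cycle of the trainers and parameter servers and their restart on failure. The parameter servers and trainers periodically send heartbeats to the Training Master so that it can confirm they are running normally.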

Figure 5. Overall architecture of Spark on PADDLE 2.0
Figure 6. Training Instance architecture of Spark on PADDLE 2.0
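The heartbeat bookkeeping described above could look roughly like the following sketch. The class, the timeout policy, and the process names are all assumptions; timestamps are passed in explicitly so the example is deterministic, and "restart" is reduced to incrementing a counter.

```python
# Hypothetical sketch of a Training Master that tracks heartbeats.

class TrainingMaster:
    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}   # process id -> time of last heartbeat
        self.restarts = {}    # process id -> number of restarts so far

    def register(self, pid, now):
        self.last_seen[pid] = now
        self.restarts[pid] = 0

    def heartbeat(self, pid, now):
        self.last_seen[pid] = now

    def check(self, now):
        """Restart every process whose last heartbeat is older than the timeout."""
        restarted = []
        for pid, seen in self.last_seen.items():
            if now - seen > self.timeout_s:
                self.restarts[pid] += 1
                self.last_seen[pid] = now  # a restarted process counts as fresh
                restarted.append(pid)
        return restarted

master = TrainingMaster(timeout_s=10.0)
for pid in ("trainer-0", "trainer-1", "pserver-0"):
    master.register(pid, now=0.0)
master.heartbeat("trainer-0", now=8.0)
master.heartbeat("pserver-0", now=9.0)
restarted = master.check(now=12.0)  # trainer-1 has been silent for 12 seconds
```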
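Monitoring during training
Once training starts, users want to monitor a number of things: the loss value of each iteration, the error rate, the time taken, and the logs of the trainers and parameter servers. In our implementation, the Workers report training data to the Driver through message passing (AKKA). For the performance data of the Spark Job as a whole, we rely on the monitoring that Spark itself provides. All of this information is shown on the monitoring page (Web UI).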
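The reporting pattern can be sketched with Python's standard library standing in for AKKA: a thread-safe queue plays the role of the Driver's mailbox, and threads play the role of Workers. The message fields and the placeholder loss values are assumptions made for illustration.

```python
import queue
import threading

def worker(worker_id, n_iters, mailbox):
    """Worker side: send one metrics message per iteration to the driver."""
    for it in range(n_iters):
        loss = 1.0 / (it + 1)  # placeholder value standing in for a real loss
        mailbox.put({"worker": worker_id, "iter": it, "loss": loss})
    mailbox.put({"worker": worker_id, "done": True})

def driver(n_workers, n_iters):
    """Driver side: collect messages until every worker has reported completion."""
    mailbox = queue.Queue()
    threads = [threading.Thread(target=worker, args=(w, n_iters, mailbox))
               for w in range(n_workers)]
    for t in threads:
        t.start()
    history, finished = {}, 0
    while finished < n_workers:
        msg = mailbox.get()
        if msg.get("done"):
            finished += 1
        else:
            history.setdefault(msg["worker"], []).append(msg["loss"])
    for t in threads:
        t.join()
    return history

history = driver(n_workers=3, n_iters=4)
```

The Driver ends up with a per-worker loss curve that a monitoring page could plot; messages from different workers may interleave, but each worker's own messages arrive in order.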
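Fault tolerance during training
During DNN training, both the trainers and the parameter servers can fail. The simplest form of fault tolerance is to back up the model parameters and training information periodically; when training fails, it is restarted from the last backup point. The Training Master collects this information and reports it to the Spark DNN Driver. For the parameter servers, redundancy can be added: if a parameter server goes down, the Training Master restarts the corresponding service, and in the meantime a backup parameter server takes over the parameter updates of the failed one.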
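The periodic-backup strategy can be shown with a small simulation. This is a sketch under assumptions: the training state is an in-memory dictionary, a "failure" is injected at chosen iterations, and the checkpoint is a deep copy rather than a file written to durable storage.

```python
import copy

def train_with_checkpoints(n_iters, checkpoint_every, fail_at=()):
    """Run n_iters steps; on a simulated failure, resume from the last checkpoint."""
    state = {"iter": 0, "w": 0.0}
    checkpoint = copy.deepcopy(state)
    pending_failures = set(fail_at)
    steps_executed = 0
    while state["iter"] < n_iters:
        if state["iter"] in pending_failures:
            pending_failures.discard(state["iter"])  # each failure happens once
            state = copy.deepcopy(checkpoint)        # roll back to the backup
            continue
        state["w"] += 1.0          # stand-in for one real training step
        state["iter"] += 1
        steps_executed += 1
        if state["iter"] % checkpoint_every == 0:
            checkpoint = copy.deepcopy(state)        # periodic backup
    return state, steps_executed

final_state, steps = train_with_checkpoints(n_iters=10, checkpoint_every=4, fail_at=(7,))
```

With a checkpoint every 4 iterations and a failure at iteration 7, training rolls back to iteration 4 and repeats three steps, so 13 steps are executed to complete 10. A shorter checkpoint interval wastes less work on failure but spends more time writing backups.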
Hyperparameter selection

Figure 7. Hyperparameter selection training
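Hyperparameters are the foundation on which model training is set up. Spark introduced a hyperparameter selection module in MLlib; the main idea is to train models in parallel under some hyperparameter selection algorithm, and the hyperparameters finally chosen are then used for the final model training. Hyperparameter selection matters a great deal for deep learning: the network topology, the parameter decay rate, and the choice of activation function are all hyperparameters that affect deep learning. Figure 7 shows the rough flow of hyperparameter selection: settings ranging from the model's feature selection to the regularization parameter are combined to train a model, and an evaluation module then picks the final hyperparameters. In the Spark setting, the DNN Driver communicates with the evaluator over RPC to decide which hyperparameters to try next. The evaluator logic lives in the MLApplication service that the Spark DNN Driver depends on. If the user wants hyperparameter selection for a DNN training model, the Spark DNN Driver launches multiple training instances for the different parameter combinations and then decides, based on the training results, whether further search is needed.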
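One simple selection algorithm is an exhaustive grid search with the candidates trained in parallel, sketched below. The text does not say which search algorithm the platform uses, so the grid search, the synthetic scoring function, and the hyperparameter names are all assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def train_instance(params):
    """Stand-in for one training instance; returns a validation error (lower is better)."""
    lr, decay = params["learning_rate"], params["decay"]
    # Synthetic score with its minimum at learning_rate=0.1, decay=0.01
    return (lr - 0.1) ** 2 + (decay - 0.01) ** 2

def select_hyperparameters(grid, max_parallel=4):
    names = sorted(grid)
    candidates = [dict(zip(names, values))
                  for values in product(*(grid[n] for n in names))]
    # The driver launches one training instance per combination, in parallel
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        errors = list(pool.map(train_instance, candidates))
    # The evaluation step keeps the combination with the lowest error
    best_error, best_params = min(zip(errors, candidates), key=lambda pair: pair[0])
    return best_params, best_error

grid = {"learning_rate": [0.01, 0.1, 1.0], "decay": [0.0, 0.01, 0.1]}
best_params, best_error = select_hyperparameters(grid)
```

A driver that wants "further search" could call `select_hyperparameters` again with a finer grid centred on `best_params`.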
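Architecture of the Spark heterogeneous distributed computing platform
As described above, Spark on PADDLE lets traditional deep learning run on much larger distributed systems. Baidu, however, faces a very practical problem: enormous amounts of data. Inside Baidu, the volume of data processed every day far exceeds what traditional platforms can handle, and it involves huge numbers of model parameters, features, and training samples. This places higher demands on both the performance and the scalability of the distributed system. On one hand, we want to offer a deep learning cluster comparable in scale to a traditional MapReduce cluster, able to run a large number of deep learning jobs in parallel. On the other hand, a deep learning model cannot be split into ever smaller units without limit, so the model processing capacity of each node is also critical.
Today's CPU-based compute nodes are limited by their own computing power and fall far short of what is needed, so we need more powerful heterogeneous computing to accelerate the current platform. Our project currently involves two kinds of computing resources: GPUs and FPGAs. GPUs provide strong computing power and suit compute-dense workloads. FPGAs are low-power and highly customizable, and are well suited to accelerating many specific dynamic tasks (the FPGA hardware acceleration used in this project was provided by the computing team at Baidu's US R&D center).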
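Building on Spark on PADDLE, our project explores how to integrate heterogeneous resources effectively into the existing large-scale distributed system, with high application performance and ease of use as the goals. Beyond meeting these requirements, the system needs to manage GPU/FPGA resources dynamically and schedule them seamlessly, just as it schedules resources such as CPU and memory. This is achieved by integrating resource scheduling into the open-source Yarn system, while resource isolation is based on the Container technology that is popular in the industry.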
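The idea of scheduling GPUs and FPGAs "just like CPU and memory" can be illustrated with a toy first-fit scheduler in which every resource is just a named quantity. This is not Yarn's API or its scheduling algorithm; the `Node` class, the resource names, and the first-fit policy are assumptions for the sketch.

```python
# Toy scheduler: every resource type (cpu, memory_gb, gpu, fpga) is handled uniformly.

class Node:
    def __init__(self, name, **capacity):
        self.name = name
        self.free = dict(capacity)  # e.g. {"cpu": 16, "memory_gb": 64, "gpu": 4}

    def fits(self, request):
        return all(self.free.get(res, 0) >= amount for res, amount in request.items())

    def allocate(self, request):
        for res, amount in request.items():
            self.free[res] -= amount

def schedule(nodes, request, count):
    """Place up to `count` containers on the first nodes that can hold them."""
    placements = []
    for _ in range(count):
        node = next((n for n in nodes if n.fits(request)), None)
        if node is None:
            break  # the cluster cannot satisfy the remaining containers
        node.allocate(request)
        placements.append(node.name)
    return placements

nodes = [
    Node("cpu-node", cpu=32, memory_gb=128),
    Node("gpu-node", cpu=16, memory_gb=64, gpu=4),
    Node("fpga-node", cpu=16, memory_gb=64, fpga=2),
]
gpu_containers = schedule(nodes, {"cpu": 4, "memory_gb": 16, "gpu": 2}, count=3)
fpga_containers = schedule(nodes, {"cpu": 2, "memory_gb": 8, "fpga": 1}, count=1)
```

Only two of the three requested GPU containers can be placed, because the cluster has four GPUs in total; the request format itself is identical whether it asks for CPUs, GPUs, or FPGAs.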
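We also need to provide a simple, easy-to-use programming interface so that existing applications can be migrated to our system quickly. Because all data in Spark is based on RDDs, we created a new class of RDD through which a program can directly use the underlying GPU/FPGA to accelerate the corresponding computation. Actually carrying out a program's work on a GPU/FPGA also requires Kernels, and here we adopted OpenCL, the most popular standard interface in the industry, so that programs can be ported to different GPUs/FPGAs. Implementing a particular function therefore takes three parts: a Scala Driver, a C++ Worker, and an OpenCL Kernel (on GPU/FPGA). If a commonly used function is already integrated into MLlib, the user only needs to write their own Scala Driver and call the functions the library already supports through the new RDD, and they get GPU/FPGA acceleration seamlessly.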
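Figure 8. Architecture of the Spark heterogeneous computing platform
The heterogeneous system architecture is shown in Figure 8. The system runs as follows:
1. First, the user application (Scala Driver) is started by the App Master;
2. The user application then requests the resources it needs from Yarn, where GPU and FPGA are separate resource categories requested in exactly the same way as CPU resources;
3. Once the user application has obtained all its resources, the App Master starts a Container on the corresponding App Slave to run a Scala Worker of the user program;
4. At this point, depending on what the Scala Worker needs, if the new RDD is used, the corresponding C++ OpenCL program is invoked; if the function is built into MLlib, this part is completely transparent to the user;
5. After the OpenCL program starts, it transfers the allocated data to the GPU or FPGA and then dynamically launches the specific OpenCL Kernel on the GPU or FPGA to process the transferred data;
6. When the OpenCL Kernel finishes, the data is automatically pulled back to main memory, and the OpenCL program can return the result to the Scala Worker;
7. Finally, all Scala Workers submit their results to the user program's Scala Driver running on the App Master.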
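As can be seen, the only changes to the overall flow are support for the new GPU/FPGA computing resources and the requirement that users adopt the new RDD. In all other respects, user programs need no additional modification.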
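Steps 4 to 7 of the flow can be mimicked in plain Python to show where data moves. Everything here is a simulation under assumptions: `FakeDevice` stands in for a GPU or FPGA, `square_kernel` for an OpenCL Kernel, and `accelerated_map` for the new RDD's offloading path; none of these names come from the actual system, and no real OpenCL call is made.

```python
# Simulated offload path: copy a partition to a device, run a kernel, copy back.

BYTES_PER_VALUE = 8  # assumption: values are transferred as 64-bit floats

class FakeDevice:
    """Stand-in for a GPU or FPGA; it only tracks what was copied to and from it."""

    def __init__(self, name):
        self.name = name
        self.bytes_in = 0
        self.bytes_out = 0

    def run(self, kernel, host_data):
        device_buffer = list(host_data)              # step 5: transfer data to the device
        self.bytes_in += BYTES_PER_VALUE * len(device_buffer)
        result = kernel(device_buffer)               # step 5: launch the kernel on that data
        self.bytes_out += BYTES_PER_VALUE * len(result)
        return list(result)                          # step 6: result is back in host memory

def square_kernel(buffer):
    """Plays the role of an OpenCL kernel: one independent operation per element."""
    return [v * v for v in buffer]

def accelerated_map(partitions, kernel, devices):
    """Each worker handles one partition and offloads it to its assigned device."""
    results = []
    for index, part in enumerate(partitions):
        device = devices[index % len(devices)]
        results.append(device.run(kernel, part))     # steps 4 to 6 for one worker
    return [v for part in results for v in part]     # step 7: the driver gathers results

devices = [FakeDevice("gpu0"), FakeDevice("fpga0")]
partitions = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
output = accelerated_map(partitions, square_kernel, devices)
```

The byte counters make the transfer cost visible: offloading pays off only when the kernel's work outweighs the cost of copying each partition to the device and back.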
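Performance evaluation of the Spark heterogeneous platform
After the heterogeneous platform was built, we first compared CPU and GPU performance on the low-level matrix computation library used for machine learning. The results show that, for the same computation, the GPU delivers very good acceleration, with a speedup of roughly 30x over the CPU. At the same time, the computing team at Baidu's US R&D center accelerated the Kmeans algorithm with FPGAs and achieved a 15x to 20x speedup, with the FPGA consuming only 20% of the energy of the CPU. In a second experiment, we compared the GPU and CPU speedup of Spark on PADDLE when training on ImageNet and found that the GPU gives a 30x speedup. In other words, with the heterogeneous platform we need only about 3% of the machine resources to complete the same computation.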
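The 3% figure follows directly from the speedup, assuming that the number of machines needed scales inversely with per-machine throughput (an idealization that ignores communication overhead). With S denoting the measured speedup, a symbol introduced here for illustration:

```latex
\[
\frac{N_{\text{GPU}}}{N_{\text{CPU}}} \approx \frac{1}{S} = \frac{1}{30} \approx 3.3\%
\]
```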
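With a good understanding of the heterogeneous platform's speedup, we also studied its scalability. The test results are shown in Figure 9: as GPU resources are added, computation time falls roughly linearly. This indicates strong scalability and an ability to handle very large volumes of data and computation.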

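Figure 9. Performance data for the Spark heterogeneous computing platform
Summary
This article introduced Baidu's Spark-based heterogeneous distributed deep learning system. Combining Spark with the deep learning platform PADDLE solved the data-path problem between PADDLE and business logic, making it easy for business teams to use deep learning. On this foundation, our heterogeneous platform of GPUs and FPGAs greatly increased the data processing capacity of each machine. On the heterogeneous platform, we use YARN to allocate heterogeneous resources with support for Multi-Tenancy, so that resources are used more efficiently. As a next step, we plan to roll the platform out to different Baidu business platforms, such as speech, the Baidu assistant (百度秘书), Baidu image search, and Baidu autonomous vehicles, so that it is hardened across different businesses. Once the platform is more mature, we plan to open-source Spark on PADDLE and the heterogeneous computing platform to give back to the community.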