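JD.com recently released the Dengyue (登月) machine learning platform and launched it on JD Cloud, officially offering artificial intelligence services to external users. The launch marks the full opening of JD's AI technology to the outside world, from application-level services down to the underlying algorithms, and puts JD's RaaS (Retail as a Service) strategy into practice. Today we have invited engineers from the AI and Big Data Department to give an in-depth look at the infrastructure of the Dengyue platform.
Starting in September 2016, JD's AI Infrastructure Platform Department built the underlying architecture of the machine learning platform on Kubernetes and Docker, and then progressively improved and optimized networking, GPU management, storage, logging, monitoring, and access control. The cluster currently manages more than 5,000 container instances. To date, more than 20 AI forward (inference) services exposing over 50 APIs have gone live on it, and it also supports backward (training) workloads. It ran efficiently and stably during the 618 sales promotion.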
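Architecture
The Dengyue platform's infrastructure is centered on Docker and Kubernetes. The underlying infrastructure layer comprises CPU, GPU, and FPGA compute resources, IB and OPA high-speed interconnects, and a variety of file systems. Above it sit the machine learning frameworks and algorithm libraries, and the top layer is the business applications. The management center covers access control, task management, workflow management, a monitoring center, and a logging center.
The overall design principle is that Kubernetes schedules everything. The platform should have the following properties (for convenience, we call all inference-type applications Apps and all training-type applications Jobs):
- High availability and load balancing. Large numbers of inference Apps run in containers, and they must serve external requests stably and efficiently.
- Application packaging and isolation. Researchers and developers package their code into images, which makes CI/CD convenient and lets them run their Apps on the platform transparently.
- Automatic scale-out and scale-in, with training and inference scheduled on the same pool of machines. During the day, when many users are active, the platform should scale out more inference Apps; at night, it should allocate more resources to training Jobs.
- Serving as a big data scheduling platform. The platform should not only schedule machine learning and deep learning toolkits such as Tensorflow, Caffe, XGBoost, and MXNet natively, but also schedule the Hadoop/Spark big data ecosystem on Kubernetes.
- Support for a rich set of hardware resource types. Different App and Job types should use different hardware to improve speedup, so the platform needs to support not only CPUs and GPUs but also specialized high-speed resources such as FPGA, InfiniBand, and OPA.
- Maximizing utilization of the whole cluster. The platform no longer distinguishes between inference Apps and training Jobs; all compute resources belong to one large unified resource pool.
- A data isolation architecture that keeps data secure. Taking advantage of the network, data is separated from compute, which allows a higher level of data access control.
- Multi-tenant security. The platform is connected to the public cloud and must support a multi-tenancy architecture: different users share a pool of compute resources but are isolated from one another at the network, file system, and Linux kernel levels.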

Figure: Dengyue platform architecture
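Networking
Kubernetes does not ship a networking component of its own; networking is provided by third-party plugins. Early on we evaluated three container networks, Flannel, Weave, and Calico, and ran comparative performance tests. Flannel and Weave are both overlay networks that rely on tunneling, so every packet is encapsulated and decapsulated in transit, which costs considerable performance. Calico is implemented with BGP routing, with no encapsulation and no NAT, so its performance is comparable to that of the physical host network.
In addition, Calico is a pure layer-3 data center solution. Layer-2 communication between hosts uses the physical machines' MAC addresses, which avoids ARP storms. Besides routing mode, Calico also supports an IPIP tunnel mode. Using BGP mode requires that the network devices in the data center have BGP enabled.
An important problem to solve on a public cloud is multi-tenant network isolation, which we implement with Kubernetes' own NetworkPolicy together with Calico's network policies. Each user is assigned a separate Namespace; Pods within a Namespace can communicate with each other, but Pods in different Namespaces cannot. In the Kubernetes version we run, NetworkPolicy only supports restricting inbound traffic (ingress), whereas Calico's network policies extend this with restrictions on outbound traffic (egress) and with finer-grained rules on protocol, port number, ICMP, source CIDR, destination CIDR, and so on.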
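As a rough illustration of the per-Namespace isolation described above, the following sketch builds a NetworkPolicy manifest that admits ingress only from Pods in the same Namespace. It assumes the current `networking.k8s.io/v1` API rather than the beta API of the Kubernetes release the platform ran at the time; the policy name `allow-same-namespace` and the Namespace `tenant-a` are made up for the example, and this is not the platform's actual configuration.

```python
import json


def namespace_isolation_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that admits ingress only from Pods in the same Namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # The policy name is a hypothetical choice for this example.
        "metadata": {"name": "allow-same-namespace", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every Pod in the Namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A "from" entry holding only a podSelector matches Pods in the
            # policy's own Namespace, so traffic from other Namespaces is not admitted.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


if __name__ == "__main__":
    # JSON is valid YAML, so the output can be fed to `kubectl apply -f -`.
    print(json.dumps(namespace_isolation_policy("tenant-a"), indent=2))
```

One such policy would be created in each tenant's Namespace; egress restrictions and the finer-grained rules mentioned above would still need Calico's own policy resources.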
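Most container networks assign IPs that are visible only inside the cluster, but many forward services on the Dengyue platform expose RPC interfaces externally, so container IPs have to be reachable from outside the cluster. After evaluation we chose Contiv, an open-source project from Cisco. Under the hood it uses OVS to connect containers across hosts, and we run it in VLAN mode. Compared with tunnel-based overlay networks, this is an underlay network: rather than being built on top of the physical host network, it sits at the same network level as the physical hosts, and its performance is close to that of the physical network.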
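Storage
Kubernetes does not provide storage itself; it integrates with third-party storage systems through storage plugins. Kubernetes supports more than twenty storage backends, and we chose Glusterfs.
Glusterfs is a file-oriented distributed storage system. Its architecture and deployment are both simple, and the community edition is already stable enough. Its hallmarks are elasticity, linear horizontal scaling, and high reliability. Architecturally, Glusterfs eliminates the dependency on a metadata service that most file systems have, and instead locates files with an elastic hashing algorithm. This optimizes data distribution, increases the parallelism of data access, and greatly improves performance and scalability.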
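The idea of locating a file by hashing instead of asking a metadata server can be sketched as follows. This is a deliberately simplified model, not Glusterfs's actual algorithm: it assumes the 32-bit hash space is split evenly across bricks and uses CRC32 as a stand-in hash function, and the brick names are hypothetical.

```python
import zlib
from bisect import bisect_right


def build_layout(bricks):
    """Split the 32-bit hash space into one contiguous range per brick.

    Returns the sorted list of range start values; range i belongs to bricks[i].
    """
    step = 2**32 // len(bricks)
    return [i * step for i in range(len(bricks))]


def locate(filename, bricks, layout):
    """Return the brick whose hash range contains the file name's hash."""
    # CRC32 is only a stand-in; it yields an integer in [0, 2**32).
    h = zlib.crc32(filename.encode("utf-8"))
    # Find the last range whose start is <= h.
    return bricks[bisect_right(layout, h) - 1]


if __name__ == "__main__":
    bricks = ["brick-a", "brick-b", "brick-c", "brick-d"]  # hypothetical names
    layout = build_layout(bricks)
    for name in ["img_0001.jpg", "img_0002.jpg", "model.ckpt"]:
        print(name, "->", locate(name, bricks, layout))
```

Because any client can compute the location from the file name and the layout alone, lookups need no central metadata service and can proceed in parallel.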
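Kubernetes Volumes can be provisioned statically or dynamically. With static provisioning, an administrator manually adds and removes backend storage volumes. Dynamic provisioning uses a Kubernetes StorageClass together with the Heketi service. Heketi is a volume management service for Glusterfs that exposes a REST interface and can create and destroy Glusterfs Volumes on demand.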
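A minimal sketch of what dynamic provisioning could look like is given below: a StorageClass that points at Heketi, plus a claim that requests a volume from it. It assumes the in-tree `kubernetes.io/glusterfs` provisioner and its `resturl` parameter as found in Kubernetes releases of that period; the Heketi URL, object names, and size are hypothetical, and the platform's real manifests are not shown in this article.

```python
import json


def glusterfs_storage_class(name: str, heketi_url: str) -> dict:
    """StorageClass that asks Heketi (via its REST URL) to create Glusterfs volumes."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "kubernetes.io/glusterfs",
        "parameters": {"resturl": heketi_url},
    }


def volume_claim(name: str, storage_class: str, size: str) -> dict:
    """PersistentVolumeClaim that triggers dynamic provisioning from the class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            # Glusterfs volumes can be mounted read-write by many Pods.
            "accessModes": ["ReadWriteMany"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical Heketi endpoint and names, for illustration only.
    sc = glusterfs_storage_class("gluster-dynamic", "http://heketi.example.internal:8080")
    pvc = volume_claim("train-data", "gluster-dynamic", "10Gi")
    print(json.dumps([sc, pvc], indent=2))
```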
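Although Glusterfs performs well, it is not suited to storing huge numbers of small files, because it optimizes data distribution only at the macro level and does nothing to optimize file I/O at the micro level. Most forward services on the Dengyue platform are image recognition applications that need to save images and recognition results for use as training data in iterating on the algorithms. After evaluation we adopted SeaweedFS as the small-file storage system.
SeaweedFS's design is based on Facebook's Haystack paper. Its architecture and principles are simple, its performance is excellent, and it is easy to deploy and maintain. SeaweedFS exposes a REST interface, and combined with its filer service it supports directory management; on top of this we implemented batch upload and download. SeaweedFS is rack-aware and datacenter-aware, so it can apply more reliable data redundancy strategies based on the cluster topology (how nodes are distributed across racks and data centers). Many image services on the Dengyue platform already use SeaweedFS. It stores about 6 million images per day, and the stored volume grows by about 30 GB per day.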
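SeaweedFS expresses its rack-aware and datacenter-aware redundancy as a three-digit replication code: the first digit is the number of extra copies in other data centers, the second the number in other racks of the same data center, and the third the number on other servers in the same rack. The sketch below decodes such a code. The class and function names are our own for illustration, and the article does not say which replication setting the platform uses.

```python
from typing import NamedTuple


class Replication(NamedTuple):
    other_datacenters: int   # extra copies placed in other data centers
    other_racks: int         # extra copies in other racks, same data center
    same_rack_servers: int   # extra copies on other servers, same rack

    @property
    def total_copies(self) -> int:
        # The original copy plus every extra replica.
        return 1 + sum(self)


def parse_replication(code: str) -> Replication:
    """Decode a three-digit SeaweedFS-style replication code such as '010'."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"expected three digits, got {code!r}")
    return Replication(*(int(ch) for ch in code))


if __name__ == "__main__":
    for code in ["000", "001", "010", "100"]:
        r = parse_replication(code)
        print(code, r, "copies:", r.total_copies)
```

For example, `010` keeps one extra copy on a different rack in the same data center, so losing a whole rack does not lose the file.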
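Because most compute tasks use HDFS, HDFS is also an indispensable storage component of the Dengyue platform. To speed up reads and writes, we introduced Alluxio as a cache layer for HDFS; compared with reading and writing HDFS directly, performance improved by tens of times.
For multi-tenant isolation at the file system level, we use Kerberos and Ranger to secure HDFS: Kerberos provides authentication and Ranger provides authorization checks. Glusterfs Volumes are mounted into containers, which by itself confines a user to specific volumes and so indirectly provides multi-tenant isolation.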
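GPU resource management
The platform currently runs Kubernetes 1.4. At the time, the community had not yet added multi-GPU support, so we developed multi-GPU management ourselves, including GPU discovery and mapping, CUDA driver management and mapping, GPU health checks and status monitoring, and GPU-aware scheduling. GPU-aware scheduling places applications according to criteria such as GPU model, GPU memory size, and the number of idle GPUs, to maximize resource utilization.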
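To make the GPU-aware scheduling criteria concrete, here is a small sketch of a filter-then-score placement decision. It is not the platform's scheduler: the data classes, the node inventory, and in particular the best-fit scoring rule (prefer the node left with the fewest idle GPUs, to limit fragmentation) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class GpuNode:
    name: str
    gpu_model: str
    gpu_memory_gb: int   # memory per GPU
    free_gpus: int       # idle GPUs on the node


@dataclass(frozen=True)
class GpuRequest:
    gpus: int
    min_memory_gb: int = 0
    gpu_model: Optional[str] = None  # None means any model is acceptable


def pick_node(nodes: Sequence[GpuNode], req: GpuRequest) -> Optional[GpuNode]:
    """Filter nodes that satisfy the request, then pick the tightest fit."""
    feasible = [
        n for n in nodes
        if n.free_gpus >= req.gpus
        and n.gpu_memory_gb >= req.min_memory_gb
        and (req.gpu_model is None or n.gpu_model == req.gpu_model)
    ]
    if not feasible:
        return None
    # Best fit: fewest idle GPUs left over; node name breaks ties deterministically.
    return min(feasible, key=lambda n: (n.free_gpus - req.gpus, n.name))


if __name__ == "__main__":
    # Illustrative inventory; the models and sizes are example values.
    nodes = [
        GpuNode("node-1", "P40", 24, 4),
        GpuNode("node-2", "P40", 24, 2),
        GpuNode("node-3", "K80", 12, 8),
    ]
    print(pick_node(nodes, GpuRequest(gpus=2, min_memory_gb=16)))
```

Packing small requests onto nearly full nodes keeps whole nodes free for large multi-GPU training Jobs, which is one way to pursue the utilization goal stated above.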
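Load balancing
Forward services on the Dengyue platform expose two kinds of interfaces, RPC and HTTP. RPC services are load-balanced through a registry and the RPC client; HTTP services are load-balanced with the Kubernetes community's Ingress component. Ingress is essentially a wrapper around Nginx. Users only need to configure Ingress rules in Kubernetes, specifying the mapping between a service's Host and Path and a Kubernetes Service. The Ingress controller watches for rule changes in real time, generates the Nginx configuration file, and reloads Nginx, after which traffic is distributed to the Pods behind the Service.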
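The rule-to-configuration step can be pictured with the following sketch, which renders a list of Host/Path/Service mappings and the Pod endpoints behind each Service into an Nginx configuration fragment. It is a simplified illustration of the idea, not the Ingress controller's real template; the function name, host names, service names, and addresses are hypothetical.

```python
def render_nginx(rules, endpoints):
    """Render Ingress-style rules into an Nginx config fragment.

    rules: list of (host, path, service) tuples.
    endpoints: dict mapping service name -> list of "ip:port" Pod addresses.
    """
    lines = []
    # One upstream block per Service, listing the Pods that back it.
    for service in sorted({svc for _, _, svc in rules}):
        lines.append(f"upstream {service} {{")
        for address in endpoints.get(service, []):
            lines.append(f"    server {address};")
        lines.append("}")
    # Group paths by host so that each host gets a single server block.
    by_host = {}
    for host, path, service in rules:
        by_host.setdefault(host, []).append((path, service))
    for host in sorted(by_host):
        lines.append("server {")
        lines.append("    listen 80;")
        lines.append(f"    server_name {host};")
        for path, service in by_host[host]:
            lines.append(f"    location {path} {{")
            lines.append(f"        proxy_pass http://{service};")
            lines.append("    }")
        lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    rules = [("ai.example.com", "/ocr", "ocr-svc"), ("ai.example.com", "/face", "face-svc")]
    endpoints = {"ocr-svc": ["10.0.0.1:8080", "10.0.0.2:8080"], "face-svc": ["10.0.1.1:8080"]}
    print(render_nginx(rules, endpoints))
```

A controller built this way would re-render the file whenever rules or Pod endpoints change and then trigger an Nginx reload, as the paragraph above describes.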
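CI/CD
We chose Gitlab + Jenkins + Harbor as our continuous integration and deployment components. Developers commit code to Gitlab; Jenkins triggers the compile and packaging rules, builds a Docker image, and pushes it to Harbor. When a user performs a release, the image is pulled onto the worker nodes of the Kubernetes cluster and the containers are started. The platform uses Harbor to host both a private registry and a mirror registry, and to speed up image pulls we replicate the registry across data centers.
Logging
For log collection we adopted the widely used EFK stack. Containers write logs to standard output, the Docker daemon persists them to files on the host, Fluentd collects them and sends them to Kafka, another Fluentd stage forwards them to Elasticsearch, and finally Kibana presents them to users for querying and analysis. Kafka sits in the middle for two reasons: it smooths out traffic peaks and troughs, and it lets business teams consume logs directly from Kafka and feed them into third-party systems.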
Monitoring
We use Heapster + Influxdb + Grafana for monitoring. Heapster periodically pulls the metric data exposed by the Kubelet on each Node, aggregates it, and writes it to Influxdb, and Grafana displays it. Heapster provides metrics at the Container, Pod, Namespace, Cluster, and Node levels. We modified Heapster to add Service-level metric aggregation so that users can view monitoring data from the application's perspective.
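Service-level aggregation can be thought of as grouping Pod metrics by the label selector of each Service and summing them. The sketch below shows that idea on plain dictionaries; it is not the modified Heapster code, and the metric names, labels, and sample values are assumptions made for illustration.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector appears in the Pod labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def aggregate_by_service(services, pods):
    """Sum Pod-level metrics into one record per Service.

    services: dict mapping service name -> label selector (dict).
    pods: list of dicts with "labels", "cpu_millicores", and "memory_bytes".
    """
    totals = {}
    for name, selector in services.items():
        # A Service without a selector does not select any Pods.
        selected = [p for p in pods if matches(selector, p["labels"])] if selector else []
        totals[name] = {
            "pods": len(selected),
            "cpu_millicores": sum(p["cpu_millicores"] for p in selected),
            "memory_bytes": sum(p["memory_bytes"] for p in selected),
        }
    return totals


if __name__ == "__main__":
    services = {"ocr-svc": {"app": "ocr"}, "face-svc": {"app": "face"}}
    pods = [
        {"labels": {"app": "ocr"}, "cpu_millicores": 250, "memory_bytes": 100},
        {"labels": {"app": "ocr"}, "cpu_millicores": 150, "memory_bytes": 300},
        {"labels": {"app": "face"}, "cpu_millicores": 500, "memory_bytes": 200},
    ]
    print(aggregate_by_service(services, pods))
```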
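Scheduling Spark with Kubernetes
Spark on Kubernetes deserves particular attention. This open-source project was initiated by Google and aims to let Spark be scheduled natively on Kubernetes, in the same way as the YARN and Mesos scheduling frameworks.
A relatively simple approach used in the industry is to run Spark in Standalone mode inside Docker and let Kubernetes schedule it.
This approach has the following drawbacks:
- Standalone is itself a scheduling mode, yet it runs inside another scheduling platform, which makes the architecture redundant and cumbersome.
- In our tests of Standalone mode on Kubernetes, many machine learning tasks lost more than 30% in performance.
- The number of Workers has to be set in advance, and the Executor and Worker processes run in the same Container and interfere with each other.
- Multi-tenant isolation is not possible. A Worker in a single Docker container can launch Executors for different users, so security is poor.
To solve these problems, Kubernetes needs to support scheduling Spark natively. The architecture is shown in the following diagram: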

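Figure: Native Spark on Kubernetes architecture
- From the Kubernetes point of view, the Driver and the Executors are each containerized and scheduled natively, which gives a clean architecture.
- It inherits Docker's isolation of compute resources, and through Kubernetes Namespaces different Jobs can be completely isolated from each other at the network level.
- Multiple versions can run side by side: when submitting a task with spark-submit, users can specify different versions of the Driver and Executor as needed.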
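As an illustration of per-job Namespaces and per-job Driver and Executor versions, the sketch below assembles a `spark-submit` command line. The option names are those documented for upstream Apache Spark's Kubernetes support (2.3 and later); the version JD maintains may use different names. The API server address, Namespace, image names, class, and jar path are all hypothetical.

```python
import shlex


def spark_submit_args(api_server, namespace, driver_image, executor_image,
                      executors, main_class, app_jar):
    """Build a spark-submit argument list for native Kubernetes scheduling."""
    return [
        "spark-submit",
        "--master", f"k8s://{api_server}",
        "--deploy-mode", "cluster",
        "--class", main_class,
        # Run the Driver and Executor Pods in the tenant's own Namespace.
        "--conf", f"spark.kubernetes.namespace={namespace}",
        # Separate images let each job choose its own Driver/Executor version.
        "--conf", f"spark.kubernetes.driver.container.image={driver_image}",
        "--conf", f"spark.kubernetes.executor.container.image={executor_image}",
        "--conf", f"spark.executor.instances={executors}",
        app_jar,
    ]


if __name__ == "__main__":
    args = spark_submit_args(
        api_server="https://k8s.example.internal:6443",    # hypothetical
        namespace="tenant-a",                               # hypothetical
        driver_image="registry.example.internal/spark-driver:2.3",      # hypothetical
        executor_image="registry.example.internal/spark-executor:2.3",  # hypothetical
        executors=4,
        main_class="org.example.TrainJob",                  # hypothetical
        app_jar="local:///opt/app/train-job.jar",           # hypothetical
    )
    print(" ".join(shlex.quote(a) for a in args))
```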
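Viewed in terms of cluster mode, the obvious conclusion about Spark on Kubernetes is that there is no longer any such thing as a "Spark cluster". Instead, Kubernetes schedules everything, and Spark Jobs integrate seamlessly with other applications, truly forming one large scheduling platform.
The current community version is not production-ready, so our team ran extensive benchmark tests, stability tests, and so on. To support further requirements such as multi-tenancy and Python jobs, we modified parts of the code and maintain JD's own version.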
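Separating compute from data
In the Hadoop ecosystem, data locality has long been a celebrated feature. In the container and cloud world, however, centralized storage and the separation of data from compute are favored, and more and more companies are separating storage from compute. This is mainly thanks to the rapid growth of network bandwidth. Leaving aside dedicated networks, general-purpose 25G networking along with newer technologies such as RDMA and SPDK has made the separation of storage and compute feasible.
From an architectural point of view, this has the following benefits:
- 1. In multi-tenant scenarios, data security is guaranteed through physical isolation.
- 2. Data center deployment becomes flexible: compute and storage resources can be deployed in different data centers. If performance guarantees are needed, middleware such as Alluxio can be added.
- 3. The platform can be conveniently deployed inside a customer's network without changing the customer's data structures, for example at China Unicom or ICBC (Industrial and Commercial Bank of China).
For the Tensorflow, Caffe, and MXNet frameworks, Glusterfs meets the requirements directly. For Spark, we use an architecture in which HDFS and Spark are separated. In extensive benchmarks of MLlib algorithms such as LR, KMeans, Decision Tree, and Naive Bayes on a 10G network, the performance loss with separated data compared with data locality was around 3%. As a result, all machine learning and deep learning frameworks can share a unified architecture that separates compute from storage.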
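As a container cluster management tool, Kubernetes gives application platforms cloud-native microservice support. Its active community has attracted keen interest from developers and spurred rapid growth of the surrounding container ecosystem, and it provides the technical foundation for many internet companies to upgrade their internal IT platforms with a container cluster architecture and build efficient large-scale computing systems.
The AI Infrastructure Platform Department is a focused and open team dedicated to building a secure and efficient machine learning platform architecture that underpins the Dengyue algorithm platform. Its main research areas are Kubernetes, engineering of AI algorithms, and virtualization of big data systems.
We thank Intel for its strong support and valuable experience in areas such as Spark on k8s and BigDL.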