Spark Ecosystem Components
 
Source: geek.csdn.net  Published: 2017-1-12
 

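Introduction: As big data technology has developed, real-time stream computing, machine learning, and graph computing have become active research areas. Spark, a powerful tool for big data processing, has a fairly mature ecosystem that can handle these scenarios in one place. How many of the components in the Spark ecosystem do you know? This article walks through these indispensable components. It is excerpted from the book 《图解Spark:核心技术与案例实战》 (Illustrated Spark: Core Technologies and Case Studies).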

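The Spark ecosystem is built around Spark Core. It can read from data sources such as traditional files (for example, text files), HDFS, Amazon S3, Alluxio, and NoSQL stores, and it relies on Standalone, YARN, or Mesos for resource scheduling and management while it analyzes and processes applications. These applications come from different Spark components: interactive and batch processing through Spark Shell or Spark Submit, real-time stream processing with Spark Streaming, ad hoc queries with Spark SQL, accuracy-versus-speed trade-off queries with the sampling-based approximate query engine BlinkDB, machine learning with MLBase/MLlib, graph processing with GraphX, and mathematical computing with SparkR, as shown in the figure below. It is this ecosystem that achieves the goal of "One Stack to Rule Them All".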

Spark Core

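Spark Core is the core component of the entire BDAS (Berkeley Data Analytics Stack) ecosystem and is a distributed big data processing framework. It supports several resource scheduling and management options, keeps distributed computation fast through mechanisms such as in-memory computing and directed acyclic graphs (DAGs), and introduces the RDD abstraction to give data a high degree of fault tolerance. Its key features are described below.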

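Spark Core offers several run modes. It can process tasks with its own run modes, such as local mode and Standalone, and it can also use third-party resource scheduling frameworks such as YARN and Mesos. By comparison, the third-party frameworks can manage resources at a finer granularity.

Spark Core provides a distributed parallel computing framework based on directed acyclic graphs (DAGs), together with an in-memory mechanism that supports repeated iterative computation and data sharing. This greatly reduces the cost of reading data between iterations, which substantially improves performance for data mining and analysis workloads that iterate many times. In addition, during task processing Spark moves the computation rather than the data: an RDD partition can read nearby data blocks from the distributed file system into the memory of each node and compute on them there.

Spark introduces the RDD abstraction, a read-only collection of objects distributed across a set of nodes. These collections are resilient: if part of a dataset is lost, it can be rebuilt from its "lineage", which gives the data a high degree of fault tolerance.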
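
The lineage idea above can be sketched in a few lines of plain Python. This is a toy model, not Spark's API: the class `ToyRDD` and its methods `from_list`, `map`, `partition`, `lose_partition`, and `collect` are hypothetical names invented for this illustration. Each dataset remembers the function that derives a partition from its parent, so a lost partition is simply recomputed.

```python
from typing import Callable, Dict, List, Optional


class ToyRDD:
    """A read-only, partitioned dataset that remembers how it was derived."""

    def __init__(self, num_partitions: int,
                 compute: Callable[[int], List[int]],
                 parent: Optional["ToyRDD"] = None):
        self.num_partitions = num_partitions
        self._compute = compute          # lineage: how to rebuild partition i
        self.parent = parent
        self._cache: Dict[int, List[int]] = {}

    @staticmethod
    def from_list(data: List[int], num_partitions: int) -> "ToyRDD":
        # Stand-in for reading blocks from stable storage such as HDFS.
        chunks = [data[i::num_partitions] for i in range(num_partitions)]
        return ToyRDD(num_partitions, lambda i: list(chunks[i]))

    def map(self, fn: Callable[[int], int]) -> "ToyRDD":
        # A transformation never mutates the parent; it returns a new dataset.
        return ToyRDD(self.num_partitions,
                      lambda i: [fn(x) for x in self.partition(i)],
                      parent=self)

    def partition(self, i: int) -> List[int]:
        if i not in self._cache:         # missing or lost: rebuild from lineage
            self._cache[i] = self._compute(i)
        return self._cache[i]

    def lose_partition(self, i: int) -> None:
        self._cache.pop(i, None)         # simulate a node failure

    def collect(self) -> List[int]:
        return [x for i in range(self.num_partitions) for x in self.partition(i)]


base = ToyRDD.from_list([1, 2, 3, 4, 5, 6], num_partitions=2)
squared = base.map(lambda x: x * x)
before = squared.collect()
squared.lose_partition(1)                # drop one cached partition
after = squared.collect()                # recomputed from base via lineage
```

Because `squared` is never modified in place, recomputing the lost partition yields the same values as before the simulated failure.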

Spark Streaming

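Spark Streaming is a high-throughput, fault-tolerant stream processing system for real-time data streams. It can apply complex operations such as Map, Reduce, and Join to many kinds of data sources (for example Kafka, Flume, Twitter, and ZeroMQ) and save the results to external file systems or databases, or feed them to real-time dashboards, as shown in the figure below.

Other processing engines either focus only on stream processing or handle only batch processing (offering just a stream processing API that must be implemented externally). The biggest advantage of Spark Streaming is that its processing engine and the RDD programming model can handle batch processing and stream processing at the same time.

Traditional stream processing handles one record at a time. Spark Streaming instead discretizes the stream (Discretized Streams), which makes it possible to process data in batches at sub-second intervals. During processing, Receivers ingest data in parallel and buffer it in the memory of Spark worker nodes. After latency optimization, the Spark engine can batch-process short tasks (tens of milliseconds) and send the results to other systems. In the traditional continuous operator model, an operator is statically assigned to one node for computation, whereas Spark can assign tasks to worker nodes dynamically based on where the data comes from and which resources are available.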
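
A minimal sketch of discretization, in plain Python rather than the Spark Streaming API: the helper names `discretize` and `word_count`, the event format, and the one-second interval are assumptions made for this example. The stream is cut into fixed-length batches, and an ordinary batch function is applied to each one.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple

Event = Tuple[float, str]                # (arrival time in seconds, word)


def discretize(events: Iterable[Event], batch_interval: float) -> Iterator[List[str]]:
    """Group a time-ordered stream into consecutive fixed-length batches."""
    batch: List[str] = []
    batch_end = batch_interval
    for timestamp, word in events:
        while timestamp >= batch_end:    # close every interval that has elapsed
            yield batch
            batch, batch_end = [], batch_end + batch_interval
        batch.append(word)
    yield batch                          # flush the final, possibly partial batch


def word_count(batch: List[str]) -> Dict[str, int]:
    """An ordinary batch job, reused unchanged for every micro-batch."""
    return dict(Counter(batch))


stream = [(0.2, "a"), (0.9, "b"), (1.1, "a"), (1.5, "a"), (3.4, "c")]
results = [word_count(b) for b in discretize(stream, batch_interval=1.0)]
```

Here `results` holds one dictionary per one-second interval, including an empty one for the interval in which no event arrived.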

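With discretized streams (DStreams), Spark Streaming has the following properties.

Dynamic load balancing: Spark Streaming divides data into small batches, which allows resources to be allocated at a finer granularity. For example, when a traditional record-at-a-time streaming system partitions its input stream by key, a node whose computation load exceeds its capacity becomes a bottleneck and slows down the whole system. In Spark Streaming, tasks are balanced dynamically across the nodes, as shown in the figure: a node whose tasks take longer receives fewer tasks, and a node whose tasks finish quickly receives more.

Fast failure recovery: When a node fails, a traditional stream processing system restarts the failed continuous operator on another node and may have to rerun earlier stream processing to recover part of the lost data. Only that one node redoes the failed work, and the system cannot move on to other tasks until the new node has caught up with all computation performed before the failure. In Spark, computation is split into many small tasks that can run on any node and still be merged correctly. When a node fails, its tasks are therefore spread evenly across the other nodes in the cluster, so recovery is faster than with the traditional mechanism.

Unification of batch, streaming, and interactive analysis: Spark Streaming breaks a streaming computation into a series of short batch jobs. The input data is divided by batch interval (for example, a few seconds) into segments of a discretized stream (DStream), each segment is converted into an RDD in Spark, and the stream operations on the DStream become batch operations on RDDs. In addition, the stream data is kept in the memory of Spark nodes, so users can query it interactively as needed. It is this working mechanism of Spark that brings batch processing, stream processing, and interactive work together.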
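
The dynamic load balancing property listed above can be illustrated with a toy scheduling model. This is not Spark's scheduler: the functions `static_makespan` and `dynamic_makespan`, the task costs, and the key assignment are all made up for the example. It compares pinning tasks to workers by key with handing each task to whichever worker becomes free first.

```python
import heapq
from typing import List


def static_makespan(task_costs: List[int], keys: List[int], workers: int) -> int:
    """Each task is pinned to worker key % workers, as in key-partitioned operators."""
    load = [0] * workers
    for cost, key in zip(task_costs, keys):
        load[key % workers] += cost
    return max(load)


def dynamic_makespan(task_costs: List[int], workers: int) -> int:
    """Each task goes to whichever worker becomes free first."""
    free_at = [0] * workers              # min-heap of times at which workers are free
    heapq.heapify(free_at)
    for cost in task_costs:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + cost)
    return max(free_at)


costs = [8, 8, 8, 1, 1, 1, 1, 1]         # three slow tasks, five fast ones
keys = [0, 0, 0, 1, 2, 3, 1, 2]          # a skewed key: the slow tasks share key 0
static_time = static_makespan(costs, keys, workers=4)
dynamic_time = dynamic_makespan(costs, workers=4)
```

With this skewed input the static assignment finishes only when the overloaded worker does, while the dynamic assignment spreads the slow tasks out and gives the remaining worker all the fast ones.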

Spark SQL

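Spark SQL grew out of Shark. When Shark was released, Hive was arguably the only option for SQL on Hadoop (Hive compiles SQL into scalable MapReduce jobs). Shark was created in response to Hive's performance and the need for compatibility with Spark.

Shark is essentially Hive on Spark. It parses queries with Hive's HQL parser, translates the HQL into the corresponding RDD operations on Spark, and uses Hive's metadata to obtain table information from the database, which in practice is data and files on HDFS. Shark then fetches that data and runs the computation on Spark. Shark's main strengths are that it is fast and fully compatible with Hive. In shell mode it also offers an API such as sql2rdd, which lets the result set of an HQL query be processed further in the Scala environment, so users can write simple machine learning or analysis functions to analyze HQL results further.

At the Spark Summit on July 1, 2014, Databricks announced that it was ending development of Shark and shifting its focus to Spark SQL. At the conference, Databricks explained that Shark was largely a modification of Hive that replaced Hive's physical execution engine to make it faster. However, Shark inherited a large amount of Hive code, which made optimization and maintenance difficult. As performance optimization and the integration of advanced analytics deepened, the parts designed around MapReduce became the bottleneck of the whole project. To develop further and give users a better experience, Databricks therefore ended the Shark project and put more effort into Spark SQL.

Spark SQL lets developers work with RDDs directly and also query external data stored in Hive. An important feature of Spark SQL is that it handles relational tables and RDDs in a unified way, so developers can easily run SQL queries over external data while also performing more complex data analysis.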

Spark SQL has the following features.

It introduces a new RDD type, SchemaRDD, which can be defined the same way a table is defined in a traditional database. A SchemaRDD consists of row objects with defined column data types. A SchemaRDD can be converted from an RDD, read from a Parquet file, or obtained from Hive using HiveQL.

It embeds the Catalyst query optimization framework. After SQL is parsed into a logical execution plan, classes and interfaces in the Catalyst package apply some simple plan optimizations, and the plan is finally turned into RDD computations.

Data from different sources can be mixed within one application; for example, data from HiveQL can be joined with data from SQL. Shark made SQL-on-Hadoop 10 to 100 times faster than Hive. Freed from Hive's constraints, how does Spark SQL perform? Its gain is not as striking as Shark's gain over Hive, but it still performs very well, as shown in the figure (the data on the right is for Spark SQL).
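
As a rough illustration of the rule-based plan rewriting attributed to Catalyst above, the sketch below applies a single rule, constant folding, to a tiny expression tree. It is written in plain Python and is not Catalyst's actual implementation; the node classes `Lit`, `Col`, and `Add` and the function `fold_constants` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Lit, Col, Add]


def fold_constants(expr: Expr) -> Expr:
    """Rewrite bottom-up: Add(Lit, Lit) becomes a single Lit."""
    if isinstance(expr, Add):
        left, right = fold_constants(expr.left), fold_constants(expr.right)
        if isinstance(left, Lit) and isinstance(right, Lit):
            return Lit(left.value + right.value)
        return Add(left, right)
    return expr


# Logical expression for: price + (1 + 2)
plan = Add(Col("price"), Add(Lit(1), Lit(2)))
optimized = fold_constants(plan)
```

The rewritten tree adds the column to a single literal, so the constant part is computed once at planning time instead of once per row.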

Why does Spark SQL achieve such a large performance gain? Mainly because of the following optimizations.

In-memory columnar storage: Spark SQL does not store table data in memory as native JVM objects; it uses an in-memory columnar layout instead.

Bytecode generation: Spark 1.1.0 added a Codegen module to Expressions in the Catalyst module. It uses dynamic bytecode generation to compile matching expressions on the fly with specialized code. SQL expressions are also optimized by code generation (CG). The CG optimization relies mainly on the runtime reflection mechanism of Scala 2.10.

Scala code optimization: When writing Spark SQL in Scala, the developers avoid inefficient code and code that triggers frequent garbage collection as far as possible. This makes the code harder to write, but users still see a uniform interface.
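
To make the in-memory columnar storage point above concrete, here is a small sketch using only the Python standard library. It is an analogy, not Spark SQL's storage format: the table, the column names, and the choice of `array` type codes are assumptions for illustration.

```python
from array import array
from typing import Dict, List, Tuple

rows: List[Tuple[int, float]] = [(1, 9.5), (2, 20.0), (3, 5.25)]   # (id, price)

# Row layout: one Python object per record, one boxed object per field.
# Column layout: one typed, contiguous buffer per column.
columns: Dict[str, array] = {
    "id": array("q", (r[0] for r in rows)),      # 64-bit signed integers
    "price": array("d", (r[1] for r in rows)),   # 64-bit floats
}

# A query touching one column scans only that buffer.
total_price = sum(columns["price"])
bytes_per_price = columns["price"].itemsize
```

A query that aggregates `price` never touches the `id` values, and each value occupies a fixed number of bytes rather than a separate heap object.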

BlinkDB

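BlinkDB is a massively parallel query engine for running interactive SQL queries over very large volumes of data. It lets users improve query response time by trading off accuracy, while keeping the error of the result within an allowed range. To achieve this, BlinkDB relies on the following core ideas:

An adaptive optimization framework that builds and maintains a set of multi-dimensional samples from the original data over time.

A dynamic sample selection strategy that picks a sample of appropriate size based on the accuracy the query requires and the urgency of its response time. Unlike a traditional relational database, BlinkDB is an interactive query system that works like a seesaw: users have to trade query accuracy against query time. Users who want results faster give up some accuracy; users who want more accurate results give up some response time. The figure below shows the BlinkDB architecture.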
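
The accuracy-versus-time trade-off can be shown with a generic sampling sketch. This is not BlinkDB's algorithm or interface: the function `approximate_mean`, the synthetic data, and the normal-approximation error bound are assumptions for illustration. A small sample answers quickly with a wide error bound, and a larger sample takes more work but narrows it.

```python
import math
import random
import statistics
from typing import List, Tuple


def approximate_mean(data: List[float], sample_size: int,
                     rng: random.Random) -> Tuple[float, float]:
    """Return (estimate, half-width of an approximate 95% confidence interval)."""
    sample = rng.sample(data, sample_size)
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, 1.96 * std_error


rng = random.Random(42)                           # fixed seed for repeatability
population = [rng.uniform(0, 100) for _ in range(100_000)]
exact = statistics.fmean(population)

fast_estimate, fast_error = approximate_mean(population, 100, rng)
slow_estimate, slow_error = approximate_mean(population, 10_000, rng)
```

Reading 100 rows instead of 10,000 is far cheaper, but the reported error bound `fast_error` is correspondingly wider than `slow_error`.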

MLBase/MLlib

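MLBase is the component of the Spark ecosystem that focuses on machine learning. Its goal is to lower the barrier to machine learning so that users who may not know much about it can still use MLBase easily. MLBase has four parts: MLRuntime, MLlib, MLI, and ML Optimizer.

MLRuntime: the distributed in-memory computing framework provided by Spark Core. It runs the algorithms optimized by the Optimizer, performs the computation on the data, and outputs the analysis results.

MLlib: Spark's implementation of common machine learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction, and the underlying optimization primitives. The set of algorithms is extensible.

MLI: an API or platform for feature extraction and for implementing algorithms with high-level ML programming abstractions.

ML Optimizer: selects the machine learning algorithms and parameters, from those already implemented internally, that it considers most suitable for the user's input data, and returns a model or other results that help with analysis.

The core of MLBase is its optimizer (ML Optimizer), which turns declarative tasks into complex learning plans and ultimately produces the best model and results. MLBase differs from other machine learning systems such as Weka and Mahout; each of the three has its own characteristics, as described below.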

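MLBase is based on Spark and uses distributed in-memory computing. Weka is a single-machine system, and Mahout processes data with MapReduce (Mahout is moving toward using Spark for data processing).

MLBase works automatically. Weka and Mahout both require users to have machine learning skills in order to choose the algorithms and parameters they want.

MLBase provides interfaces at different levels of abstraction, through which users can extend the algorithms.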

GraphX

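GraphX began as a distributed graph computing framework project at the Berkeley AMPLab and was later integrated into Spark as a core component. It is Spark's API for graphs and graph-parallel computation and can be regarded as a rewrite and optimization of GraphLab and Pregel on Spark. Compared with other distributed graph computing frameworks, GraphX's biggest advantage is that it offers a one-stack data solution on top of Spark, so a complete graph computing pipeline can be carried out efficiently.

The core abstraction of GraphX is the Resilient Distributed Property Graph, a directed multigraph in which both vertices and edges carry properties. GraphX extends the Spark RDD abstraction. It has two views, Table and Graph, but needs only one copy of physical storage. Each view has its own operators, which provides both flexible operations and efficient execution. Most of the implementation in the overall GraphX architecture revolves around optimizing partitions, which to some extent shows that vertex-cut storage and the corresponding computation optimizations are indeed the key and difficult points of a graph computing framework.

The underlying design of GraphX has the following key points.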

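(1) All operations on the Graph view are ultimately converted into RDD operations on the associated Table view. A computation on a graph is therefore logically equivalent to a series of RDD transformations, and a Graph has the three key properties of an RDD: Immutable, Distributed, and Fault-Tolerant. The most important of these is immutability. Logically, every graph transformation or operation produces a new graph. Physically, GraphX reuses unchanged vertices and edges to some extent, and this optimization is transparent to the user.

(2) The physical data shared by the two views consists of two RDDs, RDD[VertexPartition] and RDD[EdgePartition]. Vertices and edges are not actually stored as a table of the form Collection[tuple]. Instead, each VertexPartition or EdgePartition internally stores a sharded data block with an index structure, which speeds up traversal under the different views. The immutable index structures are shared across RDD transformations, which lowers computation and storage costs.

(3) Distributed storage of the graph uses the vertex-cut model, and the partitionBy method lets the user specify different partitioning strategies (PartitionStrategy). The partitioning strategy assigns edges to the EdgePartitions and vertex masters to the VertexPartitions, and each EdgePartition also caches ghost copies of the vertices associated with its local edges. The choice of strategy affects how many ghost copies must be cached and how evenly edges are distributed across the EdgePartitions, so the best strategy has to be chosen according to the structure of the graph.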
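
The effect of the partitioning strategy on edge balance and on the number of vertex copies can be seen in a toy model. This is not the GraphX implementation or its built-in strategies: the functions `by_source`, `by_both_ends`, `partition_edges`, and `vertex_copies` are hypothetical, and the second strategy is only a simple stand-in for hashing both endpoints.

```python
from typing import Callable, List, Tuple

Edge = Tuple[int, int]                            # (source id, destination id)


def by_source(edge: Edge, parts: int) -> int:
    return edge[0] % parts                        # all out-edges of a vertex co-located


def by_both_ends(edge: Edge, parts: int) -> int:
    return (edge[0] + edge[1]) % parts            # simple stand-in for hashing both ends


def partition_edges(edges: List[Edge], parts: int,
                    strategy: Callable[[Edge, int], int]) -> List[List[Edge]]:
    """Vertex cut: every edge lives in exactly one partition."""
    buckets: List[List[Edge]] = [[] for _ in range(parts)]
    for edge in edges:
        buckets[strategy(edge, parts)].append(edge)
    return buckets


def vertex_copies(buckets: List[List[Edge]]) -> int:
    """Count vertex copies: each partition holds every vertex its edges touch."""
    return sum(len({v for edge in bucket for v in edge}) for bucket in buckets)


# A star graph: vertex 0 points at vertices 1..6.
star = [(0, d) for d in range(1, 7)]
src_buckets = partition_edges(star, 3, by_source)
mix_buckets = partition_edges(star, 3, by_both_ends)
```

On this star graph, partitioning by source puts every edge in one partition with no extra vertex copies, while the second strategy balances the edges evenly at the cost of replicating the hub vertex in every partition.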

SparkR

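R is free, open-source software released under the GNU license and widely used for statistical computing and statistical graphics, but it can only run on a single machine. To make it possible to analyze large-scale distributed data with the R language, the AMPLab at UC Berkeley developed SparkR, which was added to Spark in version 1.4. With SparkR, users can analyze large datasets and run jobs on SparkR interactively from the R shell. SparkR has the following features:

It provides an API for Spark's resilient distributed datasets (RDDs), so users can run Spark tasks on a cluster interactively from the R shell.

It supports closure serialization: variables referenced in user-defined functions are automatically serialized and sent to the other machines in the cluster.

SparkR can also call R packages easily. The package only needs to be loaded with includePackage before the operation is executed on the cluster.

The figure below illustrates the SparkR processing flow.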

Alluxio

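Alluxio is a distributed in-memory file system. It is a highly fault-tolerant distributed file system that allows files to be shared reliably at memory speed across cluster frameworks such as Spark and MapReduce. Alluxio is a middleware layer that sits between the distributed file storage at the bottom and the various computing frameworks above it. Its main job is to place files that do not need to be persisted to the DFS into the distributed in-memory file system instead, so that memory is shared and efficiency improves. It can also reduce memory redundancy and GC time.

Like Hadoop, Alluxio uses a traditional Master-Slave architecture. All Alluxio Workers are managed by the Alluxio Master, which uses the heartbeats that Workers send periodically to determine whether a Worker has crashed and how much memory each Worker has left. To avoid a single point of failure, ZooKeeper is used to provide high availability (HA).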
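
The heartbeat mechanism described above can be sketched as follows. This is a toy model rather than Alluxio code: the class `ToyMaster`, its methods, the timeout value, and the worker names are all hypothetical. The master records the time and free memory from each heartbeat and treats a worker as failed once it has been silent for longer than the timeout.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyMaster:
    timeout_s: float                               # hypothetical liveness timeout
    last_seen: Dict[str, float] = field(default_factory=dict)
    free_bytes: Dict[str, int] = field(default_factory=dict)

    def heartbeat(self, worker: str, now: float, free: int) -> None:
        """Record that a worker reported in, along with its free memory."""
        self.last_seen[worker] = now
        self.free_bytes[worker] = free

    def live_workers(self, now: float) -> List[str]:
        """Workers whose latest heartbeat is within the timeout window."""
        return sorted(w for w, t in self.last_seen.items()
                      if now - t <= self.timeout_s)


master = ToyMaster(timeout_s=10.0)
master.heartbeat("worker-1", now=0.0, free=512)
master.heartbeat("worker-2", now=0.0, free=256)
master.heartbeat("worker-1", now=8.0, free=400)    # worker-2 stays silent
alive = master.live_workers(now=12.0)
```

Passing the current time in explicitly keeps the sketch deterministic; a real master would read a clock and would also need the HA failover that the article attributes to ZooKeeper.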

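Alluxio has the following features.

Java-like File API: Alluxio provides an API similar to the Java File class.

Compatibility: Alluxio implements the HDFS interface, so Spark and MapReduce programs can run without any modification.

Pluggable underlying file system: Alluxio works with a pluggable underlying file system and provides fault tolerance by recording in-memory data in that underlying file system. It has a generic interface that makes it easy to plug in different underlying file systems. It currently supports HDFS, S3, GlusterFS, and single-node local file systems, and more file systems will be supported in the future. The applications supported by Alluxio are as follows.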

   