Äú¿ÉÒÔ¾èÖú£¬Ö§³ÖÎÒÃǵĹ«ÒæÊÂÒµ¡£

1Ôª 10Ôª 50Ôª





ÈÏÖ¤Â룺  ÑéÖ¤Âë,¿´²»Çå³þ?Çëµã»÷Ë¢ÐÂÑéÖ¤Âë ±ØÌî



  ÇóÖª ÎÄÕ ÎÄ¿â Lib ÊÓÆµ iPerson ¿Î³Ì ÈÏÖ¤ ×Éѯ ¹¤¾ß ½²×ù Modeler   Code  
»áÔ±   
 
   
 
 
     
   
 ¶©ÔÄ
  ¾èÖú
SparkʹÓÃ×ܽáÓë·ÖÏí
 
×÷Õߣºbourneli À´Ô´£º²©¿ÍÔ° ·¢²¼ÓÚ 2015-11-26
  2510  次浏览      29
 

±³¾°

ʹÓÃspark¿ª·¢ÒÑÓм¸¸öÔ¡£Ïà±ÈÓÚpython/hive£¬scala/sparkѧϰÃż÷½Ï¸ß¡£ÓÈÆä¼ÇµÃ¸Õ¿ªÊ±£¬¾Ù²½Î¬¼è£¬½øÕ¹Ê®·Ö»ºÂý¡£²»¹ýлÌìлµØ£¬Õâ¶Î¿àɬ£¨bi£©µÄÈÕ×Ó¹ýÈ¥ÁË¡£Òä¿à˼Ìð£¬ÎªÁ˱ÜÃâÏîÄ¿×éµÄÆäËûͬѧ×ßÍä·£¬¾ö¶¨×ܽáºÍÊáÀísparkµÄʹÓþ­Ñé¡£

Spark»ù´¡

»ùʯRDD

sparkµÄºËÐÄÊÇRDD£¨µ¯ÐÔ·Ö²¼Ê½Êý¾Ý¼¯£©£¬Ò»ÖÖͨÓõÄÊý¾Ý³éÏ󣬷â×°ÁË»ù´¡µÄÊý¾Ý²Ù×÷£¬Èçmap£¬filter£¬reduceµÈ¡£RDDÌṩÊý¾Ý¹²ÏíµÄ³éÏó£¬Ïà±ÈÆäËû´óÊý¾Ý´¦Àí¿ò¼Ü£¬ÈçMapReduce£¬Pegel£¬DryadLINQºÍHIVEµÈ¾ùȱ·¦´ËÌØÐÔ£¬ËùÒÔRDD¸üΪͨÓá£

¼òÒªµØ¸ÅÀ¨RDD£ºRDDÊÇÒ»¸ö²»¿ÉÐ޸ĵ쬷ֲ¼µÄ¶ÔÏ󼯺ϡ£Ã¿¸öRDDÓɶà¸ö·ÖÇø×é³É£¬Ã¿¸ö·ÖÇø¿ÉÒÔͬʱÔÚ¼¯ÈºÖеIJ»Í¬½ÚµãÉϼÆËã¡£RDD¿ÉÒÔ°üº¬Python£¬JavaºÍScalaÖеÄÈÎÒâ¶ÔÏó¡£

SparkÉú̬ȦÖÐÓ¦Óö¼ÊÇ»ùÓÚRDD¹¹½¨£¨ÏÂͼ£©£¬ÕâÒ»µã³ä·Ö˵Ã÷RDDµÄ³éÏó×㹻ͨÓ㬿ÉÒÔÃèÊö´ó¶àÊýÓ¦Óó¡¾°¡£

RDD²Ù×÷ÀàÐÍ¡ª×ª»»ºÍ¶¯×÷

RDDµÄ²Ù×÷Ö÷Òª·ÖÁ½Àࣺת»»£¨transformation£©ºÍ¶¯×÷£¨action£©¡£Á½ÀຯÊýµÄÖ÷񻂿±ðÊÇ£¬×ª»»½ÓÊÜRDD²¢·µ»ØRDD£¬¶ø¶¯×÷½ÓÊÜRDDµ«ÊÇ·µ»Ø·ÇRDD¡£×ª»»²ÉÓöèÐÔµ÷ÓûúÖÆ£¬Ã¿¸öRDD¼Ç¼¸¸RDDת»»µÄ·½·¨£¬ÕâÖÖµ÷ÓÃÁ´±í³ÆÖ®ÎªÑªÔµ£¨lineage£©£»¶ø¶¯×÷µ÷ÓûáÖ±½Ó¼ÆËã¡£

²ÉÓöèÐÔµ÷Óã¬Í¨¹ýѪԵÁ¬½ÓµÄRDD²Ù×÷¿ÉÒԹܵÀ»¯£¨pipeline£©£¬¹ÜµÀ»¯µÄ²Ù×÷¿ÉÒÔÖ±½ÓÔÚµ¥½ÚµãÍê³É£¬±ÜÃâ¶à´Îת»»²Ù×÷Ö®¼äÊý¾Ýͬ²½µÄµÈ´ý¡£

ʹÓÃѪԵ´®ÁªµÄ²Ù×÷¿ÉÒÔ±£³Öÿ´Î¼ÆËãÏà¶Ô¼òµ¥£¬¶ø²»Óõ£ÐÄÓйý¶àµÄÖмäÊý¾Ý£¬ÒòΪÕâЩѪԵ²Ù×÷¶¼¹ÜµÀ»¯ÁË£¬ÕâÑùÒ²±£Ö¤ÁËÂß¼­µÄµ¥Ò»ÐÔ£¬¶ø²»ÓÃÏñMapReduceÄÇÑù£¬ÎªÁ˾¹¿ÉÄܵļõÉÙmap reduce¹ý³Ì£¬ÔÚµ¥¸ömap reduceÖÐдÈë¹ý¶à¸´ÔÓµÄÂß¼­¡£

RDDʹÓÃģʽ

RDDʹÓþßÓÐÒ»°ãµÄģʽ£¬¿ÉÒÔ³éÏóΪÏÂÃæµÄ¼¸²½

1.¼ÓÔØÍⲿÊý¾Ý£¬´´½¨RDD¶ÔÏó

2.ʹÓÃת»»£¨Èçfilter£©£¬´´½¨ÐµÄRDD¶ÔÏó

3.»º´æÐèÒªÖØÓõÄRDD

4.ʹÓö¯×÷£¨Èçcount£©£¬Æô¶¯²¢ÐмÆËã

RDD¸ßЧµÄ²ßÂÔ

Spark¹Ù·½ÌṩµÄÊý¾ÝÊÇRDDÔÚijЩ³¡¾°Ï£¬¼ÆËãЧÂÊÊÇHadoopµÄ20X¡£Õâ¸öÊý¾ÝÊÇ·ñÓÐË®·Ö£¬ÎÒÃÇÏȲ»×·¾¿£¬µ«ÊÇRDDЧÂʸߵÄÓÉÒ»¶¨»úÖÆ±£Ö¤µÄ£º

1.RDDÊý¾ÝÖ»¶Á£¬²»¿ÉÐ޸ġ£Èç¹ûÐèÒªÐÞ¸ÄÊý¾Ý£¬±ØÐë´Ó¸¸RDDת»»£¨transformation£©µ½×ÓRDD¡£ËùÒÔ£¬ÔÚÈÝ´í²ßÂÔÖУ¬RDDûÓÐÊý¾ÝÈßÓ࣬¶øÊÇͨ¹ýRDD¸¸×ÓÒÀÀµ£¨ÑªÔµ£©¹ØÏµ½øÐÐÖØËãʵÏÖÈÝ´í¡£

2.RDDÊý¾ÝÔÚÄÚ´æÖУ¬¶à¸öRDD²Ù×÷Ö®¼ä£¬Êý¾Ý²»ÓÃÂ䵨µ½´ÅÅÌÉÏ£¬±ÜÃâ²»±ØÒªµÄI/O²Ù×÷¡£

3.RDD´æ·ÅµÄÊý¾Ý¿ÉÒÔÊÇjava¶ÔÏó£¬ËùÒÔ±ÜÃâµÄ²»±ØÒªµÄ¶ÔÏóÐòÁл¯ºÍ·´ÐòÁл¯¡£

×ܶøÑÔÖ®£¬RDD¸ßЧµÄÖ÷ÒªÒòËØÊǾ¡Á¿±ÜÃâ²»±ØÒªµÄ²Ù×÷ºÍÎþÉüÊý¾ÝµÄ²Ù×÷¾«¶È£¬ÓÃÀ´Ìá¸ß¼ÆËãЧÂÊ¡£

SparkʹÓü¼ÇÉ

RDD»ù±¾º¯ÊýÀ©Õ¹

RDDËäÈ»ÌṩÁ˺ܶຯÊý£¬µ«ÊDZϾ¹»¹ÊÇÓÐÏ޵ģ¬ÓÐʱºòÐèÒªÀ©Õ¹£¬×Ô¶¨ÒåеÄRDDµÄº¯Êý¡£ÔÚsparkÖУ¬¿ÉÒÔͨ¹ýÒþʽת»»£¬ÇáËÉʵÏÖ¶ÔRDDÀ©Õ¹¡£»­Ïñ¿ª·¢¹ý³ÌÖУ¬Æ½·²µÄ»áʹÓÃrollup²Ù×÷£¨ÀàËÆHIVEÖеÄrollup£©£¬¼ÆËã¶à¸ö¼¶±ðµÄ¾ÛºÏÊý¾Ý¡£ÏÂÃæÊǾßÌåʵ£¬

/**

* À©Õ¹spark rdd,ΪrddÌṩrollup·½·¨

*/

implicit class RollupRDD[T: ClassTag](rdd: RDD[(Array[String], T)]) extends Serializable {

/**

* ÀàËÆSqlÖеÄrollup²Ù×÷

*

* @param aggregate ¾ÛºÏº¯Êý

* @param keyPlaceHold keyռλ·û£¬Ä¬ÈϲÉÓÃFaceConf.STAT_SUMMARY

* @param isCache£¬È·ÈÏÊÇ·ñ»º´æÊý¾Ý

* @return ·µ»Ø¾ÛºÏºóµÄÊý¾Ý

*/

def rollup[U: ClassTag](

aggregate: Iterable[T] => U,

keyPlaceHold: String = FaceConf.STAT_SUMMARY,

isCache: Boolean = true): RDD[(Array[String], U)] = {

if (rdd.take(1).isEmpty) {

return rdd.map(x => (Array[String](), aggregate(Array[T](x._2))))

}

if (isCache) {

rdd.cache // Ìá¸ß¼ÆËãЧÂÊ

}

val totalKeyCount = rdd.first._1.size

val result = { 1 to totalKeyCount }.par.map(untilKeyIndex => { // ²¢ÐмÆËã

rdd.map(row => {

val combineKey = row._1.slice(0, untilKeyIndex).mkString(FaceConf.KEY_SEP) // ×éºÏkey

(combineKey, row._2)

}).groupByKey.map(row => { // ¾ÛºÏ¼ÆËã

val oldKeyList = row._1.split(FaceConf.KEY_SEP)

val newKeyList = oldKeyList ++ Array.fill(totalKeyCount - oldKeyList.size) { keyPlaceHold }

(newKeyList, aggregate(row._2))

})

}).reduce(_ ++ _) // ¾ÛºÏ½á¹û

result

}

}

ÉÏÃæ´úÂëÉùÃ÷ÁËÒ»¸öÒþʽÀ࣬¾ßÓÐÒ»¸ö³ÉÔ±±äÁ¿rdd£¬ÀàÐÍÊÇRDD[(Array[String], T)]£¬ÄÇôÈç¹ûÓ¦ÓôúÂëÖгöÏÖÁËÈκÎÕâÑùµÄrdd¶ÔÏ󣬲¢ÇÒimportµ±Ç°µÄÒþʽת»»£¬ÄÇô±àÒëÆ÷¾Í»á½«Õâ¸örddµ±×öÉÏÃæµÄÒþʽÀàµÄ¶ÔÏó£¬Ò²¾Í¿ÉÒÔʹÓÃrollupº¯Êý£¬ºÍÒ»°ãµÄmap£¬filter·½·¨Ò»Ñù¡£

RDD²Ù×÷±Õ°üÍⲿ±äÁ¿Ô­Ôò

RDDÏà¹Ø²Ù×÷¶¼ÐèÒª´«Èë×Ô¶¨Òå±Õ°üº¯Êý£¨closure£©£¬Èç¹ûÕâ¸öº¯ÊýÐèÒª·ÃÎÊÍⲿ±äÁ¿£¬ÄÇôÐèÒª×ñÑ­Ò»¶¨µÄ¹æÔò£¬·ñÔò»áÅ׳öÔËÐÐʱÒì³£¡£±Õ°üº¯Êý´«Èëµ½½Úµãʱ£¬ÐèÒª¾­¹ýÏÂÃæµÄ²½Ö裺

1.Çý¶¯³ÌÐò£¬Í¨¹ý·´É䣬ÔËÐÐʱÕÒµ½±Õ°ü·ÃÎʵÄËùÓбäÁ¿£¬²¢·â³ÉÒ»¸ö¶ÔÏó£¬È»ºóÐòÁл¯¸Ã¶ÔÏó

2.½«ÐòÁл¯ºóµÄ¶ÔÏóͨ¹ýÍøÂç´«Êäµ½worker½Úµã

3.worker½Úµã·´ÐòÁл¯±Õ°ü¶ÔÏó

4.worker½ÚµãÖ´Ðбհüº¯Êý£¬

×¢Ò⣺Íⲿ±äÁ¿ÔÚ±Õ°üÄÚµÄÐ޸IJ»»á±»·´À¡µ½Çý¶¯³ÌÐò¡£

¼ò¶øÑÔÖ®£¬¾ÍÊÇͨ¹ýÍøÂ磬´«µÝº¯Êý£¬È»ºóÖ´ÐС£ËùÒÔ£¬±»´«µÝµÄ±äÁ¿±ØÐë¿ÉÒÔÐòÁл¯£¬·ñÔò´«µÝʧ°Ü¡£±¾µØÖ´ÐÐʱ£¬ÈÔÈ»»áÖ´ÐÐÉÏÃæËIJ½¡£

¹ã²¥»úÖÆÒ²¿ÉÒÔ×öµ½ÕâÒ»µã£¬µ«ÊÇÆµ·±µÄʹÓù㲥»áʹ´úÂë²»¹»¼ò½à£¬¶øÇҹ㲥Éè¼ÆµÄ³õÖÔÊǽ«½Ï´óÊý¾Ý»º´æµ½½ÚµãÉÏ£¬±ÜÃâ¶à´ÎÊý¾Ý´«Ê䣬Ìá¸ß¼ÆËãЧÂÊ£¬¶ø²»ÊÇÓÃÓÚ½øÐÐÍⲿ±äÁ¿·ÃÎÊ¡£

RDDÊý¾Ýͬ²½

RDDĿǰÌṩÁ½¸öÊý¾Ýͬ²½µÄ·½·¨£º¹ã²¥ºÍÀÛ¼ÆÆ÷¡£

¹ã²¥ broadcast

Ç°ÃæÌáµ½¹ý£¬¹ã²¥¿ÉÒÔ½«±äÁ¿·¢Ë͵½±Õ°üÖУ¬±»±Õ°üʹÓᣵ«ÊÇ£¬¹ã²¥»¹ÓÐÒ»¸ö×÷ÓÃÊÇͬ²½½Ï´óÊý¾Ý¡£±ÈÈçÄãÓÐÒ»¸öIP¿â£¬¿ÉÄÜÓм¸G£¬ÔÚmap²Ù×÷ÖУ¬ÒÀÀµÕâ¸öip¿â¡£ÄÇô£¬¿ÉÒÔͨ¹ý¹ã²¥½«Õâ¸öip¿â´«µ½±Õ°üÖУ¬±»²¢ÐеÄÈÎÎñÓ¦Ó᣹㲥ͨ¹ýÁ½¸ö·½ÃæÌá¸ßÊý¾Ý¹²ÏíЧÂÊ£º1£¬¼¯ÈºÖÐÿ¸ö½Úµã£¨ÎïÀí»úÆ÷£©Ö»ÓÐÒ»¸ö¸±±¾£¬Ä¬ÈϵıհüÊÇÿ¸öÈÎÎñÒ»¸ö¸±±¾£»2£¬¹ã²¥´«ÊäÊÇͨ¹ýBTÏÂÔØÄ£Ê½ÊµÏֵģ¬Ò²¾ÍÊÇP2PÏÂÔØ£¬ÔÚ¼¯Èº¶àµÄÇé¿öÏ£¬¿ÉÒÔ¼«´óµÄÌá¸ßÊý¾Ý´«ÊäËÙÂÊ¡£¹ã²¥±äÁ¿Ð޸ĺ󣬲»»á·´À¡µ½ÆäËû½Úµã¡£

ÀÛ¼ÓÆ÷ Accumulator

ÀÛ¼ÓÆ÷ÊÇÒ»¸öwrite-onlyµÄ±äÁ¿£¬ÓÃÓÚÀÛ¼Ó¸÷¸öÈÎÎñÖеÄ״̬£¬Ö»ÓÐÔÚÇý¶¯³ÌÐòÖУ¬²ÅÄÜ·ÃÎÊÀÛ¼ÓÆ÷¡£¶øÇÒ£¬½ØÖ¹µ½1.2°æ±¾£¬ÀÛ¼ÓÆ÷ÓÐÒ»¸öÒÑÖªµÄȱÏÝ£¬ÔÚaction²Ù×÷ÖУ¬n¸öÔªËØµÄRDD¿ÉÒÔÈ·±£ÀÛ¼ÓÆ÷Ö»ÀÛ¼Ón´Î£¬µ«ÊÇÔÚtransformationʱ£¬spark²»È·±££¬Ò²¾ÍÊÇÀÛ¼ÓÆ÷¿ÉÄܳöÏÖn+1´ÎÀÛ¼Ó¡£

ĿǰRDDÌṩµÄͬ²½»úÖÆÁ£¶ÈÌ«´Ö£¬ÓÈÆäÊÇת»»²Ù×÷ÖбäÁ¿×´Ì¬²»ÄÜͬ²½£¬ËùÒÔRDDÎÞ·¨×ö¸´ÔӵľßÓÐ״̬µÄÊÂÎñ²Ù×÷¡£²»¹ý£¬RDDµÄʹÃüÊÇÌṩһ¸öͨÓõIJ¢ÐмÆËã¿ò¼Ü£¬¹À¼ÆÓÀÔ¶Ò²²»»áÌṩϸÁ£¶ÈµÄÊý¾Ýͬ²½»úÖÆ£¬ÒòΪÕâÓëÆäÉè¼ÆµÄ³õÖÔÊÇÎ¥±³µÄ¡£

RDDÓÅ»¯¼¼ÇÉ

RDD»º´æ

ÐèҪʹÓöà´ÎµÄÊý¾ÝÐèÒªcache£¬·ñÔò»á½øÐв»±ØÒªµÄÖØ¸´²Ù×÷¡£¾Ù¸öÀý×Ó

val data = ¡­ // read from tdw

println(data.filter(_.contains("error")).count)

println(data.filter(_.contains("warning")).count)

ÉÏÃæÈý¶Î´úÂëÖУ¬data±äÁ¿»á¼ÓÔØÁ½´Î£¬¸ßЧµÄ×ö·¨ÊÇÔÚdata¼ÓÔØÍêºó£¬Á¢¿Ì³Ö¾Ã»¯µ½ÄÚ´æÖУ¬ÈçÏÂ

val data = ¡­ // read from tdw

data.cache

println(data.filter(_.contains("error")).count)

println(data.filter(_.contains("warning")).count)

ÕâÑù£¬dataÔÚµÚÒ»¼ÓÔØºó£¬¾Í±»»º´æµ½ÄÚ´æÖУ¬ºóÃæÁ½´Î²Ù×÷¾ùÖ±½ÓʹÓÃÄÚ´æÖеÄÊý¾Ý¡£

ת»»²¢Ðл¯

RDDµÄת»»²Ù×÷ʱ²¢Ðл¯¼ÆËãµÄ£¬µ«ÊǶà¸öRDDµÄת»»Í¬ÑùÊÇ¿ÉÒÔ²¢Ðеģ¬²Î¿¼ÈçÏÂ

val dataList:Array[RDD[Int]] = ¡­

val sumList = data.list.map(_.map(_.sum))

ÉÏÃæµÄÀý×ÓÖУ¬µÚÒ»¸ömapÊDZãÀûArray±äÁ¿£¬´®ÐеļÆËãÿ¸öRDDÖеÄÿÐеÄsum¡£ÓÉÓÚÿ¸öRDDÖ®¼ä¼ÆËãÊÇûÓÐÂß¼­ÁªÏµµÄ£¬ËùÒÔÀíÂÛÉÏÊÇ¿ÉÒÔ½«RDDµÄ¼ÆËã²¢Ðл¯µÄ£¬ÔÚscalaÖпÉÒÔÇáËÉÊÔÏ£¬ÈçÏÂ

val dataList:Array[RDD[Int]] = ¡­

val sumList = data.list.par.map(_.map(_.sum))

×¢ÒâºìÉ«´úÂë¡£

¼õÉÙshuffleÍøÂç´«Êä

Ò»°ã¶øÑÔ£¬ÍøÂçI/O¿ªÏúÊǺܴóµÄ£¬¼õÉÙÍøÂ翪Ïú£¬¿ÉÒÔÏÔÖø¼Ó¿ì¼ÆËãЧÂÊ¡£ÈÎÒâÁ½¸öRDDµÄshuffle²Ù×÷£¨joinµÈ£©µÄ´óÖ¹ý³ÌÈçÏ£¬

Óû§Êý¾ÝuserDataºÍʼþeventsÊý¾Ýͨ¹ýÓû§idÁ¬½Ó£¬ÄÇô»áÔÚÍøÂçÖд«µ½ÁíÍâÒ»¸ö½Úµã£¬Õâ¸ö¹ý³ÌÖУ¬ÓÐÁ½¸öÍøÂç´«Êä¹ý³Ì¡£SparkµÄĬÈÏÊÇÍê³ÉÕâÁ½¸ö¹ý³Ì¡£µ«ÊÇ£¬Èç¹ûÄã¶à¸æËßsparkһЩÐÅÏ¢£¬spark¿ÉÒÔÓÅ»¯£¬Ö»Ö´ÐÐÒ»¸öÍøÂç´«Êä¡£¿ÉÒÔͨ¹ýʹÓá¢HashPartition£¬ÔÚuserData"±¾µØ"ÏÈ·ÖÇø£¬È»ºóÒªÇóeventsÖ±½Óshuffleµ½userDataµÄ½ÚµãÉÏ£¬ÄÇô¾Í¼õÉÙÁËÒ»²¿·ÖÍøÂç´«Ê䣬¼õÉÙºóµÄЧ¹ûÈçÏ£¬

ÐéÏß²¿·Ö¶¼ÊÇÔÚ±¾µØÍê³ÉµÄ£¬Ã»ÓÐÍøÂç´«Êä¡£ÔÚÊý¾Ý¼ÓÔØÊ±£¬¾Í°´ÕÕkey½øÐÐpartition£¬ÕâÑù¿ÉÒÔ¾­Ò»²¿µÄ¼õÉÙ±¾µØµÄHashPartitionµÄ¹ý³Ì£¬Ê¾Àý´úÂëÈçÏ£º

val userData = sc.sequenceFile[UserID, UserInfo]("hdfs://¡­")

.partitionBy(new HashPartitioner(100)) // Create 100 partitions

.persist()

×¢Ò⣬ÉÏÃæÒ»¶¨Òªpersist£¬·ñÔò»áÖØ¸´¼ÆËã¶à´Î¡£100ÓÃÀ´Ö¸¶¨²¢ÐÐÊýÁ¿¡£

SparkÆäËû

Spark¿ª·¢Ä£Ê½

ÓÉÓÚsparkÓ¦ÓóÌÐòÊÇÐèÒªÔÚ²¿Êðµ½¼¯ÈºÉÏÔËÐе쬵¼Ö±¾µØµ÷ÊԱȽÏÂé·³£¬ËùÒÔ¾­¹ýÕâ¶Îʱ¼äµÄ¾­ÑéÀÛ»ý£¬×ܽáÁËÒ»Ì׿ª·¢Á÷³Ì£¬Ä¿µÄÊÇΪÁ˾¡¿ÉÄܵÄÌá¸ß¿ª·¢µ÷ÊÔЧÂÊ£¬Í¬Ê±±£Ö¤¿ª·¢ÖÊÁ¿¡£µ±È»£¬ÕâÌ×Á÷³Ì¿ÉÄÜÒ²²»ÊÇ×îÓŵģ¬ºóÃæÐèÒª³ÖÐø¸Ä½ø¡£

Õû¸öÁ÷³Ì±È½ÏÇå³þ£¬ÕâÀïÖ÷Ҫ̸̸ΪʲôÐèÒªµ¥Ôª²âÊÔ¡£¹«Ë¾ÄڵĴó¶àÊýÏîÄ¿£¬Ò»°ã²»Ìᳫµ¥Ôª²âÊÔ£¬¶øÇÒÓÉÓÚÏîÄ¿½ø¶ÈѹÁ¦£¬¿ª·¢ÈËÔ±»á·Ç³£µÖ´¥µ¥Ôª²âÊÔ£¬ÒòΪ»á»¨·Ñ"¶îÍâ"µÄ¾«Á¦¡£BugÕâ¶«Î÷²»»áÒòΪÏîÄ¿¸Ï½ø¶È¶øÏûʧ£¬¶øÇÒÇ¡ºÃÏà·´£¬¿ÉÄÜÒòΪ¸Ï½ø¶È£¬¶ø¸ßÓÚÆ½¾ùˮƽ¡£ËùÒÔ£¬Èç¹û²»»¨Ê±¼ä½øÐе¥Ôª²âÊÔ£¬ÄÇô»á»¨Í¬Ñù¶à£¬ÉõÖÁ¸ü¶àµÄʱ¼äµ÷ÊÔ¡£ºÜ¶àʱºò£¬ÍùÍùһЩºÜСµÄbug£¬È´µ¼ÖÂÄ㻨Á˺ܳ¤Ê±¼äÈ¥µ÷ÊÔ£¬¶øÕâЩbug£¬Ç¡ºÃÊǺÜÈÝÒ×ÔÚµ¥Ôª²âÊÔÖз¢Ïֵġ£¶øÇÒ£¬µ¥Ôª²âÊÔ»¹¿ÉÒÔ´øÀ´Á½¸ö¶îÍâµÄºÃ´¦£º1£©APIʹÓ÷¶Àý£»2£©»Ø¹é²âÊÔ¡£ËùÒÔ£¬»¹Êǵ¥Ôª²âÊÔ°É£¬ÕâÊÇÒ»±ÊͶ×Ê£¬¶øÇÒROI»¹Í¦¸ß£¡²»¹ý·²ÊÂÐèÒªÕÆÎշִ磬µ¥Ôª²âÊÔÓ¦¸Ã¸ù¾ÝÏîÄ¿½ôÆÈ³Ì¶Èµ÷ÕûÁ£¶È£¬×öµ½ÓÐËùΪ£¬ÓÐËù²»Îª¡£

SparkÆäËû¹¦ÄÜ

Ç°ÃæÌáµ½ÁËsparkÉú̬Ȧ£¬spark³ýÁ˺ËÐĵÄRDD£¬»¹ÌṩÁËÖ®ÉϵöºÜʹÓõÄÓ¦Óãº

1.Spark SQL: ÀàËÆhive£¬Ê¹ÓÃrddʵÏÖsql²éѯ

2.Spark Streaming: Á÷ʽ¼ÆË㣬Ìṩʵʱ¼ÆË㹦ÄÜ£¬ÀàËÆstorm

3.MLLib£º»úÆ÷ѧϰ¿â£¬Ìṩ³£Ó÷ÖÀ࣬¾ÛÀ࣬»Ø¹é£¬½»²æ¼ìÑéµÈ»úÆ÷ѧϰËã·¨²¢ÐÐʵÏÖ¡£

4.GraphX£ºÍ¼¼ÆËã¿ò¼Ü£¬ÊµÏÖÁË»ù±¾µÄͼ¼ÆË㹦ÄÜ£¬³£ÓÃͼËã·¨ºÍpregelͼ±à³Ì¿ò¼Ü¡£

ºóÃæÐèÒª¼ÌÐøÑ§Ï°ºÍʹÓÃÉÏÃæµÄ¹¦ÄÜ£¬ÓÈÆäÊÇÓëÊý¾ÝÍÚ¾òÇ¿Ïà¹ØµÄMLLib¡£

   
2510 ´Îä¯ÀÀ       29
Ïà¹ØÎÄÕÂ

»ùÓÚEAµÄÊý¾Ý¿â½¨Ä£
Êý¾ÝÁ÷½¨Ä££¨EAÖ¸ÄÏ£©
¡°Êý¾Ýºþ¡±£º¸ÅÄî¡¢ÌØÕ÷¡¢¼Ü¹¹Óë°¸Àý
ÔÚÏßÉ̳ÇÊý¾Ý¿âϵͳÉè¼Æ ˼·+Ч¹û
 
Ïà¹ØÎĵµ

GreenplumÊý¾Ý¿â»ù´¡Åàѵ
MySQL5.1ÐÔÄÜÓÅ»¯·½°¸
ijµçÉÌÊý¾ÝÖÐ̨¼Ü¹¹Êµ¼ù
MySQL¸ßÀ©Õ¹¼Ü¹¹Éè¼Æ
Ïà¹Ø¿Î³Ì

Êý¾ÝÖÎÀí¡¢Êý¾Ý¼Ü¹¹¼°Êý¾Ý±ê×¼
MongoDBʵս¿Î³Ì
²¢·¢¡¢´óÈÝÁ¿¡¢¸ßÐÔÄÜÊý¾Ý¿âÉè¼ÆÓëÓÅ»¯
PostgreSQLÊý¾Ý¿âʵսÅàѵ
×îл¼Æ»®
DeepSeekÔÚÈí¼þ²âÊÔÓ¦ÓÃʵ¼ù 4-12[ÔÚÏß]
DeepSeek´óÄ£ÐÍÓ¦Óÿª·¢Êµ¼ù 4-19[ÔÚÏß]
UAF¼Ü¹¹ÌåϵÓëʵ¼ù 4-11[±±¾©]
AIÖÇÄÜ»¯Èí¼þ²âÊÔ·½·¨Óëʵ¼ù 5-23[ÉϺ£]
»ùÓÚ UML ºÍEA½øÐзÖÎöÉè¼Æ 4-26[±±¾©]
ÒµÎñ¼Ü¹¹Éè¼ÆÓ뽨ģ 4-18[±±¾©]

MySQLË÷Òý±³ºóµÄÊý¾Ý½á¹¹
MySQLÐÔÄܵ÷ÓÅÓë¼Ü¹¹Éè¼Æ
SQL ServerÊý¾Ý¿â±¸·ÝÓë»Ö¸´
ÈÃÊý¾Ý¿â·ÉÆðÀ´ 10´óDB2ÓÅ»¯
oracleµÄÁÙʱ±í¿Õ¼äдÂú´ÅÅÌ
Êý¾Ý¿âµÄ¿çƽ̨Éè¼Æ

²¢·¢¡¢´óÈÝÁ¿¡¢¸ßÐÔÄÜÊý¾Ý¿â
¸ß¼¶Êý¾Ý¿â¼Ü¹¹Éè¼ÆÊ¦
HadoopÔ­ÀíÓëʵ¼ù
Oracle Êý¾Ý²Ö¿â
Êý¾Ý²Ö¿âºÍÊý¾ÝÍÚ¾ò
OracleÊý¾Ý¿â¿ª·¢Óë¹ÜÀí

GE Çø¿éÁ´¼¼ÊõÓëʵÏÖÅàѵ
º½Ìì¿Æ¹¤Ä³×Ó¹«Ë¾ Nodejs¸ß¼¶Ó¦Óÿª·¢
ÖÐÊ¢Òæ»ª ׿Խ¹ÜÀíÕß±ØÐë¾ß±¸µÄÎåÏîÄÜÁ¦
ijÐÅÏ¢¼¼Êõ¹«Ë¾ PythonÅàѵ
ij²©²ÊITϵͳ³§ÉÌ Ò×ÓÃÐÔ²âÊÔÓëÆÀ¹À
ÖйúÓÊ´¢ÒøÐÐ ²âÊÔ³ÉÊì¶ÈÄ£Ðͼ¯³É(TMMI)
ÖÐÎïÔº ²úÆ·¾­ÀíÓë²úÆ·¹ÜÀí