What Is Cloudera Kudu?
 
×÷Õߣº ´óÊý¾ÝºÍAIÌɹýµÄ¿Ó
 
2020-4-23
 
Editor's recommendation:
This is an introductory article aimed at newcomers. It covers what Kudu is and what it consists of, and closes with a short case study.
This article comes from cnblogs and was edited and recommended by Anna of 火龙果软件.

What is Cloudera Kudu?

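Kudu is a fast, distributed, columnar storage database that Cloudera began developing in secret in 2012, positioned between HDFS and HBase. It combines the real-time access of HBase, the high throughput of HDFS, and the SQL support of a traditional database (through engines such as Impala). As a storage system that sits between real-time and offline workloads, its position resembles that of Spark among compute engines. If MapReduce + HDFS is the standard stack for offline computing and Storm + HBase the standard stack for real-time computing, then Spark + Kudu may become one of the most competitive architectures of the future.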

That is, a Kafka -> Spark -> Kudu pipeline. Whether this architecture will become widespread remains to be seen.

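Kudu is a new columnar storage system open-sourced by Cloudera and one of the newer members of the Apache Hadoop ecosystem (it entered as an Apache Incubator project). It is built for fast analytics on rapidly changing data, filling a gap in the earlier Hadoop storage layer.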

Kudu's development was led by Todd Lipcon at Cloudera. Its overall usage pattern is close to HBase's: it supports row-level random reads and writes as well as bulk sequential scans.

Kudu is a columnar storage manager developed for the Apache Hadoop platform. It shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and supports highly available operation.

Kudu's goals are: fast analytics over full datasets together with real-time processing; full use of modern CPU and I/O resources; support for data updates; and a simple, extensible data model.

The Kudu website

A new addition to the open source Apache Hadoop ecosystem, Apache Kudu completes Hadoop's storage layer to enable fast analytics on fast data.

Background: A Functional Gap

The Hadoop ecosystem has many components, each with a different function. In practice, users often have to deploy several Hadoop tools together to solve a single problem, an arrangement known as a hybrid architecture. For example, users rely on HBase's fast inserts and fast random-access reads to ingest data; HBase also lets users modify data and answers large numbers of small queries quickly. At the same time, users run HDFS/Parquet + Impala/Hive to query and analyze very large datasets, a scenario in which a columnar file format such as Parquet has a great advantage.

Many companies have successfully deployed the HDFS/Parquet + HBase hybrid architecture, but it is complex and hard to maintain. First, users load data into HBase with an ingest tool such as Flume or Kafka, and may then modify the data in HBase. Next, at regular intervals (daily or weekly), the data is exported from HBase to Parquet files and placed on HDFS as a new partition. Finally, a query engine such as Impala runs queries over it to produce the final reports.

Such a toolchain is tedious and complex, and it raises many problems, for example:

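(1) How should a failure in any one stage be handled?

(2) How often should data be exported from HBase to files?

(3) When the final report is generated, the most recent data is not reflected in the query results.

(4) How can critical jobs be kept from failing during cluster maintenance?

(5) Parquet files are immutable, so when historical data is deleted or modified in HBase, manual intervention is often needed to resynchronize.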
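
The freshness problems in points (3) and (5) can be illustrated with a toy model. The sketch below is not HBase or Parquet code: `HybridPipeline` and its methods are hypothetical names, and plain Python dictionaries stand in for the two storage layers, with each export simplified to a full frozen copy.

```python
from dataclasses import dataclass, field


@dataclass
class HybridPipeline:
    """Toy model: a mutable serving store plus immutable exported snapshots."""

    serving_store: dict = field(default_factory=dict)  # stands in for HBase
    snapshots: list = field(default_factory=list)      # stands in for Parquet exports

    def ingest(self, key, value):
        # Inserts and updates are visible in the serving store immediately.
        self.serving_store[key] = value

    def export_snapshot(self):
        # Periodic batch job: freeze the current state as a new export.
        self.snapshots.append(dict(self.serving_store))

    def report(self):
        # Analytic queries only see the latest exported snapshot.
        return self.snapshots[-1] if self.snapshots else {}


pipeline = HybridPipeline()
pipeline.ingest("user-1", 10)
pipeline.export_snapshot()      # the scheduled export runs
pipeline.ingest("user-2", 20)   # arrives after the export
pipeline.ingest("user-1", 11)   # correction to data that was already exported

stale_report = pipeline.report()
print(stale_report)             # {'user-1': 10}
```

Neither the new record nor the correction reaches the report until the next export runs, which is the gap a single store serving both workloads is meant to close.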

At this point users want an elegant storage solution that handles different kinds of workloads while keeping high compute performance. Cloudera recognized this problem early: it began planning Kudu in 2012 and released it as open source in 2015. Kudu complements HDFS and HBase. It provides fast analytics and real-time computation, makes full use of CPU and I/O resources, supports in-place data modification, and offers a simple, extensible data model.

Background: New Hardware

RAM technology has advanced quickly: it keeps getting cheaper and larger. According to Cloudera's customer data, servers deployed by its customers had only 32 GB of RAM per node in 2012, and now have 128 GB or 256 GB per node. Storage devices have improved just as fast, and SSDs are now common even in ordinary servers. HBase, HDFS, and other Hadoop tools keep evolving to keep up with hardware upgrades. Fundamentally, however, HDFS is based on GFS (paper published in 2003) and HBase on Bigtable (paper published in 2006), and at that time the system bottleneck was mainly the speed of the underlying disks. When disks are slow, the CPU is underutilized because it is waiting on disk; once disks become faster, CPU utilization rises and the CPU often becomes the bottleneck instead. Because HBase and HDFS are old designs, their basic architecture is hard to change, whereas Kudu is a new design and can therefore make fuller use of RAM and I/O resources and optimize CPU utilization. In short, compared with earlier systems, Kudu uses less CPU, drives I/O harder, and makes fuller use of RAM.

1. Introduction to Kudu

Kudu was designed from the outset to address the following goals:

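High performance for both data scans and random access, which simplifies users' complex hybrid architectures;

High CPU efficiency, so that users get the maximum return on their spending on modern processors;

High I/O performance, making full use of modern storage media;

In-place data updates, avoiding extra data processing and data movement.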
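
To make the combination of scans, random access, and in-place updates concrete, here is a minimal in-memory sketch. It is a toy and not how Kudu stores data: `ToyColumnTable` and its methods are hypothetical names, and the only idea borrowed from the text is that values are kept column by column while rows stay addressable by a primary key.

```python
class ToyColumnTable:
    """Toy table stored column by column with a primary-key index."""

    def __init__(self, column_names):
        self.columns = {name: [] for name in column_names}
        self.row_of_key = {}  # primary key -> row position

    def upsert(self, key, **values):
        if key in self.row_of_key:
            # In-place update: overwrite cells of the existing row.
            row = self.row_of_key[key]
            for name, value in values.items():
                self.columns[name][row] = value
        else:
            # Insert: append one cell to every column.
            self.row_of_key[key] = len(self.row_of_key)
            for name in self.columns:
                self.columns[name].append(values.get(name))

    def get(self, key):
        # Random access: one index lookup, then one cell per column.
        row = self.row_of_key[key]
        return {name: cells[row] for name, cells in self.columns.items()}

    def scan_sum(self, column):
        # Analytic scan: touches only the requested column.
        return sum(self.columns[column])


table = ToyColumnTable(["city", "amount"])
table.upsert(1, city="Beijing", amount=30)
table.upsert(2, city="Shanghai", amount=12)
table.upsert(1, amount=35)        # update row 1 in place

print(table.get(1))               # {'city': 'Beijing', 'amount': 35}
print(table.scan_sum("amount"))   # 47
```

The update changes one cell without rewriting a file or re-exporting a partition, and the aggregate reads a single column rather than whole rows.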

2. Kudu Supports Cross-Data-Center Replication

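Many of Kudu's features resemble HBase's: it supports lookups and modifications by indexed key. Cloudera considered building on HBase, but concluded that the required changes to HBase would be very large, since Kudu's data model and on-disk storage both differ from HBase's. HBase itself already serves a large number of other use cases successfully, so modifying it would likely have been a thankless effort. In the end Cloudera decided to develop an entirely new storage system.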

3. Kudu's External Interfaces

Kudu provides C++ and Java APIs for reading and writing data one row at a time or in batches, and for creating and altering schemas. Beyond that, Kudu integrates with other tools in the Hadoop ecosystem. As of the Kudu beta release, Impala support is fairly complete: most operations, such as creating tables and deleting or modifying data, can be done through Impala. Kudu also implements KuduTableInputFormat and KuduTableOutputFormat to support MapReduce reads and writes, with support for data locality. Spark support is not yet complete at that stage: Spark can only read data.
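
The difference between single-row and batched writes can be sketched without a cluster. The code below does not use the Kudu client: `ToySession`, `apply`, and `flush` are hypothetical names for this illustration, a dictionary stands in for the remote table, and a counter stands in for network round trips.

```python
class ToySession:
    """Toy write session: buffers row operations and sends them in batches."""

    def __init__(self, table, batch_size=1):
        self.table = table            # dict standing in for a remote table
        self.batch_size = batch_size  # 1 means every write is sent on its own
        self.pending = []
        self.round_trips = 0

    def apply(self, key, row):
        self.pending.append((key, row))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        # One simulated network round trip carries the whole buffer.
        self.table.update(self.pending)
        self.pending.clear()
        self.round_trips += 1


rows = {key: {"amount": key * 10} for key in range(1, 7)}

single = ToySession(table={}, batch_size=1)
batched = ToySession(table={}, batch_size=3)
for session in (single, batched):
    for key, row in rows.items():
        session.apply(key, row)
    session.flush()                   # send anything still buffered

print(single.round_trips, batched.round_trips)   # 6 2
```

Both sessions end with the same table contents; batching only changes how many round trips it takes to get there.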

4. Nodes

Kudu-master: the master node. It maintains table metadata and tracks and coordinates the state and data of all tservers. Deploy an odd number of masters (at least three).

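Kudu-tserver: the worker node, which stores the actual table data. A table's data can have multiple replicas, but only the leader replica handles write requests, while both the leader and followers can handle read requests. Deploy at least three tservers.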
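
The advice to run an odd number of nodes, and at least three, follows from majority-based replication (Kudu uses the Raft consensus algorithm for this). The arithmetic can be sketched as follows; the function names are illustrative and not part of any Kudu API.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority remains."""
    return replicas - majority(replicas)


def write_is_committed(replicas: int, reachable: int) -> bool:
    # The leader counts itself; a write commits once a majority holds it.
    return reachable >= majority(replicas)


for n in (1, 2, 3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
# 1 1 0
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

Three replicas is the smallest group that survives one failure, and going from three to four adds a machine without adding any fault tolerance, which is why even counts are avoided.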

Use Case: Xiaomi

Xiaomi is used as the example here because it was among the earliest adopters of Kudu.

Xiaomi is a heavy HBase user, with about 5 billion user records per day. Xiaomi has also been running an HDFS + HBase hybrid architecture. That pipeline is relatively complex, and its data is stored across SequenceFile, HBase, and Parquet.

Once Kudu is adopted, it serves as a unified data warehouse that supports both offline analysis and real-time interactive analysis.

 

 

 
   