A Casual Look at Big Data Query Engines: The Architecture of Impala
 
Source: Jianshu. Published: 2017-6-13
 

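0x00 Preface

I have been using Impala recently, so I took the opportunity to study how it works.

This article first outlines what Impala is and which technologies it builds on, then walks through each of those technologies, and finally looks a little deeper at Impala itself.

What is Impala

  • An open-source database system
  • Executes queries like an MPP parallel database
  • Belongs to the Dremel family
  • Built on Hadoop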

0x01 MPP

I. The three server architectures: SMP, NUMA, MPP

From the point of view of system architecture, commercial servers fall roughly into three classes:

  • SMP: Symmetric Multi-Processor architecture,
  • NUMA: Non-Uniform Memory Access architecture,
  • MPP: Massive Parallel Processing architecture.

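SMP:

In a symmetric multi-processor architecture, the CPUs in a server work as peers, with no primary or subordinate roles. All CPUs share the same physical memory, and every CPU needs the same time to access any memory address, which is why SMP is also called Uniform Memory Access (UMA).

Drawback: the defining feature of an SMP server is sharing. Every resource in the system (CPU, memory, I/O and so on) is shared, and this is exactly what causes the main problem of SMP servers: their ability to scale is very limited.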

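NUMA:

Because SMP scales poorly, people looked for ways to scale effectively and build large systems, and NUMA is one result of that effort. With NUMA, dozens of CPUs (even more than a hundred) can be combined in a single server.

A NUMA server consists of several CPU modules. Each module contains several CPUs (for example 4) and has its own local memory, I/O slots and so on. The nodes are connected and exchange information through an interconnect module (for example a Crossbar Switch), so every CPU can access the memory of the whole system. Accessing local memory is obviously much faster than accessing remote memory (the memory of other nodes in the system), which is where the name non-uniform memory access comes from. Because of this, applications should minimize the information exchanged between different CPU modules in order to get the best performance out of the system.

Drawback: the latency of remote memory access is far higher than that of local memory, so system performance does not grow linearly as CPUs are added.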

MPP:

Unlike NUMA, MPP offers another way to scale a system. Several SMP servers (each one is called a node) are connected through a node interconnect network and cooperate on the same task, so that from the user's point of view they form a single server system. Each node accesses only its own local resources (memory, storage and so on). This is a completely shared-nothing (Share Nothing) structure, so it scales best of the three; in theory its scaling is unlimited.

In an MPP system every SMP node can also run its own operating system, database and so on. Unlike NUMA, however, there is no remote memory access problem: the CPUs in one node cannot access the memory of another node. Nodes exchange information over the node interconnect network, a process usually called data redistribution (Data Redistribution).
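
To make data redistribution more concrete, here is a minimal sketch in Python. It assumes rows are routed to nodes by hashing one column; the function names, the CRC32 hash and the sample rows are all illustrative choices, not taken from any particular MPP product.

```python
import zlib
from collections import defaultdict


def node_for(key: str, num_nodes: int) -> int:
    """Map a key to a node with a stable hash (CRC32 is an arbitrary choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def redistribute(rows, key_index: int, num_nodes: int):
    """Group rows by destination node; each node then works only on its share."""
    shares = defaultdict(list)
    for row in rows:
        shares[node_for(str(row[key_index]), num_nodes)].append(row)
    return dict(shares)


# Hypothetical (country, clicks) rows spread over three nodes.
rows = [("cn", 3), ("us", 5), ("cn", 7), ("de", 1), ("us", 2)]
shares = redistribute(rows, key_index=0, num_nodes=3)
for node, part in sorted(shares.items()):
    print(node, part)
```

Because all rows with the same key land on the same node, each node can aggregate or join its share using only local memory, which is the point of the shared-nothing design.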

II. MPP database

A database system built on the MPP architecture. Examples:

  • Greenplum
  • Vertica

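0x02 Dremel

Dremel is Google's "interactive" data analysis system. Google built Dremel to cut processing time down to seconds, as a strong complement to MapReduce. Dremel has been very successful as the reporting engine behind Google BigQuery.

The design of Dremel is described in Google's public paper "Dremel: Interactive Analysis of Web-Scale Datasets", which also includes some test results. Dremel has been in production at Google since 2006, and the paper was published in 2010.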

I. BigQuery

BigQuery lets users upload very large volumes of data and analyze them interactively right there, without having to invest in building their own data center.

II. Dremel features

  • Large scale. Cutting a task on a PB-sized data set down to seconds clearly requires massive parallelism.
  • A complement to MR, which is weak at interactive queries. It needs a file system such as GFS as its storage layer.
  • A nested data model. Dremel supports a nested data model similar to JSON.
  • Columnar storage, which reduces the amount of CPU work and disk access.
  • A multi-level serving tree. A relatively large and complex query is split into smaller, simpler queries that can run concurrently on a large number of nodes.
  • A SQL-like interface, like Hive and Pig.

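III. How Dremel works

A rough summary of some of Dremel's principles; there is still a lot I do not understand.

1.Columnar storage

By record: in record-oriented storage, the columns of one record are written contiguously, next to each other.

By column: in column-oriented storage, the data is separated by column. In other words, it is possible to scan only A.B.C without reading A.E or any other column.

Note: I do not know how scanning several columns efficiently at the same time is implemented.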
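
The difference between the two layouts can be sketched in a few lines of Python. This is a simplified illustration on flat records only (it does not model Dremel's nested encoding), and the field names and values are made up.

```python
# Three hypothetical flat records.
records = [
    {"id": 1, "country": "cn", "clicks": 10},
    {"id": 2, "country": "us", "clicks": 4},
    {"id": 3, "country": "cn", "clicks": 7},
]

# Record-oriented layout: all columns of one record sit next to each other.
row_store = [tuple(r.values()) for r in records]

# Column-oriented layout: one contiguous list per column.
column_store = {name: [r[name] for r in records] for name in records[0]}


def total_clicks_rows(rows):
    # Every full record is visited even though only one field is needed.
    return sum(row[2] for row in rows)


def total_clicks_columns(columns):
    # Only the "clicks" column is read; "id" and "country" are never touched.
    return sum(columns["clicks"])


print(total_clicks_rows(row_store), total_clicks_columns(column_store))
```

Both functions return the same total, but on disk the columnar version would read roughly one third of the bytes here, which is where the savings in I/O come from.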

2.Data model

At Google, Protocol Buffers are commonly used for serialization. The data model can be stated formally as follows:

t=dom|<A1:t[*|?],...,An:t[*|?]>

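Here t is either a basic type or a composite type. Basic types include integer, float and string. A composite type is assembled from several fields, each of which is itself a type t. The asterisk (*) means a field can be repeated, like an array. The question mark (?) means a field is optional. Put simply, apart from not having a Map, it is almost the same as JSON.

The figure below is an example. The schema defines a composite type Document. It has a required column DocId, an optional column Links, and an array column Name. The Code column can be referred to as Name.Language.Code.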
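
The recursive definition above can be sketched as a small Python data structure. This is only an illustration: the `Field` class and `leaf_paths` helper are hypothetical names, the atomic types and the labels on Language and Code are assumptions, and the inner fields of Links are left out because the text does not list them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    name: str
    label: str                          # "required", "optional" (?) or "repeated" (*)
    atomic_type: Optional[str] = None   # set for basic types, None for groups
    children: List["Field"] = field(default_factory=list)


def leaf_paths(f: Field, prefix: str = "") -> List[str]:
    """Return the dotted path of every leaf below f; each leaf is one column."""
    path = f"{prefix}.{f.name}" if prefix else f.name
    if not f.children:
        return [path]
    paths = []
    for child in f.children:
        paths.extend(leaf_paths(child, path))
    return paths


# Top-level fields of Document, limited to the ones named in the text.
document = [
    Field("DocId", "required", atomic_type="int64"),        # type is an assumption
    Field("Links", "optional"),                             # inner fields omitted
    Field("Name", "repeated", children=[
        Field("Language", "repeated", children=[            # label is an assumption
            Field("Code", "required", atomic_type="string"),  # assumption
        ]),
    ]),
]

columns = [p for f in document for p in leaf_paths(f)]
print(columns)
```

Each dotted path corresponds to one column in the columnar layout, which is why Name.Language.Code can be addressed on its own.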

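This data format is language-independent and platform-independent. An MR program written in Java can produce it, and a C++ program can then read it. For this kind of columnar storage, fast and generic processing is also very important.

The figure above is the abstract model of a sample data set; the figure below shows how the same data is actually stored in Dremel.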

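3.Serving tree structure

The figure below sketches Dremel's serving tree architecture.

root server: at the top there is a single root server. It receives the user's query, finds the tables involved in the SQL statement, reads their metadata, rewrites the original query and pushes it down to the next level (the intermediate servers). It also receives the results returned by the intermediate servers, performs the global aggregation and returns the result to the user.

intermediate servers: an intermediate server rewrites the query passed down from the level above and pushes it further down, until it reaches the leaf servers at the bottom. After receiving results from the leaf servers it performs operations such as partial aggregation and finally returns the result to the root server.

leaf servers: a leaf server can access the storage layer or read the local disk directly. It executes the SQL statement assigned to it by scanning local data, and once it has a result, the result is passed back up the serving tree level by level.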

An example:

stage1: given the request:

SELECT A, COUNT(B) FROM T GROUP BY A

stage2: the root server receives the request, looks up in the metadata all tablets (sub-tables) of table T and the servers that hold them, and rewrites the query as follows:

SELECT A, SUM(c) FROM (R1 UNION ALL ... Rn) GROUP BY A

Here R1 through Rn are the results returned to the root server by server nodes 1 through n.

stage3: the query against each tablet.

Ri = SELECT A, COUNT(B) AS c FROM Ti GROUP BY A

The partial result sets are usually much smaller than the raw data and faster to process, so the root server can combine them quickly. For the aggregation itself, existing parallel database techniques can be used.
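
The three stages can be simulated in plain Python to check that the rewrite gives the same answer as the original query. This is a sketch under simple assumptions: each tablet Ti is an in-memory list of (A, B) rows, None stands for SQL NULL, and the function names and sample data are made up.

```python
from collections import Counter


def leaf_query(tablet):
    """Ri = SELECT A, COUNT(B) AS c FROM Ti GROUP BY A (COUNT skips NULLs)."""
    partial = Counter()
    for a, b in tablet:
        partial[a] += 0 if b is None else 1
    return partial


def root_query(partials):
    """SELECT A, SUM(c) FROM (R1 UNION ALL ... Rn) GROUP BY A."""
    total = Counter()
    for partial in partials:
        for a, c in partial.items():
            total[a] += c
    return dict(total)


# Table T split into three hypothetical tablets T1..T3.
tablets = [
    [("x", 1), ("y", 2), ("x", None)],
    [("x", 3), ("z", None)],
    [("y", 4), ("y", 5)],
]

result = root_query(leaf_query(t) for t in tablets)
print(result)
```

The root only sees one small (A, c) pair per group per tablet, never the raw rows. Note that COUNT is rewritten as SUM at the upper level; summing partial counts is what makes the two forms equivalent.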

0x03 Impala

I. Main components

1.Impala Daemon

The core Impala component is a daemon process that runs on each DataNode of the cluster, physically represented by the impalad process.

The core component of Impala is the impalad daemon (Impala daemon) running on every node. It reads and writes data files; accepts queries sent from impala-shell, Hue, JDBC, ODBC and other interfaces; parallelizes each query and distributes the work across the nodes of the Impala cluster; and sends locally computed results to the coordinator node.

You can submit a query to the Impala daemon on any node. That node becomes the coordinator node for the query, the other nodes send their partial result sets to it, and the coordinator builds the final result set. For experiments and tests it is convenient to always connect to the same Impala daemon, but when running production-grade applications you should submit queries to the different nodes in round-robin order, so that the load on the cluster stays balanced.
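
A client-side round-robin choice of coordinator can be sketched as follows. The class name and the host names are hypothetical, and the sketch only picks a host; actually opening a connection to the chosen daemon is left out on purpose.

```python
from itertools import cycle


class RoundRobinCoordinators:
    """Hand out impalad hosts in a fixed rotation, one per submitted query."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one impalad host is required")
        self._hosts = cycle(hosts)

    def next_host(self):
        return next(self._hosts)


# Hypothetical host names; replace with the daemons of a real cluster.
picker = RoundRobinCoordinators(["impalad-1", "impalad-2", "impalad-3"])
order = [picker.next_host() for _ in range(5)]
print(order)
```

Each query is then sent to the host returned by `next_host()`, so no single daemon ends up coordinating every query. A load-balancing proxy placed in front of the daemons could play the same role without any client-side logic.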

The Impala daemons communicate continuously with the statestore to determine which nodes are healthy and able to accept new work. They also receive broadcast messages from the catalogd daemon (available since Impala 1.2) to update their metadata. A broadcast is triggered whenever any node in the cluster creates, alters or drops any object, or executes INSERT or LOAD DATA.

2.Impala Statestore

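The Impala Statestore checks the health of the Impala daemons on all nodes of the cluster and continuously relays the findings to each Impala daemon. The physical process behind this service is called statestored, and only one such process is needed in the whole cluster. If an Impala node goes offline because of a hardware fault, a software fault or any other reason, the statestore notifies the other nodes so that they stop sending requests to the offline node.

Because the statestore only serves to notify the cluster when a node has a problem, it is not critical to the Impala cluster. If the statestore is not running or has failed, the other nodes and distributed work carry on as usual; the cluster is merely less robust when a node drops out. Once the statestore is back up, it resumes communicating with the other nodes and monitoring them.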
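
The notification role described above can be illustrated with a toy model. This is not Impala's actual protocol: the heartbeat mechanism, the timeout, and all class, function and node names are assumptions made up for the sketch.

```python
class ToyStatestore:
    """Toy health tracker: a node is healthy if it was heard from recently."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_seen = {}  # node name -> time of last heartbeat

    def heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def healthy_nodes(self, now: float):
        return sorted(
            n for n, t in self.last_seen.items() if now - t <= self.timeout_s
        )


def schedulable(candidates, membership):
    """What a daemon does with the broadcast: skip nodes reported offline."""
    return [n for n in candidates if n in membership]


store = ToyStatestore(timeout_s=10.0)
for node in ("node-a", "node-b", "node-c"):
    store.heartbeat(node, now=0.0)
# node-b goes silent; the other two keep reporting.
store.heartbeat("node-a", now=12.0)
store.heartbeat("node-c", now=12.0)

membership = store.healthy_nodes(now=15.0)
targets = schedulable(["node-a", "node-b", "node-c"], membership)
print(membership, targets)
```

If the tracker itself stops, a daemon in this model simply keeps using the last `membership` list it received. That matches the behaviour described above: work continues, but a node that fails in the meantime is not noticed.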

3.Impala Catalog

The Impala catalog service relays metadata changes made by SQL statements to every node in the cluster. The physical process behind the catalog service is called catalogd, and only one such process is needed in the whole cluster. Because its requests go through the statestore daemon, it is best to run statestored and catalogd on the same node.

The catalog service reduces the need for REFRESH and INVALIDATE METADATA statements. In earlier versions, after running CREATE DATABASE, DROP DATABASE, CREATE TABLE, ALTER TABLE or DROP TABLE on one node, you had to run INVALIDATE METADATA on each of the other nodes to make sure their metadata was up to date. Likewise, after running an INSERT statement on one node, you had to run REFRESH table_name on another node before querying there, so that the new data files would be recognized.

II. Impala's query processing flow

The figure shows how Impala processes a query.

III. Query plan

An example:

select count(*) from trace.apptalk

The generated execution plan:

----------------
Estimated Per-Host Requirements: Memory=1.13GB VCores=1
WARNING: The following tables are missing relevant table and/or column statistics.
trace.apptalk

F01:PLAN FRAGMENT [UNPARTITIONED]
03:AGGREGATE [FINALIZE]
| output: count:merge(*)
| hosts=8 per-host-mem=unavailable
| tuple-ids=1 row-size=8B cardinality=1
|
02:EXCHANGE [UNPARTITIONED]
hosts=8 per-host-mem=unavailable
tuple-ids=1 row-size=8B cardinality=1

F00:PLAN FRAGMENT [RANDOM]
DATASTREAM SINK [FRAGMENT=F01, EXCHANGE=02, UNPARTITIONED]
01:AGGREGATE
| output: count(*)
| hosts=8 per-host-mem=10.00MB
| tuple-ids=1 row-size=8B cardinality=1
|
00:SCAN HDFS [trace.apptalk, RANDOM]
partitions=88/88 files=17578 size=67.11MB
table stats: unavailable
column stats: all
hosts=8 per-host-mem=1.13GB
tuple-ids=0 row-size=0B cardinality=unavailable
----------------
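
Read bottom-up, the plan has the same two-level shape as the Dremel example: fragment F00 scans HDFS and computes a partial count(*) on each of the 8 hosts, and fragment F01 receives those partial counts through the exchange and merges them. The sketch below pulls the operators out of an excerpt of the plan text. The regular expressions are assumptions based only on the sample shown here, not a description of Impala's plan format in general.

```python
import re

# Excerpt of the plan above: fragment headers and operator lines only.
PLAN = """\
F01:PLAN FRAGMENT [UNPARTITIONED]
03:AGGREGATE [FINALIZE]
| output: count:merge(*)
02:EXCHANGE [UNPARTITIONED]
F00:PLAN FRAGMENT [RANDOM]
01:AGGREGATE
| output: count(*)
00:SCAN HDFS [trace.apptalk, RANDOM]
"""

FRAGMENT = re.compile(r"^(F\d+):PLAN FRAGMENT")
OPERATOR = re.compile(r"^(\d+):([A-Z]+(?: [A-Z]+)*)")


def operators_by_fragment(plan_text):
    """Return {fragment id: [(operator id, operator name), ...]} in plan order."""
    result, current = {}, None
    for line in plan_text.splitlines():
        line = line.lstrip("| ")
        frag = FRAGMENT.match(line)
        if frag:
            current = frag.group(1)
            result[current] = []
            continue
        op = OPERATOR.match(line)
        if op and current is not None:
            result[current].append((op.group(1), op.group(2)))
    return result


plan = operators_by_fragment(PLAN)
for fragment, ops in plan.items():
    print(fragment, ops)
```

The operator ids run from 00 at the bottom to 03 at the top, which is the order in which data flows through the plan.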

   