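Editor's recommendation:
This article explains why a real-time data warehouse is worth building, the real-time data warehouse solutions used at Cainiao, Zhihu, Meituan and NetEase, the strengths and weaknesses of the open-source OLAP databases, and how to approach technology selection.
This article comes from Zhihu and was edited and recommended by Anna of 火龙果软件.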
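In today's flourishing open-source landscape, the industry already has mature approaches to building a real-time data warehouse. For real-time computing and message queues there are clear best choices. Only in the OLAP field do many contenders compete, each with its own strengths.
Open-source OLAP engines in the big data field include, but are not limited to, Hive, Hawq, Presto, Kylin, Impala, SparkSQL, Druid, ClickHouse and Greenplum. We compare the strengths, weaknesses and use cases of the commonly used open-source OLAP engines in detail, so that developers can make technology choices with confidence.
Preface
This year real-time data warehouses suddenly became a topic everyone pays attention to. I have written and reposted several articles and solutions about real-time data warehouses on my own public account.
But there is no need to chase real-time data warehouses fanatically.
First, there is almost no technical difficulty left: with powerful open-source middleware, meeting the requirements of a real-time data warehouse is no longer that hard. Second, a real-time data warehouse always evolves together with the business, so it is wrong to assert that the Kappa architecture is necessarily the best real-time data warehouse architecture. In practice, as the business grows, the choice of warehouse architecture becomes much less of an either-or decision.
Throughout the construction of a real-time data warehouse, the choice of OLAP database directly constrains the warehouse's availability and functionality. This article starts from how several typical companies in the industry built and evolved their warehouses, then analyzes the open-source OLAP engines currently on the market in terms of architecture, technology selection, and strengths and weaknesses, so that readers can choose according to their actual business needs.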
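A glimpse of the field: real-time data warehouse construction at Cainiao, Zhihu, Meituan and NetEase Yanxuan
Why build a real-time data warehouse
A traditional offline data warehouse stores business data centrally, then runs ETL and other modeling on a schedule with fixed computation logic to produce reports and other applications. An offline warehouse mainly builds T+1 data: scheduled jobs pull the incremental data every day, build subject-oriented dimension data for each business area, and expose a T+1 query interface. Both computation and data are far from real time, so business staff cannot get data from a few minutes ago when they need it. The value of data decays as time passes, so data must reach its users as soon as possible after it is produced. The need for real-time data warehouses arose from this.
In one sentence: the requirement is timeliness.
Real-time data warehouse design at Alibaba Cainiao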

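The overall design of Cainiao's real-time data warehouse is shown in the figure above. It is built on data from the business systems. The data model is the traditional layered summary design (detail / light summary / high summary). The compute engine is Blink, Alibaba's internal engine. Data access goes through Tiangong (a tool that connects to many kinds of data sources, meant to shield applications from large numbers of direct connections to various databases). The data applications are Cainiao's individual businesses.
The architecture of Cainiao's real-time data warehouse is a typical design that has stood the test of practice. Real-time data is ingested through message middleware (in open-source big data this role belongs to Kafka, with Pulsar as a rising newcomer), and HBase serves as an auxiliary K-V store for queries on highly summarized data.
So where does most of the direct support for the business come from? From ADS.
ADS (later renamed ADB, with new features added) is a cloud database developed in-house by Alibaba for real-time, high-concurrency online analysis (Realtime OLAP) of massive data.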

A classic real-time data cleansing scenario

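A classic real-time data warehouse scenario
The official ADB documentation lists ADB's capabilities:
Fast: ADB uses a fused MPP+DAG engine together with hybrid row-column storage and automatic indexing, and can scale out quickly to several thousand nodes.
Flexible: the number of nodes can be adjusted at will, and instance specifications can be scaled up or down dynamically.
Easy to use: fully compatible with the MySQL protocol and SQL.
Ultra-large scale: a fully distributed structure with no single point of failure, which makes it easy to scale out and raise SQL processing concurrency.
High-concurrency writes: a small deployment offers 100,000 TPS of write capacity, which can be raised to 2,000,000+ TPS by adding nodes. Data can be queried and analyzed about 1 second after it is written. A single table supports up to 2 PB of data and 10 trillion records.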
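Real-time data warehouse design at Zhihu
Zhihu's real-time data warehouse practice and architecture evolved in three stages:
Real-time data warehouse 1.0. Theme: making ETL logic real time. Technical solution: Spark Streaming.
Real-time data warehouse 2.0. Theme: data layering and real-time metric computation. Technical solution: Flink Streaming.
Future outlook for the real-time data warehouse: a Streaming SQL platform, systematic metadata management, and automated result acceptance.

Real-time data warehouse 1.0

Real-time data warehouse 2.0
On the architecture side, version 2.0 adds a metric summary layer. This layer is derived from the detail layer or the detail summary layer by aggregation, and it produces the vast majority of the real-time warehouse's metrics. This is also the biggest difference from real-time data warehouse 1.0.
On the technology side, Zhihu chose HBase and Redis as storage engines for real-time metrics depending on the business scenario, and chose Druid as its OLAP engine.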

Zhihu's real-time multidimensional analysis platform

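Druid overall architecture
Druid is an efficient data query system that mainly addresses aggregation queries over large volumes of time-series data. Data can be ingested in real time and is queryable as soon as it enters Druid, and the data is almost immutable. The data is usually fact events ordered by time: once a fact has occurred and entered Druid, external systems can query it. Druid's architecture:
A shared-nothing architecture combined with a lambda architecture
Druid's three design principles:
Fast queries: partial aggregation (Partial Aggregate) + in-memory processing (In-Memory) + indexing (Index)
Horizontal scalability: distributed data (Distributed data) + parallelizable queries (Parallelizable Query)
Real-time analysis: Immutable Past, Append-Only Future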
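The "partial aggregation" principle can be illustrated with a small sketch. This is not Druid's implementation; it only shows the idea of rolling raw events up into one row per time bucket and dimension combination at ingestion time, so that queries scan far fewer rows. The field names (`ts`, `city`, `channel`, `amount`) and the one-hour granularity are assumptions made for the example.

```python
from collections import defaultdict

def rollup(events, granularity_seconds=3600):
    """Pre-aggregate raw events into (time bucket, dimensions) -> metrics."""
    buckets = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for event in events:
        # Truncate the epoch timestamp to the start of its time bucket.
        bucket_start = event["ts"] - event["ts"] % granularity_seconds
        key = (bucket_start, event["city"], event["channel"])
        buckets[key]["count"] += 1
        buckets[key]["amount"] += event["amount"]
    return dict(buckets)

# Hypothetical raw events; timestamps are epoch seconds.
events = [
    {"ts": 1_600_000_000, "city": "Beijing", "channel": "app", "amount": 10.0},
    {"ts": 1_600_000_900, "city": "Beijing", "channel": "app", "amount": 5.0},
    {"ts": 1_600_003_700, "city": "Beijing", "channel": "app", "amount": 2.5},
    {"ts": 1_600_000_100, "city": "Shanghai", "channel": "web", "amount": 7.0},
]
rolled = rollup(events)
print(len(events), "raw events ->", len(rolled), "rolled-up rows")
```

The first two Beijing events fall into the same hour and collapse into a single row. The coarser the granularity and the fewer the dimensions, the stronger the reduction, at the cost of losing row-level detail.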
ÃÀÍŵÄʵʱÊý²ÖÉè¼Æ

ÃÀÍÅʵʱÊý²ÖÊý¾Ý·Ö²ã¼Ü¹¹
ÃÀÍŵļ¼Êõ·½°¸ÓÉÒÔÏÂËIJ㹹³É£º
ODS ²ã£ºBinlog ºÍÁ÷Á¿ÈÕÖ¾ÒÔ¼°¸÷ÒµÎñʵʱ¶ÓÁС£
Êý¾ÝÃ÷ϸ²ã£ºÒµÎñÁìÓòÕûºÏÌáÈ¡ÊÂʵÊý¾Ý£¬ÀëÏßÈ«Á¿ºÍʵʱ±ä»¯Êý¾Ý¹¹½¨ÊµÊ±Î¬¶ÈÊý¾Ý¡£
Êý¾Ý»ã×ܲ㣺ʹÓÿí±íÄ£ÐͶÔÃ÷ϸÊý¾Ý²¹³äά¶ÈÊý¾Ý£¬¶Ô¹²ÐÔÖ¸±ê½øÐлã×Ü¡£
App ²ã£ºÎªÁ˾ßÌåÐèÇó¶ø¹¹½¨µÄÓ¦Óò㣬ͨ¹ý RPC ¿ò¼Ü¶ÔÍâÌṩ·þÎñ¡£
¸ù¾Ý²»Í¬ÒµÎñ³¡¾°£¬ÊµÊ±Êý²Ö¸÷¸öÄ£ÐͲã´ÎʹÓõĴ洢·½°¸ºÍOLAPÒýÇæÈçÏ£º

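Data detail layer: for dimension data, the lookup rate in some scenarios can reach 100,000+ TPS. We chose Cellar (Meituan's internal distributed K-V storage system, similar to Redis) as the store and wrapped it in a dimension service that supplies dimension data to the real-time data warehouse.
Data summary layer: for common summary metrics that need to be joined with historical data, we use the same approach as for dimension data, with Cellar as the store and the join performed through a service.
Data application layer: the application layer design is relatively complex. After comparing several storage solutions, we settled on a read/write rate of 1000 QPS as the dividing line. Real-time applications whose average read/write rate exceeds 1000 QPS but whose queries are not very complex, such as merchants' real-time business data, use Cellar as the store and expose a real-time data service. For applications with complex queries or that need detail lists, Elasticsearch is a better fit as the store. Applications with a low query rate, such as some internal operations data, use Druid, which builds indexes by processing messages in real time and uses pre-aggregation to provide fast real-time OLAP analysis. When older data products are converted to real time, MySQL can also be used as the store to ease product iteration.
In short, Meituan's OLAP choice is likewise centered on Druid.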
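The application-layer rule described above can be written down as a small decision function. This is only a sketch of the rule as summarized in this article, not Meituan's actual code: the function name, the parameter names and the order in which the conditions are checked are assumptions made for illustration.

```python
QPS_THRESHOLD = 1000  # dividing line quoted in the text

def choose_app_layer_store(avg_qps, complex_query=False, legacy_product=False):
    """Pick a store for an application-layer dataset (hypothetical helper)."""
    if legacy_product:
        # Older data products being converted to real time: MySQL eases iteration.
        return "MySQL"
    if complex_query:
        # Complex queries or detail lists: Elasticsearch.
        return "Elasticsearch"
    if avg_qps > QPS_THRESHOLD:
        # High read/write rate with simple queries: the K-V store.
        return "Cellar"
    # Low query rate, e.g. internal operations data: Druid.
    return "Druid"

print(choose_app_layer_store(5000))                      # merchant real-time data
print(choose_app_layer_store(200, complex_query=True))   # detail-list application
print(choose_app_layer_store(50))                        # internal operations data
```

Encoding the rule this way makes the threshold explicit and easy to revisit when the workload changes.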
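Real-time data warehouse design at NetEase Yanxuan

The overall framework of NetEase Yanxuan's real-time data warehouse is divided into layers according to the direction of data flow. The ingestion layer uses various data ingestion tools to collect data from each business system. The data in the message queue is the raw data for both the offline data warehouse and real-time computation, which guarantees that the real-time and offline pipelines share the same raw data.
In the compute layer, Flink and the real-time compute engine process the data, which is then written to different storage media in the storage layer. The storage medium is chosen according to the application scenario. The framework also includes interaction between Flink and Kafka: the data is layered, and the compute engine pulls data from Kafka, processes it and writes it back to Kafka. Data processed in the storage layer is exposed through two services in the service layer: unified query and metric management. Unified query is the service through which business teams call data interfaces, and metric management covers the definition and management of data metrics. Through the service layer the data reaches the different data applications, which may be our formal products or business systems that consume the data directly.
Based on the design above, the technology choices are as follows: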

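In the storage layer, different storage media are chosen according to the characteristics of each data layer. The ODS and DWD layers both hold real-time data and are stored in Kafka. The DWD layer also joins some historical detail data, which is kept in Redis. The DIM layer mainly serves high-concurrency dimension lookups and joins, and its data is usually stored in HBase. The DM layer is more complicated: the storage method has to be chosen by weighing the requirements for persisting the data against the specific query engine. Common metric summary models go directly into MySQL. Models with many dimensions and heavy write/update volume go into HBase. Detail data that needs multidimensional analysis or joins is stored in Greenplum. Finally, data with many dimensions that needs sorting and has demanding query requirements, such as large lists like users' sales rankings during a promotion, is stored directly in Redis.
NetEase Yanxuan chose Greenplum, HBase, Redis and MySQL as its data computation and serving layer.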
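Greenplum's technical characteristics are as follows:
Supports storage and processing of massive data
Supports Just In Time BI: near-real-time and real-time data loading keeps the data warehouse updated in real time, which yields an active data warehouse (ADW). On top of it, business users can run real-time BI analysis on current business data (Just In Time BI)
Supports mainstream SQL syntax, is very convenient to use, and has a low learning cost
Good extensibility: supports user-defined functions in multiple languages, user-defined types, and so on
Provides a large number of maintenance tools, so it is easy to use and maintain
Supports linear scaling: it uses an MPP parallel processing architecture, and adding nodes to the MPP structure linearly increases the system's storage capacity and processing power
Good concurrency and high-availability support: besides hardware-level RAID, it provides database-level Mirror protection and a Master/Standby mechanism for master node fault tolerance, so that when the master node fails, service can continue on the Standby node
Supports MapReduce, a technique for large-scale data analysis
In-database compression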
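Summary
The analysis above shows that the industry already has mature approaches to building a real-time data warehouse. The overall architecture uses layering to take load off OLAP queries and free up compute capacity: complex computation is done uniformly in the real-time compute layer so that it does not put excessive pressure on OLAP queries, while summary computation is handed to the OLAP database. We can put it this way: in the overall architecture, real-time computation is generally handled by Spark and Flink together. For message queues, Kafka dominates and still holds a near-monopoly across big data applications, and the latecomer Pulsar will find it very hard to overtake it. HBase, Redis and MySQL each have their place in specific scenarios. Only in the OLAP field do many contenders compete, each with its own strengths. Open-source OLAP engines in the big data field include, but are not limited to, Hive, Druid, Hawq, Presto, Impala, SparkSQL, ClickHouse and Greenplum. In the next part we compare the strengths, weaknesses and use cases of each open-source OLAP engine in detail, so that developers can make technology choices with confidence.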
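OLAP: many contenders
Introduction to OLAP
OLAP, short for Online Analytical Processing, is sometimes also called a DSS (decision support system); it is what we usually mean by a data warehouse. Its counterpart is OLTP (Online Transaction Processing).
The concept of online analytical processing (OLAP) was first proposed in 1993 by E. F. Codd, the father of the relational database. It drew a strong response, and OLAP came to be clearly distinguished from online transaction processing (OLTP) as a separate class of products.
Codd argued that OLTP could no longer satisfy end users' needs for querying and analyzing databases, and that simple SQL queries against large databases could not satisfy users' analytical needs either. Decision analysis requires a large amount of computation over a relational database to produce results, yet the query results still do not meet the needs of decision makers. Codd therefore proposed the concepts of the multidimensional database and multidimensional analysis, that is, OLAP.
The OLAP Council defines online analytical processing as follows: data that is derived from raw data, can truly be understood by users, and truly reflects the multidimensional nature of the enterprise is called information data; OLAP is the class of software technology that lets analysts, managers and executives access this information data quickly, consistently and interactively from many angles, and thereby gain deeper insight into the data. The goal of OLAP is to satisfy the query and reporting needs specific to decision support or multidimensional environments. Its technical core is the concept of the "dimension", so OLAP can also be described as a collection of multidimensional data analysis tools.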
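OLAP rules and characteristics
E. F. Codd proposed 12 rules for OLAP:
Rule 1: the OLAP model must provide a multidimensional conceptual view
Rule 2: transparency
Rule 3: accessibility
Rule 4: consistent reporting performance
Rule 5: client/server architecture
Rule 6: generic dimensionality
Rule 7: dynamic sparse matrix handling
Rule 8: multi-user support
Rule 9: unrestricted cross-dimensional operations
Rule 10: intuitive data manipulation
Rule 11: flexible reporting
Rule 12: unlimited dimensions and aggregation levels
In a nutshell:
An OLTP system emphasizes memory efficiency in the database, the hit rates of the various in-memory metrics, bind variables, concurrent operations and transactions. An OLAP system emphasizes data analysis, SQL execution time, disk I/O and partitioning.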
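The contrast between the two workloads can be made concrete with Python's built-in `sqlite3` module. This is only an illustration of the two access patterns on a toy table; SQLite stands in for both kinds of system here, and the table and column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, city TEXT, amount REAL)")

# OLTP-style work: a short transaction that touches a handful of rows.
# The connection context manager commits on success and rolls back on error.
with conn:
    conn.execute("INSERT INTO orders (city, amount) VALUES (?, ?)", ("Beijing", 10.0))
    conn.execute("INSERT INTO orders (city, amount) VALUES (?, ?)", ("Beijing", 5.0))
    conn.execute("INSERT INTO orders (city, amount) VALUES (?, ?)", ("Shanghai", 7.0))

# OLTP-style read: a point lookup by primary key with a bound variable.
row = conn.execute("SELECT amount FROM orders WHERE id = ?", (1,)).fetchone()

# OLAP-style read: scan the whole table and aggregate by a dimension.
totals = conn.execute(
    "SELECT city, COUNT(*), SUM(amount) FROM orders GROUP BY city ORDER BY city"
).fetchall()

print(row)
print(totals)
```

The point lookup reads one row through an index, while the aggregate has to read every row. On real data volumes that difference is why OLAP systems care about scan throughput, disk I/O and partitioning rather than per-transaction latency.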
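Open-source OLAP engines
The mainstream open-source OLAP engines on the market today include, but are not limited to, Hive, Hawq, Presto, Kylin, Impala, SparkSQL, Druid, ClickHouse and Greenplum. It is fair to say that no engine is currently perfect in data volume, flexibility and performance all at once, so users need to choose according to their own requirements.
Component characteristics and overview
Hive
Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables, provides full SQL query capability, and translates SQL statements into MapReduce jobs for execution. Its advantage is a low learning cost: simple MapReduce statistics can be implemented quickly with SQL-like statements, without developing dedicated MapReduce applications, which makes it well suited to statistical analysis in a data warehouse.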

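Hive mainly targets OLAP applications. Its underlying storage is the HDFS distributed file system, and it is generally used only for query, analysis and statistics, not for the usual create/update/delete (CUD) operations. Hive needs data to be synchronized from existing databases or logs into HDFS, and even incremental real-time synchronization is currently quite hard to achieve.
Hive's strengths are complete SQL support, a very low learning cost, custom data formats, and very high scalability: it can easily scale to several thousand nodes.
However, Hive does no processing at all on the data while loading it and does not even scan it, so no index is built on any key in the data. When Hive needs to access specific values that satisfy a condition, it has to brute-force scan the entire data set, so access latency is high.
Hive really is too slow. For aggregations over large data volumes or join queries, Hive often takes hours. At one point I even wanted to revoke its OLAP "citizenship", but I have to admit that Hive is still the most widely used OLAP engine in the Hadoop ecosystem.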
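Hawq
Hawq is a Hadoop-native massively parallel SQL analysis engine. It uses an MPP architecture and improves on the cost-based query optimizer for Hadoop. Besides processing its own internal data efficiently, it can access external data sources such as HDFS, Hive, HBase and JSON through PXF. HAWQ is fully compatible with the SQL standard, supports SQL UDFs, and can perform simple data mining and machine learning in SQL. In both features and performance, HAWQ is well suited to building analytical data warehouse applications on Hadoop.
The components of a typical Hawq cluster are as follows: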


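Someone online has benchmarked the query performance of Hawq against Hive. Overall, queries on Hawq internal tables were much faster than Hive (4 to 50 times).
Spark SQL
Spark SQL's predecessor is Shark. It integrates SQL queries seamlessly with Spark programs and lets structured data be queried as Spark RDDs. Spark SQL continues to develop as a member of the Spark ecosystem; it is no longer constrained by Hive and merely stays compatible with it.
Spark SQL's position in the overall Spark stack is as follows: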

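The architecture of Spark SQL is shown below:

For anyone familiar with Spark, Spark SQL is easy to understand and pick up. Compared with the Spark RDD API, Spark SQL carries more information about the structure of the data and the operations performed on it, and it uses that information for additional optimization, making operations on structured data more efficient and convenient. SQL provides a common way to access a wide variety of data sources, including Hive, Avro, Parquet, ORC, JSON and JDBC. Hive compatibility is excellent.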
Presto
Presto is an open source distributed SQL query engine
for running interactive analytic queries against data
sources of all sizes ranging from gigabytes to petabytes. Presto
allows querying data where it lives, including Hive,
Cassandra, relational databases or even proprietary
data stores. A single Presto query can combine data
from multiple sources, allowing for analytics across
your entire organization. Presto is targeted at analysts
who expect response times ranging from sub-second to
minutes. Presto breaks the false choice between having
fast analytics using an expensive commercial solution
or using a slow "free" solution that requires
excessive hardware.
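That is Presto's official introduction. Presto is a distributed SQL query engine for big data open-sourced by Facebook. It is suited to interactive analytical queries and supports many data sources, including HDFS, RDBMSs and Kafka, and it offers a very friendly interface for developing data source connectors.
Presto supports standard ANSI SQL, including complex queries, aggregations, joins and window functions. As a replacement for Hive and Pig (both of which query HDFS data through MapReduce pipelines), Presto does not store data itself, but it can connect to many kinds of data sources and supports federated queries across them.
Presto does not use MapReduce; it runs queries with its own custom query and execution engine. All of its query processing happens in memory, which is a major reason for its high performance. In this respect Presto closely resembles Spark SQL, and this is what most fundamentally sets it apart from Hive.
Because Presto works in memory while Hive reads and writes on disk, Presto is much faster than Hive. But precisely because computation is memory-based, joining several large tables can easily cause out-of-memory errors.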

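Kylin
Any discussion of Kylin has to start with ROLAP and MOLAP.
Traditional OLAP is divided by storage method into ROLAP (Relational OLAP) and MOLAP (Multidimensional OLAP).
ROLAP stores the data used for multidimensional analysis in a relational model. Its advantages are a small storage footprint and flexible querying. Its drawback is equally obvious: every query has to aggregate the data. To mitigate this weakness, ROLAP systems use techniques such as columnar storage, parallel queries, query optimization and bitmap indexes.
MOLAP physically stores the analysis data as multidimensional arrays, forming a cube structure. The attribute values of each dimension map to array indexes or index ranges, and facts are stored as values in the array cells. Its advantage is fast queries. Its drawback is that data volume is hard to control, and dimension explosion may occur.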
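The MOLAP storage idea described above, where dimension values become array indexes and facts become cell values, can be sketched in a few lines. This is a toy two-dimensional example with made-up dimension values, not how any particular engine lays out its cubes.

```python
# Dimension members and their positions along each axis (hypothetical data).
cities = ["Beijing", "Shanghai"]
months = ["2020-01", "2020-02", "2020-03"]
city_index = {value: i for i, value in enumerate(cities)}
month_index = {value: i for i, value in enumerate(months)}

# The cube is a dense array with one cell per (city, month) combination.
cube = [[0.0] * len(months) for _ in cities]

# Load facts: each fact is added into the cell its dimension values map to.
facts = [
    ("Beijing", "2020-01", 10.0),
    ("Beijing", "2020-01", 5.0),
    ("Shanghai", "2020-03", 7.0),
]
for city, month, amount in facts:
    cube[city_index[city]][month_index[month]] += amount

def lookup(city, month):
    """Answer a query by direct indexing, with no aggregation at query time."""
    return cube[city_index[city]][month_index[month]]

def city_total(city):
    """Aggregate along the month axis for one city."""
    return sum(cube[city_index[city]])

cells = len(cities) * len(months)
print(lookup("Beijing", "2020-01"), city_total("Beijing"), cells)
```

Queries become array lookups, which is why MOLAP is fast. The cost is visible even here: the cube allocates six cells although only two hold data, and the number of cells is the product of the dimension sizes.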
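Kylin itself is a MOLAP system. Its multidimensional cube (MOLAP Cube) design lets users define a data model in Kylin for data sets of more than ten billion rows and build cubes that pre-aggregate the data.
Apache Kylin is an open-source distributed analysis engine that provides a SQL query interface and multidimensional analysis (OLAP) capability on top of Hadoop/Spark to support extremely large data sets. It was originally developed by eBay Inc. and contributed to the open-source community. It can query huge Hive tables with sub-second latency.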

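Kylin's advantages are:
An ANSI SQL interface
Interactive query capability
The MOLAP Cube concept
Seamless integration with BI tools
Scenarios that suit Kylin therefore include:
User data lives in Hadoop HDFS and is accessed relationally through Hive, and the data volume is huge, above 500 GB
Several GB or even tens of GB of incremental data is imported every day
There are no more than about 10 fairly fixed analysis dimensions
Put simply, the idea behind Kylin's data cube is to trade space for time: a set of dimensions is defined, and the result for every combination of dimensions is computed and stored in advance. With N dimensions there are 2^N combinations. It is therefore best to keep the number of dimensions under control, because storage grows explosively as dimensions are added, with potentially disastrous consequences.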
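The 2^N growth is easy to verify by enumerating the dimension combinations (usually called cuboids). The sketch below only counts combinations; it does not model Kylin's actual cube build or its pruning options, and the dimension names are made up.

```python
from itertools import combinations

def cuboids(dimensions):
    """Return every subset of the dimensions, including the empty one
    (the grand total) and the full set (the base cuboid)."""
    result = []
    for size in range(len(dimensions) + 1):
        result.extend(combinations(dimensions, size))
    return result

dims = ["city", "channel", "category"]
all_cuboids = cuboids(dims)
for cuboid in all_cuboids:
    print(cuboid)

# Growth with the number of dimensions: 2**N combinations.
for n in (3, 10, 20):
    print(n, "dimensions ->", 2 ** n, "cuboids")
```

Three dimensions give 8 cuboids, 10 dimensions already give 1,024, and 20 give more than a million, which is why the advice above is to keep the dimension count small.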
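Impala
Impala is also a SQL-on-Hadoop query tool. It is built on MPP technology, supports fast interactive SQL queries, and shares its metadata store with Hive. Impalad is the core process: it receives query requests and distributes tasks to multiple data nodes. The statestored process monitors all Impalad processes and reports the status of each one to the nodes in the cluster. The catalogd process broadcasts the latest metadata changes.
The architecture of Impala is shown below: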

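Impala's features include:
Supports multiple file formats, including Parquet, Avro, Text, RCFile and SequenceFile
Supports operations on data stored in HDFS, HBase and Amazon S3
Supports multiple compression codecs: Snappy, Gzip, Deflate, Bzip2, LZO
Supports UDFs and UDAFs
Automatically joins tables in the most efficient order
Allows priority queuing policies to be defined for queries
Supports concurrent queries from multiple users
Supports data caching
Provides statistics computation (COMPUTE STATS)
Provides window functions (aggregate OVER PARTITION, RANK, LEAD, LAG, NTILE and so on) to support advanced analytics
Supports disk-based joins and aggregations, spilling to disk when an operation exceeds available memory
Allows subqueries in the WHERE clause
Allows incremental statistics: statistics are computed only on new or changed data
Supports complex nested queries over maps, structs and arrays
Can insert into or update HBase through Impala
Impala is also frequently compared with Hive and Presto, and its weaknesses are just as clear:
Impala provides no support for custom serialization and deserialization.
Impala cannot read custom binary file formats; it is limited to the formats it supports natively.
Whenever new records or files are added to a table's data directory in HDFS, the table has to be refreshed. As a result, a SQL query that is running when a refresh happens can hang and stop making progress.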
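Druid
Druid is a data store that provides sub-second queries over both historical and real-time data. It supports low-latency data ingestion, flexible exploratory analysis, high-performance aggregation and simple horizontal scaling. It suits analytical query systems with large data volumes and high scalability requirements.
The problems Druid solves are fast data ingestion and fast data querying. To understand Druid, it helps to think of it as two systems: an ingestion system and a query system.
Druid's architecture is shown below: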


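Druid's characteristics include:
Druid consumes data in real time, so both ingestion and query results are truly real time
Druid supports PB-scale data and fast processing of hundreds of billions of events, and handles thousands of concurrent queries per second
Druid is built around time series: data is stored in batches by time, which makes it well suited to statistical analysis over time
Druid divides data columns into three types: timestamp, dimension columns and metric columns
Druid does not support multi-table joins
Data in Druid is usually low-level statistical data precomputed by another compute framework (such as Spark)
Druid is not suited to query scenarios where the pivot dimensions are complex and change frequently
The query types Druid is good at are fairly narrow, and some common SQL statements (such as group by) run at only average speed in Druid
Druid supports low-latency inserts and updates, but they are much slower than in HBase or traditional databases
Like other time-series databases, Druid may have performance problems when a query condition matches a large amount of data. Its sorting and aggregation capabilities are generally weak, and it lacks flexibility and extensibility; for example, it has no joins or subqueries.
My personal understanding of Druid is that it guarantees real-time writes but its SQL support for queries is incomplete (no joins). It is suited to ingesting cleansed records in real time and then quickly querying results that include history. We have not used it in our current business.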
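Greenplum
Greenplum is an open-source massively parallel data analysis engine. Thanks to its MPP architecture, it runs complex SQL analysis on large data sets faster than many other solutions.
GPDB fully supports the ANSI SQL 2008 standard and the SQL OLAP 2003 extensions. At the API level it supports ODBC and JDBC. This thorough standards support makes system development, maintenance and management much easier. It supports distributed transactions and ACID, which guarantees strong data consistency. As a distributed database it scales linearly very well. GPDB has a mature ecosystem: it integrates with many enterprise products, such as SAS, Cognos, Informatica and Tableau, and with many kinds of open-source software, such as Pentaho and Talend.
The architecture of Greenplum is shown below: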

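Greenplum's technical characteristics are as follows:
Supports storage and processing of massive data
Supports Just In Time BI: near-real-time and real-time data loading keeps the data warehouse updated in real time, which yields an active data warehouse (ADW). On top of it, business users can run real-time BI analysis on current business data (Just In Time BI)
Supports mainstream SQL syntax, is very convenient to use, and has a low learning cost
Good extensibility: supports user-defined functions in multiple languages, user-defined types, and so on
Provides a large number of maintenance tools, so it is easy to use and maintain
Supports linear scaling: it uses an MPP parallel processing architecture, and adding nodes to the MPP structure linearly increases the system's storage capacity and processing power
Good concurrency and high-availability support: besides hardware-level RAID, it provides database-level Mirror protection and a Master/Standby mechanism for master node fault tolerance, so that when the master node fails, service can continue on the Standby node
Supports MapReduce
In-database compression
One important point: Greenplum is based on PostgreSQL. In other words, Greenplum is positioned similarly to TiDB, in that it aims to unify OLTP and OLAP.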
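ClickHouse
The official website introduces ClickHouse as follows:
ClickHouse is an open source column-oriented database
management system capable of real time generation
of analytical data reports using SQL queries.
ClickHouse was developed by Yandex, a Russian search engine company, and is designed specifically for online data analysis. According to the official documentation, ClickHouse processes records on the order of "billions" per day.
Features: columnar storage; data compression; sharding, where a single computation runs in parallel on different shards and the results are merged once it finishes; SQL support; join queries; real-time updates; automatic multi-replica synchronization; indexes; distributed storage and querying.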
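The sharded execution described above, where the same aggregation runs on each shard in parallel and the partial results are then merged, follows a scatter-gather pattern. The sketch below imitates that pattern with threads and in-memory lists. It is not ClickHouse code, and the shard contents and function names are assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def aggregate_shard(rows):
    """Run the aggregation on one shard: sum of amount per city."""
    partial = Counter()
    for city, amount in rows:
        partial[city] += amount
    return partial

def distributed_sum(shards):
    """Scatter the same task to every shard, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(aggregate_shard, shards))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values key by key
    return dict(total)

# Two hypothetical shards holding rows of (city, amount).
shards = [
    [("Beijing", 10.0), ("Shanghai", 7.0)],
    [("Beijing", 5.0)],
]
result = distributed_sum(shards)
print(result)
```

This works because a sum can be computed per shard and combined afterwards. Aggregates that cannot be merged this simply, such as exact distinct counts, need more state to be shipped from each shard.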
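Everyone knows Nginx, right? Open-source software from Russia tends to share two traits: it is lightweight and it is fast.
ClickHouse's biggest feature is speed, speed, speed; important things get said three times. Compared with heavyweight components such as Hadoop and Spark, ClickHouse is very lightweight. Its characteristics:
Columnar database with data compression
Relational, with SQL support
Distributed parallel computing that squeezes the most out of a single machine
High availability
Data volumes at the PB level
Real-time data updates
Indexes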
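Why columnar storage and compression go together can be shown with a toy example. The sketch below pivots rows into columns and applies run-length encoding, one simple compression scheme chosen here for illustration. It makes no claim about the codecs ClickHouse actually uses, and the data and column names are made up.

```python
from itertools import groupby

def to_columns(rows, names):
    """Pivot a list of row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}

def run_length_encode(values):
    """Collapse consecutive repeats into (value, run length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

rows = [
    ("Beijing", "app", 10.0),
    ("Beijing", "app", 5.0),
    ("Beijing", "web", 2.5),
    ("Shanghai", "web", 7.0),
]
columns = to_columns(rows, ["city", "channel", "amount"])

# Values of one column sit next to each other, so repeats compress well.
encoded_city = run_length_encode(columns["city"])

# An aggregate over one column reads only that column, not whole rows.
total_amount = sum(columns["amount"])

print(encoded_city, total_amount)
```

A query such as a sum over `amount` touches a single column, and sorted low-cardinality columns shrink considerably. Both effects reduce the I/O that analytical scans depend on.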
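ClickHouse also has its own limitations, including:
No ability to modify or delete existing data at high frequency and low latency; only batch deletes and updates are supported.
No full transaction support
No secondary indexes
Limited SQL support, with an unconventional join implementation
No window functions
Metadata management requires manual intervention and maintenance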
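Summary
The sections above covered some commonly used OLAP engines, each with its own characteristics. We can group them as follows:
Hive, Hawq, Impala: SQL on Hadoop
Presto and Spark SQL (similar to each other): parse SQL and generate execution plans that run in memory
Kylin: trades space for time through precomputation
Druid: supports real-time data ingestion
ClickHouse: the HBase of the OLAP field, with a huge advantage in single-table query performance
Greenplum: the PostgreSQL of the OLAP field
If your scenario is offline computation on HDFS, then Hive, Hawq and Impala are the ones to investigate. If your scenario is distributed querying with some real-time requirements, Presto and Spark SQL may better match your expectations. If your summary dimensions are fairly fixed, your real-time requirements are high, and results can be precomputed from user-configured dimensions and metrics, then Kylin and Druid are worth trying. ClickHouse leads the field in single-table query performance, far ahead of the other OLAP databases. Greenplum, as a relational database product, scales its performance linearly with cluster size and is better suited to data analysis.
As Meituan put it in its report on evaluating Kylin:
No OLAP system today can satisfy the query needs of every scenario. The fundamental reason is that no system can be perfect in data volume, performance and flexibility all at once; every system has to make trade-offs among the three in its design.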