Äú¿ÉÒÔ¾èÖú£¬Ö§³ÖÎÒÃǵĹ«ÒæÊÂÒµ¡£

1Ôª 10Ôª 50Ôª





ÈÏÖ¤Â룺  ÑéÖ¤Âë,¿´²»Çå³þ?Çëµã»÷Ë¢ÐÂÑéÖ¤Âë ±ØÌî



  ÇóÖª ÎÄÕ ÎÄ¿â Lib ÊÓÆµ iPerson ¿Î³Ì ÈÏÖ¤ ×Éѯ ¹¤¾ß ½²×ù Model Center   Code  
»áÔ±   
   
 
     
   
 ¶©ÔÄ
  ¾èÖú
Flink SQL ÔÚ×Ö½ÚÌø¶¯µÄÓÅ»¯Óëʵ¼ù
 
×÷Õß:Àî±¾³¬
 
  2068  次浏览      27
2021-2-4 
 
±à¼­ÍƼö:
±¾ÎĶÔÕûÌå½øÐÐÁ˽éÉÜ£¬Êµ¼ùÓÅ»¯¡¢È»ºó½éÉÜÁËÁ÷ÅúÒ»Ìå¡¢×îºó¶ÔδÀ´½øÐÐÁ˹滮¡£

±¾ÎÄÀ´×Ô΢ÐŹ«ÖÚºÅFlink ÖÐÎÄÉçÇø£¬ÓÉ»ðÁú¹ûÈí¼þAnna±à¼­¡¢ÍƼö¡£

Ò»¡¢ÕûÌå½éÉÜ

2018 Äê 12 Ô Blink Ðû²¼¿ªÔ´£¬¾­ÀúÁËÔ¼Ò»ÄêµÄʱ¼ä Flink 1.9 ÓÚ 2019 Äê 8 Ô 22 ·¢²¼¡£ÔÚ Flink 1.9 ·¢²¼Ö®Ç°×Ö½ÚÌø¶¯ÄÚ²¿»ùÓÚ master ·ÖÖ§½øÐÐÄÚ²¿µÄ SQL ƽ̨¹¹½¨¡£¾­ÀúÁË 2~3 ¸öÔµÄʱ¼ä×Ö½ÚÄÚ²¿ÔÚ 19 Äê 10 Ô·ݷ¢²¼ÁË»ùÓÚ Flink 1.9 µÄ Blink planner ¹¹½¨µÄ Streaming SQL ƽ̨£¬²¢½øÐÐÄÚ²¿Íƹ㡣ÔÚÕâ¸ö¹ý³ÌÖз¢ÏÖÁËһЩ±È½ÏÓÐÒâ˼µÄÐèÇ󳡾°£¬ÒÔ¼°Ò»Ð©½ÏÎªÆæ¹ÖµÄ BUG¡£

»ùÓÚ 1.9 µÄ Flink SQL À©Õ¹

ËäÈ»×îÐ嵀 Flink °æ±¾ÒѾ­Ö§³Ö SQL µÄ DDL£¬µ« Flink 1.9 ²¢²»Ö§³Ö¡£×Ö½ÚÄÚ²¿»ùÓÚ Flink 1.9 ½øÐÐÁË DDL µÄÀ©Õ¹Ö§³ÖÒÔÏÂÓï·¨£º

create table

create view

create function

add resource

ͬʱ Flink 1.9 °æ±¾²»Ö§³ÖµÄ watermark ¶¨ÒåÔÚ DDL À©Õ¹ºóÒ²Ö§³ÖÁË¡£

ÎÒÃÇÔÚÍÆ¼ö´ó¼Ò¾¡Á¿µÄÈ¥Óà SQL ±í´ï×÷ҵʱÊÕµ½ºÜ¶à¡°SQL ÎÞ·¨±í´ï¸´ÔÓµÄÒµÎñÂß¼­¡±µÄ·´À¡¡£Ê±¼ä¾ÃÁË·¢ÏÖÆäʵºÜ¶àÓû§ËùνµÄ¸´ÔÓÒµÎñÂß¼­ÓеÄÊÇ×öһЩÍⲿµÄ RPC µ÷Óã¬×Ö½ÚÄÚ²¿Õë¶ÔÕâ¸ö³¡¾°×öÁËÒ»¸ö RPC µÄά±íºÍ sink£¬ÈÃÓû§¿ÉÒÔÈ¥¶Áд RPC ·þÎñ£¬¼«´óµÄÀ©Õ¹ÁË SQL µÄʹÓó¡¾°£¬°üÀ¨ FaaS Æäʵ¸ú RPC Ò²ÊÇÀàËÆµÄ¡£ÔÚ×Ö½ÚÄÚ²¿Ìí¼ÓÁË Redis/Abase/Bytable/ByteSQL/RPC/FaaS µÈά±íµÄÖ§³Ö¡£

ͬʱ»¹ÊµÏÖÁ˶à¸öÄÚ²¿Ê¹ÓÃµÄ connectors£º

source: RocketMQ

sink: RocketMQ/ClickHouse/Doris/LogHouse/Redis/Abase/Bytable/ByteSQL/RPC/Print/Metrics

²¢ÇÒΪ connector ¿ª·¢ÁËÅäÌ×µÄ format£ºPB/Binlog/Bytes¡£

ÔÚÏߵĽçÃæ»¯ SQL ƽ̨

³ýÁË¶Ô Flink ±¾Éí¹¦ÄܵÄÀ©Õ¹£¬×Ö½ÚÄÚ²¿Ò²ÉÏÏßÁËÒ»¸ö SQL ƽ̨£¬Ö§³ÖÒÔϹ¦ÄÜ£º

SQL ±à¼­

SQL ½âÎö

SQL µ÷ÊÔ

×Ô¶¨Òå UDF ºÍ Connector

°æ±¾¿ØÖÆ

ÈÎÎñ¹ÜÀí

¶þ¡¢Êµ¼ùÓÅ»¯

³ýÁ˶Թ¦ÄܵÄÀ©Õ¹£¬Õë¶Ô Flink 1.9 SQL µÄ²»×ãÖ®´¦Ò²×öÁËһЩÓÅ»¯¡£

Window ÐÔÄÜÓÅ»¯

1¡¢Ö§³ÖÁË window Mini-Batch

Mini-Batch ÊÇ Blink planner µÄÒ»¸ö±È½ÏÓÐÌØÉ«µÄ¹¦ÄÜ£¬ÆäÖ÷Ҫ˼ÏëÊÇ»ýÔÜÒ»ÅúÊý¾Ý£¬ÔÙ½øÐÐÒ»´Î״̬·ÃÎÊ£¬´ïµ½¼õÉÙ·ÃÎÊ״̬µÄ´ÎÊý½µµÍÐòÁл¯·´ÐòÁл¯µÄ¿ªÏú¡£Õâ¸öÓÅ»¯Ö÷ÒªÊÇÔÚ RocksDB µÄ³¡¾°¡£Èç¹ûÊÇ Heap ״̬ Mini-Batch ²¢Ã»Ê²Ã´ÓÅ»¯¡£ÔÚһЩµäÐ͵ÄÒµÎñ³¡¾°ÖУ¬µÃµ½µÄ·´À¡ÊÇÄܼõÉÙ 20~30% ×óÓÒµÄ CPU ¿ªÏú¡£

2¡¢À©Õ¹ window ÀàÐÍ

Ŀǰ SQL ÖеÄÈýÖÖÄÚÖà window£¬¹ö¶¯´°¿Ú¡¢»¬¶¯´°¿Ú¡¢session ´°¿Ú£¬ÕâÈýÖÖÓïÒâµÄ´°¿ÚÎÞ·¨Âú×ãһЩÓû§³¡¾°µÄÐèÇó¡£±ÈÈçÔÚÖ±²¥µÄ³¡¾°£¬·ÖÎöʦÏëͳ¼ÆÒ»¸öÖ÷²¥ÔÚ¿ª²¥Ö®ºó£¬Ã¿Ò»¸öСʱµÄ UV(Unique Visitor)¡¢GMV(Gross Merchandise Volume) µÈÖ¸±ê¡£×ÔÈ»µÄ¹ö¶¯´°¿ÚµÄ»®·Ö·½Ê½²¢²»Äܹ»Âú×ãÓû§µÄÐèÇó£¬×Ö½ÚÄÚ²¿¾Í×öÁËһЩ¶¨ÖƵĴ°¿ÚÀ´Âú×ãÓû§µÄһЩ¹²ÐÔÐèÇó¡£

-- my_window Ϊ×Ô¶¨ÒåµÄ´°¿Ú£¬Âú×ãÌØ¶¨µÄ»®·Ö·½Ê½
SELECT
room_id,
COUNT(DISTINCT user_id)
FROM MySource
GROUP BY
room_id,
my_window(ts, INTERVAL '1' HOURS)

3¡¢window offset

ÕâÊÇÒ»¸ö½ÏΪͨÓõŦÄÜ£¬ÔÚ Datastream API ²ãÊÇÖ§³ÖµÄ£¬µ« SQL Öв¢Ã»ÓС£ÕâÀïÓиö±È½ÏÓÐÒâ˼µÄ³¡¾°£¬Óû§ÏëÒª¿ªÒ»ÖܵĴ°¿Ú£¬Ò»ÖܵĴ°¿Ú±ä³ÉÁË´ÓÖÜËÄ¿ªÊ¼µÄ·Ç×ÔÈ»ÖÜ¡£ÒòΪ˭Ҳ²»»áÏëµ½ 1970 Äê 1 Ô 1 ºÅÄÇÌì¾ÓÈ»ÊÇÖÜËÄ¡£ÔÚ¼ÓÈëÁË offset µÄÖ§³Öºó¾Í¿ÉÒÔÖ§³ÖÕýÈ·µÄ×ÔÈ»ÖÜ´°¿Ú¡£

SELECT
room_id,
COUNT(DISTINCT user_id)
FROM MySource
GROUP BY
room_id,
TUMBLE(ts, INTERVAL '7' DAY, INTERVAL '3', DAY)

ά±íÓÅ»¯

1¡¢ÑÓ³Ù Join

ά±í Join µÄ³¡¾°ÏÂÒòΪά±í¾­³£·¢Éú±ä»¯ÓÈÆäÊÇÐÂÔöά¶È£¬¶ø Join ²Ù×÷·¢ÉúÔÚά¶ÈÐÂÔö֮ǰ£¬¾­³£µ¼Ö¹ØÁª²»ÉÏ¡£

ËùÒÔÓû§Ï£ÍûÈç¹û Join ²»µ½£¬ÔòÔÝʱ½«Êý¾Ý»º´æÆðÀ´Ö®ºóÔÙ½øÐг¢ÊÔ£¬²¢ÇÒ¿ÉÒÔ¿ØÖƳ¢ÊÔ´ÎÊý£¬Äܹ»×Ô¶¨ÒåÑÓ³Ù Join µÄ¹æÔò¡£Õâ¸öÐèÇ󳡾°²»µ¥µ¥ÔÚ×Ö½ÚÄÚ²¿£¬ÉçÇøµÄºÜ¶àͬѧҲÓÐÀàËÆµÄÐèÇó¡£

»ùÓÚÉÏÃæµÄ³¡¾°ÊµÏÖÁËÑÓ³Ù Join ¹¦ÄÜ£¬Ìí¼ÓÁËÒ»¸ö¿ÉÒÔÖ§³ÖÑÓ³Ù Join ά±íµÄËã×Ó¡£µ± Join ûÓÐÃüÖУ¬local cache ²»»á»º´æ¿ÕµÄ½á¹û£¬Í¬Ê±½«Êý¾ÝÔÝʱ±£´æÔÚÒ»¸ö״̬ÖУ¬Ö®ºó¸ù¾ÝÉèÖö¨Ê±Æ÷ÒÔ¼°ËüµÄÖØÊÔ´ÎÊý½øÐÐÖØÊÔ¡£

2¡¢Î¬±í Keyby ¹¦ÄÜ

ͨ¹ýÍØÆËÎÒÃÇ·¢ÏÖ Cacl Ëã×ÓºÍ lookUpJoin Ëã×ÓÊÇ chain ÔÚÒ»ÆðµÄ¡£ÒòΪËüûÓÐÒ»¸ö key µÄÓïÒå¡£

µ±×÷Òµ²¢ÐжȱȽϴó£¬Ã¿Ò»¸öά±í Join µÄ subtask£¬·ÃÎʵÄÊÇËùÓеĻº´æ¿Õ¼ä£¬ÕâÑù¶Ô»º´æÀ´ËµÓкܴóµÄѹÁ¦¡£

µ«¹Û²ì Join µÄ SQL£¬µÈÖµ Join ÊÇÌìÈ»¾ßÓÐ Hash ÊôÐԵġ£Ö±½Ó¿ª·ÅÁËÅäÖã¬ÔËÐÐÓû§Ö±½Ó°Ñά±í Join µÄ key ×÷Ϊ Hash µÄÌõ¼þ£¬½«Êý¾Ý½øÐзÖÇø¡£ÕâÑù¾ÍÄܱ£Ö¤ÏÂÓÎÿһ¸öËã× subtask Ö®¼äµÄ·ÃÎʿռäÊǶÀÁ¢µÄ£¬ÕâÑù¿ÉÒÔ´ó´óµÄÌáÉý¿ªÊ¼µÄ»º´æÃüÖÐÂÊ¡£

³ýÁËÒÔÉϵÄÓÅ»¯£¬»¹ÓÐÁ½µãĿǰÕýÔÚ¿ª·¢µÄά±íÓÅ»¯¡£

¹ã²¥Î¬±í£ºÓÐЩ³¡¾°ÏÂά±í±È½ÏС£¬¶øÇÒ¸üв»Æµ·±£¬µ«×÷ÒµµÄ QPS ÌØ±ð¸ß¡£Èç¹ûÒÀÈ»·ÃÎÊÍⲿϵͳ½øÐÐ Join£¬ÄÇôѹÁ¦»á·Ç³£´ó¡£²¢ÇÒµ±×÷Òµ Failover µÄʱºò local cache »áÈ«²¿Ê§Ð§£¬½ø¶øÓÖ¶ÔÍⲿϵͳÔì³ÉºÜ´ó·ÃÎÊѹÁ¦¡£ÄÇô¸Ä½øµÄ·½°¸ÊǶ¨ÆÚÈ«Á¿ scan ά±í£¬Í¨¹ýJoin key hash µÄ·½Ê½·¢Ë͵½ÏÂÓΣ¬¸üÐÂÿ¸öά±í subtask µÄ»º´æ¡£

Mini-Batch£ºÖ÷ÒªÕë¶ÔһЩ I/O ÇëÇó±È½Ï¸ß£¬ÏµÍ³ÓÖÖ§³Ö batch ÇëÇóµÄÄÜÁ¦£¬±ÈÈç˵ RPC¡¢HBase¡¢Redis µÈ¡£ÒÔÍùµÄ·½Ê½¶¼ÊÇÖðÌõµÄÇëÇó£¬ÇÒ Async I/O Ö»Äܽâ¾ö I/O ÑÓ³ÙµÄÎÊÌâ,²¢²»Äܽâ¾ö·ÃÎÊÁ¿µÄÎÊÌ⡣ͨ¹ýʵÏÖ Mini-Batch °æ±¾µÄά±íËã×Ó£¬´óÁ¿½µµÍά±í¹ØÁª·ÃÎÊÍⲿ´æ´¢´ÎÊý¡£

Join ÓÅ»¯

Ŀǰ Flink Ö§³ÖµÄÈýÖÖ Join ·½Ê½£»·Ö±ðÊÇ Interval Join¡¢Regular Join¡¢Temporal Table Function¡£

ǰÁ½ÖÖÓïÒåÊÇÒ»ÑùµÄÁ÷ºÍÁ÷ Join¡£¶ø Temporal Table ÊÇÁ÷ºÍ±íµÄµÄ Join£¬ÓұߵÄÁ÷»áÒÔÖ÷¼üµÄÐÎʽÐγÉÒ»ÕÅ±í£¬×ó±ßµÄÁ÷È¥ Join ÕâÕÅ±í£¬ÕâÑùÒ»´Î Join Ö»ÄÜÓÐÒ»ÌõÊý¾Ý²ÎÓë²¢ÇÒÖ»·µ»ØÒ»¸ö½á¹û¡£¶ø²»ÊÇÓжàÉÙÌõ¶¼ÄÜ Join µ½¡£

ËüÃÇÖ®¼äµÄÇø±ðÁÐÁ˼¸µã£º

¿ÉÒÔ¿´µ½ÈýÖÖ Join ·½Ê½¶¼ÓÐËü±¾ÉíµÄһЩȱÏÝ¡£

Interval Join ĿǰʹÓÃÉϵÄȱÏÝÊÇËü»á²úÉúÒ»¸ö out join Êý¾ÝºÍ watermark ÂÒÐòµÄÇé¿ö¡£

Regular Join µÄ»°£¬Ëü×î´óµÄȱÏÝÊÇ retract ·Å´ó(Ö®ºó»áÏêϸ˵Ã÷Õâ¸öÎÊÌâ)¡£

Temporal table function µÄÎÊÌâ½ÏÆäËü¶àһЩ£¬ÓÐÈý¸öÎÊÌâ¡£

²»Ö§³Ö DDl

²»Ö§³Ö out join µÄÓïÒå (FLINK-7865 µÄÏÞÖÆ)

ÓÒ²àÊý¾Ý¶ÏÁ÷µ¼Ö watermark ²»¸üУ¬ÏÂÓÎÎÞ·¨ÕýÈ·¼ÆËã (FLINK-18934)

¶ÔÓÚÒÔÉϵIJ»×ãÖ®´¦×Ö½ÚÄÚ²¿¶¼×öÁ˶ÔÓ¦µÄÐ޸ġ£

ÔöÇ¿ Checkpoint »Ö¸´ÄÜÁ¦

¶ÔÓÚ SQL ×÷ÒµÀ´ËµÒ»µ©·¢ÉúÌõ¼þ±ä»¯¶¼ºÜÄÑ´Ó checkpoint Öлָ´¡£

SQL ×÷ҵȷʵ´Ó checkpoint »Ö¸´µÄÄÜÁ¦±È½ÏÈõ£¬ÒòΪÓÐʱºò×öһЩ¿´ÆðÀ´²»Ì«Ó°Ïì checkpoint µÄÐ޸ģ¬ËüÈÔÈ»ÎÞ·¨»Ö¸´¡£ÎÞ·¨»Ö¸´Ö÷ÒªÓÐÁ½µã£»

µÚÒ»µã£ºoperate ID ÊÇ×Ô¶¯Éú³ÉµÄ£¬È»ºóÒòΪijЩԭÒòµ¼ÖÂËüÉú³ÉµÄ ID ¸Ä±äÁË¡£

µÚ¶þµã£ºËã×ӵļÆËãµÄÂß¼­·¢ÉúÁ˸ı䣬¼´Ëã×ÓÄÚ²¿µÄ״̬µÄ¶¨Òå·¢ÉúÁ˱仯¡£

Àý×Ó1£º²¢Ðжȷ¢ÉúÐ޸ĵ¼ÖÂÎÞ·¨»Ö¸´¡£

source ÊÇÒ»¸ö×î³£¼ûµÄÓÐ״̬µÄËã×Ó£¬source Èç¹ûºÍÖ®ºóµÄËã× operator chain Âß¼­·¢ÉúÁ˸ı䣬ÊÇÍêÈ«ÎÞ·¨»Ö¸´µÄ¡£

ÏÂͼ×óÉÏÊÇÕý³£µÄÉçÇø°æµÄ×÷Òµ»á²úÉúµÄÒ»¸öÂß¼­£¬ source ºÍºóÃæµÄ²¢ÐжÈÒ»ÑùµÄËã×ӻᱻ chain ÔÚÒ»Æð£¬Óû§ÊÇÎÞ·¨È¥¸Ä±äµÄ¡£µ«Ëã×Ó²¢ÐжÈÊdz£»á»á·¢ÉúÐ޸쬱ÈÈç˵ source ÓÉÔ­À´µÄ 100 ÐÞ¸ÄΪ 50£¬cacl µÄ²¢·¢ÊÇ 100¡£´Ëʱ chain µÄÂß¼­¾Í»á·¢Éú±ä»¯¡£

Õë¶ÔÕâÖÖÇé¿ö£¬×Ö½ÚÄÚ²¿×öÁËÐ޸ģ¬ÔÊÐíÓû§È¥ÅäÖ㬼´Ê¹ source µÄ²¢ÐжȸúºóÃæÕûÌåµÄ×÷ÒµµÄ²¢ÐжÈÊÇÒ»ÑùµÄ£¬Ò²ÈÃÆä²»ÓëÖ®ºóµÄËã×Ó chain ÔÚÒ»Æð¡£

Àý×Ó2£ºDAG ¸Ä±äµ¼ÖÂÎÞ·¨»Ö¸´¡£

ÕâÊÇÒ»ÖֱȽÏÌØÊâµÄÇé¿ö£¬ÓÐÒ»Ìõ SQL (ÉÏͼ)£¬¿ÉÒÔ¿´µ½ source ûÓз¢Éú±ä»¯£¬Ö®ºóµÄÈý¸ö¾ÛºÏ»¥ÏàÖ®¼äûÓйØÏµ£¬×´Ì¬¾¹È»Ò²ÊÇÎÞ·¨»Ö¸´¡£

×÷ÒµÖ®ËùÒÔÎÞ·¨»Ö¸´£¬ÊÇÒòΪ operator ID Éú³É¹æÔòµ¼Öµġ£Ä¿Ç° SQL ÖÐ operator ID µÄÉú³ÉµÄ¹æÔòÓëÉÏÓΡ¢±¾ÉíÅäÖÃÒÔ¼°ÏÂÓοÉÒÔ chain ÔÚÒ»ÆðµÄËã×ÓµÄÊýÁ¿¶¼ÓйØÏµ¡£ ÒòΪÐÂÔöÖ¸±ê£¬»áµ¼ÖÂÐÂÔöÒ»¸ö Calc µÄÏÂÓνڵ㣬½ø¶øµ¼Ö operator ID ·¢Éú±ä»¯¡£

ΪÁË´¦ÀíÕâÖÖÇé¿ö£¬Ö§³ÖÁËÒ»ÖÖÌØÊâµÄÅäÖÃģʽ£¬ÔÊÐíÓû§ÅäÖÃÉú³É operator ID µÄʱºò¿ÉÒÔºöÂÔÏÂÓÎ chain ÔÚÒ»ÆðËã×ÓÊýÁ¿µÄÌõ¼þ¡£

Àý×Ó3£ºÐÂÔö¾ÛºÏÖ¸±êµ¼ÖÂÎÞ·¨»Ö¸´

Õâ¿éÊÇÓû§ËßÇó×î´óµÄ£¬Ò²ÊÇ×ÔӵIJ¿·Ö¡£Óû§ÆÚÍûÐÂÔöһЩ¾ÛºÏÖ¸±êºó£¬Ô­À´µÄÖ¸±êÒªÄÜ´Ó checkpoint Öлָ´¡£

¿ÉÒÔ¿´µ½Í¼ÖÐ×󲿷ÖÊÇ SQL Éú³ÉµÄËã×ÓÂß¼­¡£count£¬sum£¬sum£¬count£¬distinct »áÒÔÒ»¸ö BaseRow µÄ½á¹¹´æ´¢ÔÚ ValueState ÖС£distinct ±È½ÏÌØÊâһЩ£¬»¹»áµ¥¶À´æ´¢ÔÚÒ»¸ö MapState ÖС£

Õâµ¼ÖÂÁËÈçÐÂÔö»òÕß¼õÉÙÖ¸±ê£¬¶¼»áʹԭÏȵÄ״̬û°ì·¨´Ó ValueState ÖÐÕý³£»Ö¸´£¬ÒòΪ VauleState Öд洢µÄ״̬ ¡°schema¡± ºÍеģ¨ÐÞ¸ÄÖ¸±êºó£©µÄ ¡°schema¡±²»Æ¥Å䣬ÎÞ·¨Õý³£·´ÐòÁл¯¡£

ÔÚÌÖÂÛ½â¾ö·½°¸Ö®Ç°£¬ÎÒÃÇÏȻعËÒ»ÏÂÕý³£µÄ»Ö¸´Á÷¡£ÏÈ´Ó checkpoint Öлָ´³ö״̬µÄ serializer£¬ÔÙͨ¹ý serializer °Ñ״̬»Ö¸´¡£½ÓÏÂÀ´ operator È¥×¢²áеÄ״̬¶¨Ò壬еÄ״̬¶¨Òå»áºÍÔ­ÏȵÄ״̬¶¨Òå½øÐÐÒ»¸ö¼æÈÝÐԶԱȣ¬Èç¹ûÊǼæÈÝÔò״̬»Ö¸´³É¹¦£¬Èç¹û²»¼æÈÝÔòÅ׳öÒì³£ÈÎÎñʧ°Ü¡£

²»¼æÈݵÄÁíÒ»ÖÖ´¦ÀíÇé¿öÊÇÔÊÐí·µ»ØÒ»¸ö migration£¨ÊµÏÖÁ½¸ö²»Æ¥ÅäÀàÐ͵Ä״̬»Ö¸´£©ÄÇôҲ¿ÉÒÔ»Ö¸´³É¹¦¡£

Õë¶ÔÉÏÃæµÄÁ÷³Ì×ö³ö¶ÔÓ¦µÄÐ޸ģº

µÚÒ»²½Ê¹ÐÂ¾É serializer »¥ÏàÖªµÀ¶Ô·½µÄÐÅÏ¢£¬Ìí¼ÓÒ»¸ö½Ó¿Ú£¬ÇÒÐÞ¸ÄÁË statebackend resolve compatibility µÄ¹ý³Ì£¬°Ñ¾ÉµÄÐÅÏ¢´«µÝ¸øÐµģ¬²¢Ê¹Æä»ñÈ¡Õû¸ö migrate ¹ý³Ì¡£

µÚ¶þ²½ÅжÏÐÂÀÏÖ®¼äÊÇ·ñ¼æÈÝ£¬Èç¹û²»¼æÈÝÊÇ·ñÐèÒª×öÒ»´Î migration¡£È»ºóÈÃ¾ÉµÄ serializer È¥»Ö¸´Ò»±é״̬£¬²¢Ê¹ÓÃÐ嵀 serializer дÈëеÄ״̬¡£

¶Ô aggregation µÄ´úÂëÉú³É½øÐд¦Àí£¬µ±·¢ÏÖ aggregation Äõ½µÄÊÇÖ¸±êÊÇ null£¬ÄÇô½«×öһЩ³õʼ»¯µÄ¹¤×÷¡£

ͨ¹ýÒÔÉϵÄÐ޸Ļù±¾¾Í¿ÉÒÔ×öµ½Õý³£µÄ£¬ÐÂÔöµÄ¾ÛºÏÖ¸±ê´Ó²ð¿ªµÄ·½°¸»Ö¸´¡£

Èý¡¢Á÷ÅúÒ»Ìå̽Ë÷

ÒµÎñÏÖ×´

×Ö½ÚÌø¶¯ÄÚ²¿¶ÔÁ÷ÅúÒ»ÌåºÍÒµÎñÍÆ¹ã֮ǰ£¬¼¼ÊõÍŶÓÌáǰ×öÁË´óÁ¿¼¼Êõ·½ÃæµÄ̽Ë÷¡£ÕûÌåÅжÏÊÇ SQL ÕâÒ»²ãÊÇ¿ÉÒÔ×öµ½Á÷ÅúÒ»ÌåµÄÓïÒ壬µ«Êµ¼ùÖÐÈ´ÓÖ·¢ÏÖ²»ÉÙ²»Í¬¡£

±ÈÈç˵Á÷¼ÆËãµÄ session window£¬»òÊÇ»ùÓÚ´¦Àíʱ¼äµÄ window£¬ÔÚÅú¼ÆËãÖÐÎÞ·¨×öµ½¡£Í¬Ê± SQL ÔÚÅú¼ÆËãÖÐһЩ¸´Ô over window£¬ÔÚÁ÷¼ÆËãÖÐҲûÓжÔÓ¦µÄʵÏÖ¡£

µ«ÕâÐ©ÌØ±ðµÄ³¡¾°¿ÉÄÜÖ»Õ¼ 10% ÉõÖÁ¸üÉÙ£¬ËùÒÔÓà SQL È¥ÂäʵÁ÷ÅúÒ»ÌåÊÇ¿ÉÐеġ£

Á÷ÅúÒ»Ìå

ÕâÕÅͼÊDZȽϳ£¼ûµÄºÍ´ó¶àÊý¹«Ë¾ÀïµÄ¼Ü¹¹¶¼ÀàËÆ¡£ÕâÖּܹ¹ÓÐʲôȱÏÝÄØ£¿

Êý¾Ý²»Í¬Ô´£ºÅúÈÎÎñÒ»°ã»áÓÐÒ»´ÎǰÖô¦ÀíÈÎÎñ£¬²»¹ÜÊÇÀëÏßµÄÒ²ºÃʵʱµÄÒ²ºÃ£¬Ô¤ÏȽø¹ýÒ»²ã¼Ó¹¤ºóдÈë Hive¡£¶øÊµÊ±ÈÎÎñÊÇ´Ó kafka ¶ÁȡԭʼµÄÊý¾Ý£¬¿ÉÄÜÊÇ json ¸ñʽ£¬Ò²¿ÉÄÜÊÇ avro µÈµÈ¡£Ö±½Óµ¼ÖÂÅúÈÎÎñÖпÉÖ´ÐÐµÄ SQL ÔÚÁ÷ÈÎÎñÖÐûÓнá¹ûÉú³É»òÕßÖ´Ðнá¹û²»¶Ô¡£

¼ÆË㲻ͬԴ£ºÅúÈÎÎñÒ»°ãÊÇ Hive + Spark µÄ¼Ü¹¹£¬¶øÁ÷ÈÎÎñ»ù±¾¶¼ÊÇ»ùÓÚ Flink¡£²»Í¬µÄÖ´ÐÐÒýÇæÔÚʵÏÖÉ϶¼»áÓÐһЩ²îÒ죬µ¼Ö½á¹û²»Ò»Ö¡£²»Í¬µÄÖ´ÐÐÒýÇæÓв»Í¬µÄ API ¶¨Òå UDF£¬ËüÃÇÖ®¼äÒ²ÊÇÎÞ·¨±»¹«Óõġ£´ó²¿·ÖÇé¿ö϶¼ÊÇά»¤Á½Ì×»ùÓÚ²»Í¬ API ʵÏÖµÄÏàͬ¹¦ÄÜµÄ UDF¡£

¼øÓÚÉÏÃæµÄÎÊÌ⣬Ìá³öÁË»ùÓÚ Flink µÄÁ÷ÅúÒ»Ìå¼Ü¹¹À´½â¾ö¡£

Êý¾Ý²»Í¬Ô´£ºÁ÷ʽ´¦ÀíÏÈͨ¹ý Flink ´¦ÀíÖ®ºóдÈë MQ ¹©ÏÂÓÎÁ÷ʽ Flink job È¥Ïû·Ñ£¬¶ÔÓÚÅúʽ´¦ÀíÓÉ Flink ´¦ÀíºóÁ÷ʽдÈëµ½ Hive£¬ÔÙÓÉÅúʽµÄ Flink job È¥´¦Àí¡£

ÒýÇæ²»Í¬Ô´£º¼ÈÈ»¶¼ÊÇ»ùÓÚ Flink ¿ª·¢µÄÁ÷ʽ£¬Åúʽ job£¬×ÔȻûÓмÆË㲻ͬԴÎÊÌ⣬ͬʱҲ±ÜÃâÁËά»¤¶àÌ×Ïàͬ¹¦ÄÜµÄ UDF¡£

»ùÓÚ Flink ʵÏÖµÄÁ÷ÅúÒ»Ìå¼Ü¹¹£º

ÒµÎñÊÕÒæ

ͳһµÄ SQL£ºÍ¨¹ýÒ»Ì× SQL À´±í´ïÁ÷ºÍÅú¼ÆËãÁ½ÖÖ³¡¾°£¬¼õÉÙ¿ª·¢Î¬»¤¹¤×÷¡£

¸´Óà UDF£ºÁ÷ʽºÍÅúʽ¼ÆËã¿ÉÒÔ¹²ÓÃÒ»Ì× UDF¡£Õâ¶ÔÒµÎñÀ´ËµÊÇÓлý¼«ÒâÒåµÄ¡£

ÒýÇæÍ³Ò»£º¶ÔÓÚÒµÎñµÄѧϰ³É±¾ºÍ¼Ü¹¹µÄά»¤³É±¾¶¼»á½µµÍºÜ¶à¡£

ÓÅ»¯Í³Ò»£º´ó²¿·ÖµÄÓÅ»¯¶¼ÊÇ¿ÉÒÔͬʱ×÷ÓÃÔÚÁ÷ʽºÍÅúʽ¼ÆËãÉÏ£¬±ÈÈç¶Ô planner¡¢operator µÄÓÅ»¯Á÷ºÍÅú¿ÉÒÔ¹²Ïí¡£

ËÄ¡¢Î´À´¹¤×÷ºÍ¹æ»®

ÓÅ»¯ retract ·Å´óÎÊÌâ

ʲôÊÇ retract ·Å´ó£¿

ÉÏͼÓÐ 4 ÕÅ±í£¬µÚÒ»ÕÅ±í½øÐÐÈ¥ÖØ²Ù×÷ (Dedup)£¬Ö®ºó·Ö±ðºÍÁíÍâÈýÕűí×ö Join¡£Âß¼­±È½Ï¼òµ¥£¬±í A ÊäÈë(A1)£¬×îºó²ú³ö (A1,B1,C1,D1) µÄ½á¹û¡£

µ±±í A ÊäÈëÒ»¸ö A2£¬ÒòΪ Dedup Ëã×Ó£¬µ¼ÖÂÊý¾ÝÐèÒªÈ¥ÖØ£¬ÔòÏòÏÂÓη¢ËÍÒ»¸ö³·»Ø A1 µÄ²Ù×÷ -(A1) ºÍÒ»¸öÐÂÔö A2 µÄ²Ù×÷ +(A2)¡£µÚÒ»¸ö Join Ëã×ÓÊÕµ½ -(A1) ºó»á½« -(A1) ±ä³É -(A1,B1) ºÍ +(null,B1)(ΪÁ˱£³ÖËüÈÏΪµÄÕýÈ·ÓïÒå) ·¢Ë͵½ÏÂÓΡ£Ö®ºóÓÖÊÕµ½ÁË +(A2) £¬ÔòÓÖÏòÏÂÓη¢ËÍ -(null,B1) ºÍ +(A2,B1) ÕâÑù²Ù×÷¾Í·Å´óÁËÁ½±¶¡£ÔÙ¾­ÓÉÏÂÓεÄËã×Ó²Ù×÷»áÒ»Ö±±»·Å´ó£¬µ½×îÖÕµÄ sink Êä³ö¿ÉÄܻᱻ·Å´ó 1000 ±¶Ö®¶à¡£

ÈçºÎ½â¾ö£¿

½«Ô­ÏÈ retract µÄÁ½ÌõÊý¾Ý±ä³ÉÒ»Ìõ changelog µÄ¸ñʽÊý¾Ý£¬ÔÚËã×ÓÖ®¼ä´«µÝ¡£Ëã×Ó½ÓÊÕµ½ changelog ºó´¦Àí±ä¸ü£¬È»ºó½ö½öÏòÏÂÓη¢ËÍÒ»¸ö±ä¸ü changelog ¼´¿É¡£

δÀ´¹æ»®

1.¹¦ÄÜÓÅ»¯

Ö§³ÖËùÓÐÀàÐ;ۺÏÖ¸±ê±ä¸üµÄ checkpoint »Ö¸´ÄÜÁ¦

window local-global

ʼþʱ¼äµÄ Fast Emit

¹ã²¥Î¬±í

¸ü¶àËã× Mini-Batch Ö§³Ö:ά±í£¬TopN£¬Join µÈ

È«Ãæ¼æÈÝ Hive SQL Óï·¨

2.ÒµÎñÀ©Õ¹

½øÒ»²½Íƶ¯Á÷ʽ SQL ´ïµ½ 80%

̽Ë÷Â䵨Á÷ÅúÒ»Ìå²úÆ·ÐÎ̬

ÍÆ¶¯ÊµÊ±Êý²Ö±ê×¼»¯

 

 
   
2068 ´Îä¯ÀÀ       27
Ïà¹ØÎÄÕÂ

»ùÓÚEAµÄÊý¾Ý¿â½¨Ä£
Êý¾ÝÁ÷½¨Ä££¨EAÖ¸ÄÏ£©
¡°Êý¾Ýºþ¡±£º¸ÅÄî¡¢ÌØÕ÷¡¢¼Ü¹¹Óë°¸Àý
ÔÚÏßÉ̳ÇÊý¾Ý¿âϵͳÉè¼Æ ˼·+Ч¹û
 
Ïà¹ØÎĵµ

GreenplumÊý¾Ý¿â»ù´¡Åàѵ
MySQL5.1ÐÔÄÜÓÅ»¯·½°¸
ijµçÉÌÊý¾ÝÖÐ̨¼Ü¹¹Êµ¼ù
MySQL¸ßÀ©Õ¹¼Ü¹¹Éè¼Æ
Ïà¹Ø¿Î³Ì

Êý¾ÝÖÎÀí¡¢Êý¾Ý¼Ü¹¹¼°Êý¾Ý±ê×¼
MongoDBʵս¿Î³Ì
²¢·¢¡¢´óÈÝÁ¿¡¢¸ßÐÔÄÜÊý¾Ý¿âÉè¼ÆÓëÓÅ»¯
PostgreSQLÊý¾Ý¿âʵսÅàѵ
×îл¼Æ»®
DeepSeekÔÚÈí¼þ²âÊÔÓ¦ÓÃʵ¼ù 4-12[ÔÚÏß]
DeepSeek´óÄ£ÐÍÓ¦Óÿª·¢Êµ¼ù 4-19[ÔÚÏß]
UAF¼Ü¹¹ÌåϵÓëʵ¼ù 4-11[±±¾©]
AIÖÇÄÜ»¯Èí¼þ²âÊÔ·½·¨Óëʵ¼ù 5-23[ÉϺ£]
»ùÓÚ UML ºÍEA½øÐзÖÎöÉè¼Æ 4-26[±±¾©]
ÒµÎñ¼Ü¹¹Éè¼ÆÓ뽨ģ 4-18[±±¾©]
 
×îÐÂÎÄÕÂ
´óÊý¾Ýƽ̨ϵÄÊý¾ÝÖÎÀí
ÈçºÎÉè¼ÆÊµÊ±Êý¾Ýƽ̨£¨¼¼Êõƪ£©
´óÊý¾Ý×ʲú¹ÜÀí×ÜÌå¿ò¼Ü¸ÅÊö
Kafka¼Ü¹¹ºÍÔ­Àí
ELK¶àÖּܹ¹¼°ÓÅÁÓ
×îпγÌ
´óÊý¾Ýƽ̨´î½¨Óë¸ßÐÔÄܼÆËã
´óÊý¾Ýƽ̨¼Ü¹¹ÓëÓ¦ÓÃʵս
´óÊý¾ÝϵͳÔËά
´óÊý¾Ý·ÖÎöÓë¹ÜÀí
Python¼°Êý¾Ý·ÖÎö
³É¹¦°¸Àý
ijͨÐÅÉ豸ÆóÒµ PythonÊý¾Ý·ÖÎöÓëÍÚ¾ò
Ä³ÒøÐÐ È˹¤ÖÇÄÜ+Python+´óÊý¾Ý
±±¾© Python¼°Êý¾Ý·ÖÎö
ÉñÁúÆû³µ ´óÊý¾Ý¼¼Êõƽ̨-Hadoop
ÖйúµçÐÅ ´óÊý¾Ýʱ´úÓëÏÖ´úÆóÒµµÄÊý¾Ý»¯ÔËӪʵ¼ù