Editor's recommendation:
This article introduces Presto's basic architecture, how SQL runs in Presto, Presto monitoring and configuration, and a comparison of big data OLAP engines.
The article comes from csdn and was edited and recommended by Anna of 火龙果软件.
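What is Presto
An open-source project from Facebook: a distributed SQL engine for interactive queries that computes in parallel entirely in memory
A massively parallel processing (MPP) architecture in which multiple nodes execute in a pipelined fashion
Supports arbitrary data sources (through pluggable Connector components) at data scales from GB to PB
Techniques used include vectorized computation, dynamic compilation of execution plans, and optimized ORC and Parquet readers
Presto has little support for stored procedures and supports a subset of standard SQL
Presto queries run 5 to 10 times faster than Hive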
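Having covered what Presto is and how fast it queries, consider what Presto is suited for
Suited for: complex analysis of massive, PB-scale data; interactive SQL queries; queries across data sources
Not suited for: joins across several large tables, because Presto is memory-based and several large tables may not fit in memory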
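Comparison with Hive:
Hive is a data warehouse and a query engine with weaker interactivity than Presto, and it can only access data in HDFS
Presto is an interactive query engine that returns results in a short time, seconds to minutes, and can access many data sources
When Hive queries data on the order of 100 GB, the elapsed time is already at the minute level
Presto cannot replace Hive, however. Presto keeps all of its working data in memory, which limits the size of the data set it can handle; in a join of several large tables, for instance, the tables cannot all fit in memory. In practice, queries on Presto are subject to certain rules: for example, if a query has been running on Presto for more than 30 minutes, kill it, since it is evidently not suited to Presto. The main reason is that an oversized query occupies the resources of the whole cluster, leaving later queries with no resources to run on, which conflicts with Presto's design philosophy. Starting a query and then waiting 5 minutes for resources to become available is unreasonable and greatly weakens interactivity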
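Presto basic architecture
Before discussing Presto's architecture, first review Hive's architecture

Hive: the client sends a query request to the Hive server, which interacts with the metastore to obtain table metadata such as the table's location and structure. The Hive server then parses the query into a syntax tree, turns it into a query plan, optimizes it, and hands the plan to the execution engine, MR by default, where it is translated into MR jobs
Presto: Presto carries out logic similar to Hive's internally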

Next, a closer look at Presto's internal architecture

There are three services here:
Coordinator: the central query role. Its main job is to accept query requests, convert them into tasks, and after breaking the tasks down, distribute them to multiple worker nodes for execution
1. Parses SQL statements
2. Generates the execution plan
3. Distributes execution tasks to Worker nodes
Worker: the node that does the actual computation and executes tasks. After receiving a task, it goes to the corresponding data source and pulls out the data through the appropriate connector:
1. Responsible for actually executing query tasks
Discovery service: the service that ties the coordinator and the workers together:
1. After starting, a Worker node registers with the Discovery Server
2. The Coordinator obtains the Worker nodes from the Discovery Server
How is the relationship between the coordinator and the workers maintained? Through the Discovery Server. Every worker registers itself with the Discovery Server, which is a service-discovery service. Once the workers are registered, the coordinator knows how many workers in the cluster are available to it, and it has a basis for assigning work to them
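The registration and lookup described above can be sketched as a toy in-memory registry. This is only an illustration of the idea, not Presto's discovery implementation: the class names, the worker addresses, and the round-robin assignment are all assumptions made for the example.

```python
class DiscoveryServer:
    """Toy in-memory registry; not Presto's actual discovery service."""

    def __init__(self):
        self._workers = {}  # node_id -> address

    def register(self, node_id, address):
        # A worker announces itself after startup.
        self._workers[node_id] = address

    def unregister(self, node_id):
        # Called when a worker is known to be gone.
        self._workers.pop(node_id, None)

    def active_workers(self):
        # The coordinator asks which workers it can schedule on.
        return sorted(self._workers.items())


class Coordinator:
    def __init__(self, discovery):
        self.discovery = discovery

    def assign(self, tasks):
        """Spread tasks round-robin over the registered workers (an assumed policy)."""
        workers = self.discovery.active_workers()
        if not workers:
            raise RuntimeError("no workers registered")
        assignment = {node_id: [] for node_id, _ in workers}
        for i, task in enumerate(tasks):
            node_id, _ = workers[i % len(workers)]
            assignment[node_id].append(task)
        return assignment


discovery = DiscoveryServer()
discovery.register("worker-1", "10.0.0.1:8080")  # hypothetical addresses
discovery.register("worker-2", "10.0.0.2:8080")
plan = Coordinator(discovery).assign(["t0", "t1", "t2"])
print(plan)
```

The coordinator never keeps its own worker list here; it asks the registry each time it schedules, which is why a worker that disappears from the registry simply stops receiving tasks.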
Finally, Presto obtains data and metadata through connector plugins. It is not a data storage engine and does not need to hold data itself; Presto provides SQL capability for other data storage systems. The client protocol is HTTP+JSON
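To make the HTTP+JSON client protocol concrete, here is a minimal sketch of a client loop. The article itself does not name any endpoints; the `/v1/statement` path, the `X-Presto-User` header, and the `nextUri` and `data` response fields are taken from Presto's REST protocol as commonly documented and should be treated as assumptions here. Error handling is left out.

```python
import json
import urllib.request


def http_json(method, url, body=None, headers=None):
    """Send one HTTP request and decode the JSON response."""
    data = body.encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method, headers=headers or {})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_query(sql, base_url, user, transport=http_json):
    """Submit SQL, then follow nextUri links until the server stops sending them."""
    # Assumed endpoint and header; neither appears in the article.
    reply = transport("POST", base_url + "/v1/statement", sql, {"X-Presto-User": user})
    rows = list(reply.get("data", []))
    while "nextUri" in reply:
        reply = transport("GET", reply["nextUri"])
        rows.extend(reply.get("data", []))
    return rows
```

The `transport` parameter exists only so the loop can be exercised without a running cluster; a call such as `run_query("SELECT 1", "http://coordinator.example:8080", "alice")` would use the real HTTP helper against a hypothetical coordinator address.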
Data sources and storage formats supported by Presto
Hadoop/Hive connector and storage formats:
HDFS, ORC, RCFILE, Parquet, SequenceFile, Text
Open-source data storage systems:
MySQL & PostgreSQL, Cassandra, Kafka, Redis
Others:
MongoDB, ElasticSearch, HBase
SQL execution in Presto: overall flow

1. When we run a SQL query, the coordinator receives the statement and a SQL parser turns it into an abstract syntax tree (AST). This stage performs syntax analysis only; if the statement uses, say, the keyword int where Integer is expected, the problem surfaces here
2. If the statement conforms to SQL syntax, it passes to the logical query planner. For each table that appears in the SQL, the planner goes through the connector to the metadata and looks up the table's schema, column names, column types and so on, then matches this information against the syntax tree. This produces a physical syntax tree node that carries both the query relations and the type relations. If the type of a column in the database table does not agree with the type used in the SQL, the error is reported at this step
3. If the checks pass, the result is a logical query plan, which is sent to the distributed logical query planner for distributed analysis. There, each part of the query plan is converted into tasks
4. For each task, the corresponding location information is extracted and handed to the execution plan, which sends each task to the corresponding worker for execution. That is the whole process
This is a general SQL processing flow, and Hive follows a similar one. The differences lie in the distribution planner and the execution plan, which vary from engine to engine; the earlier stages are basically the same
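The analysis and task-generation steps above can be illustrated with a toy sketch. It skips parsing entirely and starts from an already-identified table and column list. The `CATALOG` dictionary stands in for metadata a connector would return, and the table name, column types, and split paths are all hypothetical.

```python
# Hypothetical table metadata, standing in for what a connector would return.
CATALOG = {"orders": {"id": "bigint", "total": "double"}}


def analyze(table, columns):
    """Step 2: resolve the table and columns against metadata, attaching types."""
    if table not in CATALOG:
        raise ValueError(f"unknown table: {table}")
    schema = CATALOG[table]
    missing = [c for c in columns if c not in schema]
    if missing:
        raise ValueError(f"unknown columns: {missing}")
    return {"table": table, "columns": {c: schema[c] for c in columns}}


def to_tasks(logical_plan, splits):
    """Steps 3-4: turn the plan into one task per data split."""
    return [{"plan": logical_plan, "split": s} for s in splits]


plan = analyze("orders", ["id", "total"])
tasks = to_tasks(plan, ["hdfs://nn/orders/part-0", "hdfs://nn/orders/part-1"])
print(len(tasks), tasks[0]["plan"]["columns"])
```

A reference to a column that is missing from the metadata fails in `analyze`, before any task exists, which mirrors the point that such errors are reported during planning rather than on the workers.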
SQL execution in Presto: MapReduce vs Presto

Tasks run on the individual workers. When a task finishes, its data stays in memory rather than being written to disk as in MR, and when tasks need to exchange data, during a shuffle for example, the data is handled directly from memory
Presto monitoring and configuration: monitoring
Web UI
Look up the basic status of queries
JMX HTTP API
GET /v1/jmx/mbean[/{objectName}]
com.facebook.presto.execution:name=TaskManager
com.facebook.presto.execution:name=QueryManager
com.facebook.presto.execution:name=NodeScheduler
Event notification
Event Listener
query start, query complete
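As a small illustration of the JMX HTTP API path listed above, the helper below builds the request URL for one MBean. The coordinator address is hypothetical, and leaving `:`, `=` and `,` unescaped in the object name is an assumption about what the server accepts.

```python
from urllib.parse import quote


def jmx_mbean_url(base_url, object_name=None):
    """Build the URL for GET /v1/jmx/mbean[/{objectName}]."""
    url = base_url.rstrip("/") + "/v1/jmx/mbean"
    if object_name is not None:
        # Keep ':', '=' and ',' readable; escape anything else that needs it.
        url += "/" + quote(object_name, safe=":=,")
    return url


url = jmx_mbean_url(
    "http://coordinator.example:8080",  # hypothetical coordinator address
    "com.facebook.presto.execution:name=QueryManager",
)
print(url)
```

Calling the helper without an object name yields the listing endpoint, matching the optional part of the path.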
Presto monitoring and configuration: configuration
Execution planning (Coordinator)
node-scheduler.include-coordinator
Whether to let the coordinator run tasks
query.initial-hash-partitions
Maximum number of hash buckets (= tasks) used by each GROUP BY operation (default: 8)
node-scheduler.min-candidates
Maximum number of workers that each stage can use while running concurrently (default: 10)
query.schedule-split-batch-size
Amount of data per split
ÈÎÎñÖ´ÐУ¨Worker£© query.max-memory (default: 20 GB)
Ò»¸ö²éѯ¿ÉÒÔʹÓõÄ×î´ó¼¯ÈºÄÚ´æ
¿ØÖƼ¯Èº×ÊԴʹÓ㬷ÀÖ¹Ò»¸ö´ó²éѯռס¼¯ÈºËùÓÐ×ÊÔ´
ʹÓÃresource_overcommit¿ÉÒÔÍ»ÆÆÏÞÖÆ
query.max-memory-per-node (default: 1 GB)
Ò»¸ö²éѯÔÚÒ»¸ö½ÚµãÉÏ¿ÉÒÔʹÓõÄ×î´óÄÚ´æ
¾ÙÀý
Presto¼¯ÈºÅäÖ㺠120G * 40
query.max-memory=1 TB
query.max-memory-per-node=20 GB
query.max-run-time (default: 100 d)
Ò»¸ö²éѯ¿ÉÒÔÔËÐеÄ×î´óʱ¼ä
·ÀÖ¹Óû§Ìá½»Ò»¸ö³¤Ê±¼ä²éѯ×èÈûÆäËû²éѯ
task.max-worker-threads (default: Node CPUs * 4)
ÿ¸öworkerͬʱÔËÐеÄsplit¸öÊý
µ÷´ó¿ÉÒÔÔö¼ÓÍÌÍÂÂÊ£¬µ«ÊÇ»áÔö¼ÓÄÚ´æµÄÏûºÄ
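The memory example above can be worked through with a little arithmetic. The sketch assumes that "120G * 40" means 40 nodes with 120 GB each and that 1 TB is 1024 GB; it ignores how much of each node's memory is actually available to queries.

```python
GB = 1            # work in units of GB
TB = 1024 * GB

nodes = 40                          # assumed reading of "120G * 40"
node_memory = 120 * GB              # assumed: 120 GB per node
query_max_memory = 1 * TB           # query.max-memory
query_max_memory_per_node = 20 * GB # query.max-memory-per-node

cluster_memory = nodes * node_memory
# Most memory one query could hold if it hit the per-node cap on every node.
per_node_cap_total = nodes * query_max_memory_per_node
# The effective ceiling is whichever limit is reached first.
effective_limit = min(query_max_memory, per_node_cap_total)

print(cluster_memory, per_node_cap_total, effective_limit)
```

Under these assumptions the per-node cap is the binding limit: 40 nodes at 20 GB each is 800 GB, below the 1 TB cluster-wide cap, and one sixth of the 4800 GB total.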
¶ÓÁУ¨Queue£© ÈÎÎñÌá½»»òÕß×ÊԴʹÓõÄһЩÅäÖã¬ÊÇͨ¹ý¶ÓÁеÄÅäÖÃÀ´ÊµÏÖµÄ
×ÊÔ´¸ôÀ룬²éѯ¿ÉÒÔÌá½»µ½ÏàÓ¦¶ÓÁÐÖÐ
×ÊÔ´¸ôÀ룬²éѯ¿ÉÒÔÌá½»µ½ÏàÓ¦¶ÓÁÐÖРÿ¸ö¶ÓÁпÉÒÔÅäÖÃACL£¨È¨ÏÞ£© ÿ¸ö¶ÓÁпÉÒÔÅäÖÃQuota ¿ÉÒÔ²¢·¢ÔËÐвéѯµÄÊýÁ¿ ÅŶӵÄ×î´óÊýÁ¿

Comparison of big data OLAP engines
Presto: in-memory computation, MPP architecture
Druid: time series, data kept in memory, indexes, pre-computation
Spark SQL: based on Spark Core, MPP architecture
Kylin: Cube pre-computation
Finally, some miscellaneous notes
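Presto suits query and analysis over massive, PB-scale data, but that does not mean loading PBs of data into memory. Take a PB-sized table queried with count or avg: the data is large but the final result is small. Such a query does not put all the data in memory; during computation it takes some data into memory, computes, discards it, and takes the next portion, so memory use stays small. A join, in contrast, produces a large amount of data in its intermediate stages. The same goes for queries that read little data but generate a lot. These are not well suited to Presto; it can run them, but they occupy a great deal of memory and take a long time, and Hive is the better fit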
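The contrast between a small-state aggregate and a memory-hungry join can be sketched in a few lines. This is a simplified illustration, not how Presto implements either operator: the streaming average keeps two numbers regardless of input size, while the hash join (one common join strategy, assumed here) holds one whole input in a dictionary.

```python
def streaming_avg(values):
    """Average over any iterable while holding only two numbers in memory."""
    count, total = 0, 0.0
    for v in values:
        count += 1
        total += v
    return total / count if count else None


def hash_join(left, right, key):
    """Join two row lists; the whole build side (right) is held in a dict."""
    build = {}
    for row in right:
        build.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in build.get(l[key], [])]


print(streaming_avg(range(1, 1_000_001)))
```

The memory used by `streaming_avg` does not grow with the input, whereas `build` in `hash_join` grows with the size of the right-hand table, which is the situation the paragraph above warns about for large joins.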
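Presto is a complement to Hive: use Presto when results are needed quickly, otherwise use Hive
Workers are deployed in advance. If 100 workers are started, a query does not necessarily use all 100; the coordinator decides how many tasks to split the query into and how many workers to send them to
One coordinator may have several users submitting queries at the same time, and they share the workers for execution; it is a shared cluster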
The coordinator and the discovery server can run in one process on one node or on different nodes. Most companies currently put them on one node, where a single launcher start brings up both
As for fault tolerance in Presto, if a worker goes down, the discovery server detects it and notifies the coordinator
An individual query has no fault tolerance, though: once a worker fails, the whole query fails
Because Presto queries are short, re-running a query is faster and simpler than building fault tolerance into query execution
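Since a failed query is simply run again, a client can wrap submission in a retry loop. The sketch below is a hypothetical client-side helper, not a Presto feature: `submit` is any callable that runs a query and raises on failure, and `RuntimeError` stands in for whatever error a real client would raise.

```python
def run_with_retry(submit, sql, attempts=3):
    """Re-submit the whole query if it fails; nothing is resumed mid-query."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return submit(sql)
        except RuntimeError as exc:  # stand-in for a query failure
            last_error = exc
    raise last_error
```

Each attempt starts from scratch, so this only makes sense for the short queries the article has in mind.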
As for the single point of failure at the coordinator and discovery server node, Presto does not yet seem to have addressed it