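Seeing the title, some readers may ask: isn't OSS meant for storing images, videos, and files? Can you really build tables and a data warehouse on top of it, and how does it do on computing efficiency and cost?
This article gives the basic conclusions first:
Object Storage Service (OSS) is a massive, secure, and highly reliable cloud storage service built on Alibaba Cloud's Apsara (飞天) distributed system. It is large-scale, general-purpose storage for the Internet that provides a RESTful API and scales elastically in both capacity and processing.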
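- Can data tables be created on OSS?
Since camera streams can already be pushed straight into OSS, building tables is a small case by comparison. Moreover, in 2016, with the help of 亦龙 (Yilong), the Hadoop community added OSS support to its official release, a new milestone in integrating Alibaba Cloud storage with open source.
Today, to lower the barrier to building tables on OSS, LogHub in Log Service (formerly SLS) supports real-time writes to tables on OSS (table types include TextFile and the columnar format Parquet), along with compression and data Partition settings. On the compute side, we have integrated with Alibaba Cloud engines (MaxCompute, E-MapReduce) and mainstream open-source engines (Presto and others), so multiple compute engines can be plugged in and swapped seamlessly.
Since tables can be built directly on HDFS or MaxCompute (formerly ODPS), why choose OSS to store table data?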
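The trend toward separating storage and compute
In 2009, the key word in large-scale computing was "Locality": move computation as close to the data as possible to improve efficiency. The widely accepted model at the time was to build a sufficiently large resource pool and co-locate data and compute inside it to exploit economies of scale.
In recent years, however, the ecosystem and the environment have quietly changed: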
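- Compute model: full-data batch computation is gradually being caught up by more efficient models such as Impala and Presto
- Storage format: columnar and indexing technologies such as ORC, Parquet, and Kudu have emerged, so computation no longer needs to Scan large blocks of data
- Network architecture: 25G networks are coming online, and technologies such as FPGA are also speeding up the network
- Storage media: heavy use of mixed media such as SSD, AliFlash, and 3D X-Point makes storage both fast and powerful
- Compute platform: GPU, FPGA, and even the upcoming TPU are changing the shape of computing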
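These changes lead us to the following observation:
The approach of one machine model handling both storage and compute has evolved into a trend where storage and compute are each offered as separate services, connected by a high-speed network.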

This frees storage and compute from constraints such as "machine model", "rack", and "power", so each can innovate in the area it is best at. In the industry's work on "layering", we have seen attempts of this kind:
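Case 1: Netflix's S3-based solution
Netflix is a showcase of innovation on AWS, especially its big data business. According to slides from Re:Invent 2016, Netflix adds 500 billion log events per day (500 TB of data), has an existing data warehouse of 60 PB, and runs computation over 3 PB of that data every day.
In the slides, Netflix says that starting in 2014 it decided to break down the barriers between systems, using S3 as the unified storage layer at the bottom and building its various compute engines on top. This has proven to be the right move: massive storage and compute capacity fully unleashed business innovation, and Netflix became a model on AWS that others are proud to learn from.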

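Inspired by Netflix, AWS launched a new compute product, Athena, at Re:Invent 2016. It offers Presto as a service, providing Ad-Hoc Query over various storage services.
AWS Athena executes queries using compute resources in multiple Availability Zones and uses S3 as the underlying data store. Because the data is stored redundantly across multiple facilities and across multiple devices in each facility, the service is highly available and reliable.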
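Case 2: Facebook's RocksDB project
Google open-sourced LevelDB, and Facebook took it to a new level by reworking it into RocksDB. Beyond its many optimizations to the LSM model, another very attractive aspect of RocksDB is how well it adapts to storage media and the compute layer, so it can make full use of both compute and storage performance. The underlying media and storage are hot-pluggable and transparent to the upper-layer API, which makes it an elegant case of separating storage and compute at the software design level.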

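Advantages of building a data warehouse on OSS
Advantage 1: Unlimited storage space
The most important requirement for a data warehouse is massive storage that can sustain high data throughput for computation and analysis. OSS is a very good fit here.
With a sensible OSS directory layout, large numbers of files (a million or more) can be partitioned so that, together with the compute engine, they yield higher computing efficiency. LogHub delivery to OSS supports Hive-style partition directories: data is stored by date, and multi-dimensional partitions can be configured.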
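For example, suppose we have an application called my-app and create a data warehouse project my-dw for it, with a set of tables in the project. Take one table, my-table: its data is partitioned by time (day), so that, for instance, date='20170330' denotes the directory holding that day's data.
The hierarchy of the whole warehouse maps to an OSS access path:
- my-app is the OSS bucket name
- my-dw is the data warehouse project name (namespace)
- my-table is the table name
- date=20170330 is a one-dimensional partition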
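
The mapping from warehouse hierarchy to object path can be sketched in a few lines of Python. This is only an illustration: the `oss://` scheme and the helper names `partition_prefix` and `prefixes_for_range` are assumptions made for the example, not part of OSS or LogHub.

```python
from datetime import date, timedelta


def partition_prefix(bucket: str, project: str, table: str, day: date) -> str:
    """Build a Hive-style prefix: bucket / project / table / date=YYYYMMDD."""
    # The "oss://" scheme is an assumption used for readability.
    return f"oss://{bucket}/{project}/{table}/date={day:%Y%m%d}/"


def prefixes_for_range(bucket: str, project: str, table: str,
                       start: date, end: date) -> list[str]:
    """Return one prefix per day in [start, end], so a query only touches those days."""
    days = (end - start).days
    return [
        partition_prefix(bucket, project, table, start + timedelta(days=i))
        for i in range(days + 1)
    ]


print(partition_prefix("my-app", "my-dw", "my-table", date(2017, 3, 30)))
# prints: oss://my-app/my-dw/my-table/date=20170330/
```

Because each day lives under its own prefix, a query over a date range only needs the prefixes returned by `prefixes_for_range`, rather than every object in the table.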

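Advantage 2: Very low storage cost
OSS is one of the "cheapest" storage products that offers real-time reads and writes. For 100 GB of log data:
- With columnar encoding (Parquet, for example) and snappy compression, the stored volume is about 8 GB
- At the current price on the OSS website, one month of storage costs 8 * 0.148 = 1.184 yuan
- In addition, OSS has two storage forms that data can be freely converted to according to access frequency: IA (infrequent access) and Archive (cold backup), which can cut costs by up to 60%. OSS, IA, and Archive share the same data model, so converting between them is very convenient.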

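Advantage 3: One copy of the data, many compute engines
We can store data in a common format (for example textfile, sequence file, or parquet). Data on OSS currently works with the following compute engines:
- Open source: Spark, Presto, Druid, Pig, Hive, and others
- Alibaba Cloud: MaxCompute, E-MapReduce, RDS-PG, Batch Compute, and others
All of these engines are hot-pluggable with respect to storage, so they can easily be switched and combined across test and production datasets of different sizes.
Compared with traditional data warehouse solutions, storing data on OSS lets compute follow Schema on Read, which greatly increases the freedom of data analysis.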

Besides supporting multiple compute engines, OSS itself provides Geo-Replication, which synchronizes data between Regions in near real time. By not putting all the eggs in one basket, it further improves the safety of important data.
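Advantage 4: Computing efficiency on par with HDFS-style storage
Judging by its API, OSS does not look as fine-grained as HDFS-style storage, so does that mean its performance must be poor?
Take a Map-Reduce job as an example. During execution, OSS is used in three places:
- Scheduling: when a query is submitted, the OSS directories are listed according to the data range to be computed, in order to build a plan and determine how many files and directories take part
- Execution: each Worker scans the files under its assigned directories according to the plan, reads them, and performs the custom computation
- Results: when the computation finishes, the output is written to OSS (Shuffle files produced by intermediate results can be written to the local machine to optimize performance; in some scenarios OSS can be used for them as well)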
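
The three touch points can be simulated with an in-memory dictionary standing in for a bucket. This is a sketch under stated assumptions, not OSS SDK code: `store`, `list_objects`, `plan`, `run_worker`, and `run_job` are hypothetical names, and the sample objects are invented for the example.

```python
from collections import Counter

# Hypothetical in-memory stand-in for a bucket: object key -> content.
store = {
    "my-dw/my-table/date=20170329/part-0": "GET 200\nGET 500",
    "my-dw/my-table/date=20170330/part-0": "GET 200\nPOST 200",
    "my-dw/my-table/date=20170330/part-1": "GET 404",
}


def list_objects(prefix: str) -> list[str]:
    """Scheduling: list the keys under a prefix (stands in for a List call)."""
    return sorted(key for key in store if key.startswith(prefix))


def plan(prefix: str, workers: int) -> list[list[str]]:
    """Split the listed keys round-robin across workers."""
    keys = list_objects(prefix)
    return [keys[i::workers] for i in range(workers)]


def run_worker(keys: list[str]) -> Counter:
    """Execution: read each assigned object and count rows per status code."""
    counts = Counter()
    for key in keys:
        for line in store[key].splitlines():
            _method, status = line.split()
            counts[status] += 1
    return counts


def run_job(prefix: str, output_key: str, workers: int = 2) -> Counter:
    # Partial results stay in local memory, mirroring shuffle data kept on the worker.
    partials = [run_worker(keys) for keys in plan(prefix, workers)]
    total = sum(partials, Counter())
    # Results: write the final output back to the store.
    store[output_key] = "\n".join(f"{s} {n}" for s, n in sorted(total.items()))
    return total


result = run_job("my-dw/my-table/date=20170330/", "my-dw/results/status-count")
print(result)
```

Only the listing, the object reads, and the final write touch the store; everything in between runs on the workers, which is why a plain List/Get/Put interface is sufficient for this pattern.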

As this shows, for Ad-Hoc Query scenarios OSS is fully capable in each of these usage patterns.
Getting started with data analysis on OSS
Writing data
Write logs directly to OSS in near real time, in JSON or Parquet format. The delivery rule is configured as follows:

The data is stored in OSS as follows:

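Advantages of writing through LogHub: many options for getting data into LogHub, a fully managed archiving service, near-real-time delivery, retry on failure, and STS authorization. To learn about OSS delivery, see the documentation.
Alternatively, write with any of the OSS SDKs or the API, a fully self-managed way of writing; see the documentation.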
Compute engines
- E-MapReduce/Spark/Hive users: see the community documentation.
- MaxCompute (ODPS) users: the feature is in internal beta.
- PG users: please contact 铁庵.
- Presto users: Local File mode; see the community documentation.
- Others: a single Get at any time retrieves all the data.