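Abstract: This demonstration presents SpatialHadoop, the first full-fledged MapReduce framework with native support for spatial data. SpatialHadoop is a comprehensive extension of Hadoop that builds spatial data support into Hadoop's core functionality. As a result, it performs better than existing Hadoop when processing spatial data. SpatialHadoop consists of a simple high-level spatial language, a two-level spatial index structure, basic spatial components built into the MapReduce layer, and three basic spatial operations (range query, k-NN query, and spatial join). Other spatial operations can be deployed on SpatialHadoop in the same way. This paper demonstrates a prototype system based on SpatialHadoop. The system runs on an Amazon EC2 cluster, and the spatial data comes from the Tiger files and OpenStreetMap, with sizes of 60 GB and 300 GB respectively.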
1. Introduction
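Many MapReduce-like systems, such as Hadoop, have become fairly mature, and many applications have been built on them, including machine learning [3], terabyte sorting [9], and image processing [1]. Over the years these systems have proven to be an effective framework for big data analysis. At the same time, spatial data has entered an era of explosive growth, with data coming from sources as varied as smartphones, medical devices, and space telescopes. Unfortunately, Hadoop is inherently ill-equipped for spatial data: its core framework is unaware of the properties of spatial data. Existing work on processing spatial data with Hadoop concentrates on specific data types and operations, such as range queries over trajectories [6] and kNN joins over point data [5,13]. The efficiency of these spatial operations is also limited by the internals of Hadoop itself.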

The SpatialHadoop platform presented in this paper is available online (http://spatialhadoop.cs.umn.edu/). SpatialHadoop is a comprehensive extension of Hadoop (about 12,000 lines of core code) that supports spatial constructs and spatial data at the code level. This keeps SpatialHadoop working the same way as Hadoop: programs are written in terms of Map and Reduce functions, so existing Hadoop programs also run on SpatialHadoop. For spatial data, however, SpatialHadoop performs considerably better than Hadoop. Figure 1 (a) and (b) show how a spatial range query is run on Hadoop and on SpatialHadoop, respectively. For the same query over 70 million spatial records on a 20-node cluster, Hadoop needs 200 s while SpatialHadoop needs only 2 s.
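SpatialHadoop embeds spatial constructs in every layer of Hadoop: the language layer, the storage layer, the MapReduce layer, and the operations layer. The language layer provides a simple high-level language for spatial data analysis that even non-technical users can work with. The storage layer provides a two-level spatial index mechanism: a global index that partitions data across nodes, and a local index that organizes the data inside each node. Grid file [7], R-tree [4], and R+-tree [11] indexes are built on this mechanism. The MapReduce layer adds two new spatial components for accessing indexed files, the SpatialFileSplitter and the SpatialRecordReader. The SpatialFileSplitter uses the global index to prune partitions that cannot contribute to the query result, while the SpatialRecordReader uses the local index to access the records inside each partition efficiently. The operations layer provides a set of spatial operations (range query, kNN, and spatial join) implemented with the indexes and the new spatial components of the MapReduce layer. Other spatial operations can be added to the platform in the same way.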
SpatialHadoop is an open-source platform that any contributor in the research community can extend. Its core components help users implement further spatial operations efficiently for their own applications. As a case study, SpatialHadoop already ships with three spatial operations: range query, k-nearest-neighbor query, and spatial join. We envision SpatialHadoop acting as a research vehicle on which more researchers share their spatial operations and analysis tools, growing into a rich system for developers, practitioners, and researchers.
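This paper introduces SpatialHadoop through a real prototype system. The system uses two datasets, taken from the Tiger files [12] and from OpenStreetMap [10], and runs on an Amazon EC2 cluster. The Tiger dataset contains 70 million records (60 GB) covering roads, water bodies, and other geographic features of the United States. The OpenStreetMap dataset contains roads, points of interest, and building boundaries for the whole world, and is 300 GB in size.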
2. SpatialHadoop Architecture
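Figure 2 shows the SpatialHadoop system architecture. A SpatialHadoop cluster has one master node that receives user queries, splits each of them into smaller tasks, and has these tasks executed by multiple slave nodes. Users fall into three classes according to how they interact with SpatialHadoop: casual users, developers, and administrators. Casual users (non-technical people) process their datasets through the language the platform provides. Developers (more advanced users) implement new spatial operations for specific applications. Administrators control the whole system by tuning system parameters in the configuration files.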
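SpatialHadoop follows a layered design with four layers: the language layer, the storage layer, the MapReduce layer, and the operations layer. The language layer provides a simple high-level SQL-like language that supports spatial data types and operations. The storage layer contains two levels of spatial index structure, global and local. The global index partitions data across computation nodes, and the local index organizes data inside each node. The MapReduce layer has two new spatial components, the SpatialFileSplitter and the SpatialRecordReader, which exploit the global index (to prune data that does not contribute to the query result) and the local index, respectively. The operations layer encapsulates the spatial operations implemented on top of the spatial indexes and the new MapReduce components. SpatialHadoop comes with efficient implementations of three basic spatial operations: range query, kNN, and spatial join. Other spatial operations can be added to the platform in a similar way.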

3. Language Layer
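SpatialHadoop provides a simple high-level language through which non-technical users can interact with the system. The language has built-in support for spatial data types, spatial primitive functions, and spatial operations. The spatial data types (point, rectangle, and polygon) define the schema of input files when they are loaded. The spatial primitive functions are distance, overlap, and MBR (minimum bounding rectangle). Distance computes the distance between the centroids of two spatial attributes, overlap tests whether two features share any area, and MBR computes the minimum bounding rectangle of a feature. The spatial operations (range query, kNN, and spatial join) take files with spatial attributes as input and produce an output file with the result.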
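The three primitives can be illustrated with a minimal Python sketch. It is not SpatialHadoop code: the `Rect` type and the function names `distance`, `overlaps`, and `mbr` are hypothetical, shapes are simplified to axis-aligned rectangles, and touching boundaries are assumed to count as overlapping.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical type); a point has x1 == x2 and y1 == y2."""
    x1: float
    y1: float
    x2: float
    y2: float

    def centroid(self):
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def distance(a, b):
    """Euclidean distance between the centroids of two shapes."""
    (ax, ay), (bx, by) = a.centroid(), b.centroid()
    return math.hypot(ax - bx, ay - by)


def overlaps(a, b):
    """True if the two rectangles share area or touch (closed intervals, an assumption)."""
    return a.x1 <= b.x2 and b.x1 <= a.x2 and a.y1 <= b.y2 and b.y1 <= a.y2


def mbr(vertices):
    """Minimum bounding rectangle of a polygon given as a list of (x, y) vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return Rect(min(xs), min(ys), max(xs), max(ys))
```

For example, `mbr([(0, 0), (4, 1), (2, 5)])` gives the rectangle spanning (0, 0) to (4, 5), which can then be passed to `overlaps` or `distance`.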
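Rather than developing a new spatial language from scratch, SpatialHadoop extends Pig Latin [8]. This keeps the original functionality of Pig Latin while adding spatial constructs. In particular, the SpatialHadoop language overrides the FILTER and JOIN keywords so that they perform a range query and a spatial join, respectively, when their parameters contain spatial predicates. For example, a FILTER with the Overlaps predicate makes SpatialHadoop run a range query. For kNN queries, a new KNN keyword is introduced. The following example finds the 100 houses nearest to the query point query_loc.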
houses = LOAD 'houses' AS (id:int, loc:point);
nearest_houses = KNN houses WITH_K=100 USING Distance(loc, query_loc);
4. Storage Layer
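In the storage layer, SpatialHadoop adds new spatial indexes that are adapted to the MapReduce environment. These indexes overcome the limitation that Hadoop supports only non-indexed heap files. Applying traditional spatial indexes directly in Hadoop faces two challenges. First, traditional spatial indexes follow the procedural programming paradigm, whereas SpatialHadoop uses the MapReduce programming paradigm. Second, traditional indexes are designed for local file systems, whereas SpatialHadoop uses the Hadoop Distributed File System (HDFS), which has an inherent limitation: files can only be written in an append-only manner and cannot be modified once written. To overcome these challenges, SpatialHadoop organizes its index in two levels, a global index and a local index. The global index partitions data across the nodes of the cluster, while the local index organizes data efficiently inside each node. This separation of global and local indexes fits the MapReduce programming paradigm: the global index is used when preparing a MapReduce job, and the local indexes are used when processing the map tasks. Breaking a file into smaller partitions allows each partition to be indexed in memory and then written to a file sequentially.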
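The global index is kept in the main memory of the master node, while each local index is stored in one file block (typically 64 MB) on a slave node. SpatialHadoop supports grid file [7], R-tree [4], and R+-tree [11] indexes. An index is built on an existing file by issuing the new file system command writeSpatialFile (provided by SpatialHadoop), for which the user specifies the input file, the column to index, and the index type.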
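An index is built by a MapReduce job in three phases: partitioning, local indexing, and global indexing. In the partitioning phase, the file is spatially partitioned so that each partition is a rectangle whose contents fit in one file block (64 MB). The grid index partitions with a uniform grid. The R-tree and R+-tree use a distribution-aware R-tree partitioning: a random sample is read from the input file, the sample is bulk loaded into a temporary in-memory R-tree, and the boundaries of its leaf nodes are then used to partition the whole file. Note that in the grid and R+-tree indexes, when each record is written to its best-fitting partition, a record that overlaps several partitions may be replicated [4]. During query processing the replicated records are handled in a post-processing step, which avoids duplicate results. In the local indexing phase, an index of the requested type is built independently for each partition and flushed to one HDFS file block, which is annotated with the MBR of the partition. Since each partition has a bounded size (64 MB), its local index is constructed in memory before being written to the file in one shot. The last phase is global indexing. The files containing the local indexes are concatenated into one large file, and a global index that indexes all partitions by their MBRs is built and stored in the main memory of the master node. If the system fails, the global index is rebuilt when needed.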
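The three phases can be sketched for the grid index in plain Python. This is only an illustration under stated assumptions, not the MapReduce job itself: records are MBR tuples `(x1, y1, x2, y2)`, the function names are hypothetical, a sorted list stands in for a real local index, and the partition size limit (one 64 MB block) is ignored.

```python
def grid_cells(space, rows, cols):
    """Split the space (x1, y1, x2, y2) into rows x cols equal rectangles."""
    x1, y1, x2, y2 = space
    w = (x2 - x1) / cols
    h = (y2 - y1) / rows
    return [(x1 + c * w, y1 + r * h, x1 + (c + 1) * w, y1 + (r + 1) * h)
            for r in range(rows) for c in range(cols)]


def overlaps(a, b):
    """Closed-interval overlap test between two (x1, y1, x2, y2) rectangles."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partition(records, cells):
    """Phase 1: write each record to every grid cell it overlaps (replication)."""
    parts = {pid: [] for pid in range(len(cells))}
    for rec in records:
        for pid, cell in enumerate(cells):
            if overlaps(rec, cell):
                parts[pid].append(rec)
    return parts


def build_index(records, space, rows, cols):
    cells = grid_cells(space, rows, cols)
    parts = partition(records, cells)
    # Phase 2: one local index per non-empty partition (a sorted list as a stand-in).
    local = {pid: sorted(recs) for pid, recs in parts.items() if recs}
    # Phase 3: the global index maps each partition's boundary to its id.
    global_index = [(cells[pid], pid) for pid in sorted(local)]
    return global_index, local
```

A record that straddles cell boundaries ends up in several partitions, which is why queries over a grid or R+-tree index need a duplicate-elimination step.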
5. MapReduce Layer
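The MapReduce layer of traditional Hadoop is designed to process non-indexed heap files. Spatial operations in SpatialHadoop take spatially indexed files as input, and these must be processed differently. In addition, some spatial operations, such as spatial join, are binary operations that take two input files. To process such indexed files, SpatialHadoop introduces two new components in the MapReduce layer, the SpatialFileSplitter and the SpatialRecordReader, which exploit the global and local indexes, respectively, for efficient data access.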
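The SpatialFileSplitter takes one or two spatially indexed files as input, along with a user-provided filter function. It then uses the global index to prune file blocks that cannot contribute to the query result (for example, blocks outside the query range), based on the minimum bounding rectangle assigned to each block when the index was created. For binary operations with two input files, the SpatialFileSplitter uses the two global indexes to select pairs of file blocks that must be processed together (for example, overlapping blocks in a spatial join). The SpatialRecordReader exploits the local index to retrieve the relevant records in a block instead of looping over all of them. It reads the local index from the assigned partition and passes a pointer to this index to the map function, which uses the index to select the records to process without iterating over the whole block. Together, the SpatialFileSplitter and the SpatialRecordReader help developers write many spatial operations as MapReduce programs.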
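The pruning step can be sketched as follows. This is an illustrative Python sketch, not the SpatialFileSplitter API: the global index is assumed to be a list of `(mbr, block_id)` pairs with MBRs as `(x1, y1, x2, y2)` tuples, and `split_blocks`, `split_block_pairs`, and `range_filter` are hypothetical names.

```python
def overlaps(a, b):
    """Closed-interval overlap test between two (x1, y1, x2, y2) rectangles."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def split_blocks(global_index, block_filter):
    """Unary case: keep only the blocks whose MBR passes the user-provided filter."""
    return [block_id for mbr, block_id in global_index if block_filter(mbr)]


def split_block_pairs(index_a, index_b, pair_filter):
    """Binary case: keep only the pairs of blocks that must be processed together."""
    return [(id_a, id_b)
            for mbr_a, id_a in index_a
            for mbr_b, id_b in index_b
            if pair_filter(mbr_a, mbr_b)]


def range_filter(query_range):
    """Filter function for a range query: a block matters only if it overlaps the range."""
    return lambda mbr: overlaps(mbr, query_range)
```

A range query would call `split_blocks(index, range_filter(query))`, while a spatial join would call `split_block_pairs(index_a, index_b, overlaps)`. Only the surviving blocks or block pairs become map tasks.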
6. Operations Layer
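The spatial indexes built in the storage layer, together with the new components in the MapReduce layer, allow SpatialHadoop to implement efficient spatial operations. This demonstration presents the implementation of three case-study operations, range query, kNN, and spatial join, and shows how they use the storage and MapReduce layers of SpatialHadoop. Other spatial operations, such as kNN join and shortest-path analysis, can be implemented in a similar way.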
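In a range query, the SpatialFileSplitter uses the global index to select only the blocks that overlap the query range. For each selected block, the SpatialRecordReader extracts the local index stored in the block, and a traditional range query is run on this index to find the matching records. For records that were replicated during indexing, the reference point duplicate avoidance technique [2] ensures that each result record is reported only once.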
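A minimal Python sketch of this range query follows, under several assumptions that are not taken from SpatialHadoop: partitions are `(cell, records)` pairs with rectangles as `(x1, y1, x2, y2)` tuples, a linear scan stands in for the local index search, the reference point is taken to be the lower-left corner of the intersection of the record and the query range, and cells are treated as half-open so that every reference point falls in exactly one cell.

```python
def overlaps(a, b):
    """Closed-interval overlap test between two (x1, y1, x2, y2) rectangles."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def reference_point(rec, query):
    """Lower-left corner of the intersection of the record and the query range."""
    return (max(rec[0], query[0]), max(rec[1], query[1]))


def owns_point(cell, point):
    """Half-open containment, so each point belongs to exactly one grid cell."""
    return cell[0] <= point[0] < cell[2] and cell[1] <= point[1] < cell[3]


def range_query(partitions, query):
    results = []
    for cell, records in partitions:
        if not overlaps(cell, query):
            continue  # pruned using the global index
        for rec in records:  # a linear scan stands in for the local index search
            if overlaps(rec, query) and owns_point(cell, reference_point(rec, query)):
                results.append(rec)  # reported only by the cell owning the reference point
    return results
```

A replicated record matches in every partition that stores it, but only the partition whose cell contains the reference point emits it, so no separate deduplication pass is needed.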
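The kNN operation runs in two iterations. In the first iteration, the SpatialFileSplitter uses the global index to select the block that contains the query point. The SpatialRecordReader extracts the local index of this block, and the k nearest neighbors are searched within the block. To check whether this answer is correct, a test circle is drawn centered at the query point with radius equal to the distance to the k-th nearest neighbor. If the test circle lies entirely inside the processed block, the answer is final. If the test circle overlaps other partitions, a second iteration processes those overlapping partitions.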
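The two iterations can be sketched in Python as below. This is a sketch under assumptions rather than the SpatialHadoop implementation: partitions are `(cell, points)` pairs, each point is stored in exactly one partition, sorting stands in for the local index search, and when the home partition holds fewer than k points the sketch simply examines every partition (a case the text does not cover).

```python
import math


def knn(partitions, q, k):
    """partitions: list of (cell, points) with cell = (x1, y1, x2, y2); q = (x, y)."""
    def dist(p):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def contains(cell, p):
        return cell[0] <= p[0] <= cell[2] and cell[1] <= p[1] <= cell[3]

    def circle_inside(cell, r):
        return (q[0] - r >= cell[0] and q[0] + r <= cell[2]
                and q[1] - r >= cell[1] and q[1] + r <= cell[3])

    def circle_overlaps(cell, r):
        # Distance from q to the nearest point of the cell, compared to the radius.
        dx = max(cell[0] - q[0], 0.0, q[0] - cell[2])
        dy = max(cell[1] - q[1], 0.0, q[1] - cell[3])
        return math.hypot(dx, dy) <= r

    # Iteration 1: search only the partition that contains the query point.
    home = next(part for part in partitions if contains(part[0], q))
    answer = sorted(home[1], key=dist)[:k]
    # Radius of the test circle; infinite if fewer than k points were found (assumption).
    radius = dist(answer[-1]) if len(answer) == k else math.inf
    if circle_inside(home[0], radius):
        return answer

    # Iteration 2: also search every other partition that the test circle overlaps.
    candidates = list(answer)
    for part in partitions:
        if part is not home and circle_overlaps(part[0], radius):
            candidates.extend(part[1])
    return sorted(candidates, key=dist)[:k]
```

When the query point sits near a partition boundary, the first answer may miss closer points in a neighboring partition, which is exactly the case the second iteration repairs.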
For a spatial join, the SpatialFileSplitter uses the two global indexes of the two input files to find all pairs of overlapping partitions. Each pair is processed by the SpatialRecordReader, which uses the local indexes to find the overlapping records.
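The join can be sketched in Python under simplifying assumptions: each input is a list of `(partition_mbr, records)` pairs with rectangles as `(x1, y1, x2, y2)` tuples, each record is stored in exactly one partition (so no duplicate elimination is shown), and a nested loop stands in for the local index lookup. The name `spatial_join` is hypothetical.

```python
def overlaps(a, b):
    """Closed-interval overlap test between two (x1, y1, x2, y2) rectangles."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def spatial_join(parts_a, parts_b):
    results = []
    for mbr_a, recs_a in parts_a:
        for mbr_b, recs_b in parts_b:
            if not overlaps(mbr_a, mbr_b):
                continue  # partition pair pruned using the two global indexes
            # Nested loop as a stand-in for joining the two local indexes.
            for ra in recs_a:
                for rb in recs_b:
                    if overlaps(ra, rb):
                        results.append((ra, rb))
    return results
```

Partition pairs whose MBRs do not overlap are never read, which is where most of the saving over a full cross product comes from.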
7. Demonstration Scenario
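This paper demonstrates a SpatialHadoop prototype system (http://spatialhadoop.cs.umn.edu/) running on an Amazon EC2 cluster of 20 nodes. Two datasets are used, the Tiger files [12] and OpenStreetMap [10]. From the Tiger files we extracted three files containing the road segments, rivers, and lakes of the United States. From OpenStreetMap we extracted the road segments, points of interest, parks, building boundaries, and similar features of the whole world. Participants access the Amazon EC2 cluster from a front-end machine (for example, a laptop), while all processing is done on the cluster in the back end.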

7.1 Front End
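Figure 3 shows the front end of the system, which helps users and administrators interact with SpatialHadoop and provides query and visualization tools. On the left is a selection control listing the files loaded in the system. Users can upload a new file with the load button or remove an existing file with the delete button. When a file is selected, its contents are displayed on the right of the screen. When several files are selected, they are drawn in different colors to tell them apart. In Figure 3, the blue and red lines represent the water bodies (rivers and lakes) and the roads of the United States, respectively. Users can then run a query (range query, kNN, or spatial join) from the toolbar at the top. The front end shows the progress of the query, and when the query finishes its result is displayed in the front end.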

7.2 Operations
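First, the user selects a file and clicks to display it on the screen. The display is produced by a MapReduce job that renders the data of the selected file into an image. The generated image contains only the spatial attributes in the file, and each record is drawn according to its data type (point, rectangle, or polygon). As Figure 3 shows, the boundaries of the global index can also be drawn on the screen so that users can see how the data is indexed. The system lets users compare the grid index with the R-tree index, showing that the grid index suits uniformly distributed datasets better, while the R-tree index suits skewed data better. Because the data is skewed, the boundaries in the figure were generated by an R-tree index. Displaying index boundaries is optional and serves only to show the internals of the system.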
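Once a file is selected, the user can run a query by choosing an operation from the toolbar at the top. The available operations are range query, kNN, and spatial join. Only the spatial join, being a binary operation, requires two files to be selected. As Figure 4 shows, after the user chooses an operation a dialog box pops up in which the user fills in the query parameters and the name of the output file. For a range query, the user provides the two corner points of the query range. For kNN, the user provides the query point and the number of neighbors (k). For a spatial join, the user only needs to provide the join predicate, which defaults to overlap. An interesting example is to join parks with lakes to find all parks that contain a lake, and display the result on the screen. As Figure 4 shows, once the query parameters are set the front end displays how the query is written in the SpatialHadoop spatial language. When the user submits the query, the front end sends it to the back end for processing. As Figure 5 shows, the user can follow the whole progress of query processing in the back end. Until all jobs finish, this administration interface lists the progress of every running job. The user can also submit further queries, which are processed in the back end at the same time. As soon as a query completes successfully, its result is displayed on the screen.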
7.3 Comparison with Hadoop

ΪÁ˶ԱÈSpatialHadoopºÍHadoop£¬±¾ÎÄÓִÁËÒ»¸öÓµÓÐ20¸ö½ÚµãµÄHadoop¼¯Èº¡£Óû§¿ÉÒÔÔÚÁ½¸ö¼¯Èº£¨Hadoop¼¯ÈººÍSpatialHadoop£©ÉÏÖ´ÐÐÏàͬµÄ²éѯ£¬Í¬Ê±¹Û²ìÁ½ÕßµÄÖ´Ðнø¶È¡£ÓÉÓÚSpatialHadoop±£ÁôÁË´«Í³HadoopµÄ¹¦ÄÜ£¬ËùÒԷǿռä²éѯҲ¿ÉÒÔÔÚSpatialHadoopÉÏûÓÐÈκÎÌõ¼þµÄÔËÐС£ÕâÑùÓû§¿ÉÒÔ²âÊԷǿռä²éѯ¹¦ÄÜÀ´±È½ÏÁ½¸ö¼¯ÈºµÄÐÔÄÜ¡£
7.4 Installation and Configuration
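SpatialHadoop is open source and publicly available online. The demonstration includes a quick installation guide showing how to install and run SpatialHadoop on a single machine. The first step is to download the installation archive and extract it to a local disk. The installation is then configured by editing the configuration files. After that, the SpatialHadoop service is started, and some example operations can be run against it. More information on these steps is available on the official SpatialHadoop web page (http://spatialhadoop.cs.umn.edu/).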