Creating MapReduce queries to process specific types of data
 
×÷ÕߣºMartin C. Brown À´Ô´£ºIBM ·¢²¼ÓÚ 2016-1-28
 

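Patterns and examples for different scenarios

MapReduce processing has created a whole set of new paradigms and structures for processing and building different types of queries. However, getting the most out of Hadoop means writing the appropriate MapReduce query to process the information. This article looks at a number of different scenarios, with cookbook-style examples of how to develop different types of queries.

Advanced text processing

Processing text is a common use of the MapReduce process, because text processing is relatively complex and processor-intensive. The basic word count is often used to demonstrate Hadoop's ability to process large amounts of text and to summarize the broad content.

To obtain a word count, split the text from an input file (using a basic string tokenizer) into individual words, each paired with a count, and use a Reduce to count the occurrences of each word. For example, from the phrase the quick brown fox jumps over the lazy dog, the Map phase generates the output in Listing 1.

Listing 1. Output of the Map phase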

the, 1
quick, 1
brown, 1
fox, 1
jumps, 1
over, 1
the, 1
lazy, 1
dog, 1

The Reduce phase then totals the number of occurrences of each unique word, giving the output shown in Listing 2.

Listing 2. Output of the Reduce phase

the, 2
quick, 1
brown, 1
fox, 1
jumps, 1
over, 1
lazy, 1
dog, 1
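
The two phases above can be sketched in plain Java, without any Hadoop classes, to make the data flow concrete. This is only an illustration under stated assumptions: the class name WordCountSketch and its map and reduce methods are hypothetical stand-ins for a real Mapper and Reducer, and an in-memory list replaces the shuffle between them.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical stand-in for a Hadoop job; no Hadoop classes are used.
final class WordCountSketch {

    // Map phase: split the text with a basic tokenizer and emit (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String text) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(text);
        while (tokens.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // Reduce phase: sum the counts for each unique word, keeping first-seen order.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String phrase = "the quick brown fox jumps over the lazy dog";
        reduce(map(phrase)).forEach((word, n) -> System.out.println(word + ", " + n));
    }
}
```

Run against the sample phrase, map produces the nine pairs of Listing 1 and reduce produces the eight totals of Listing 2, with the, 2 first.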

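Although this approach works for a basic word count, you often want to identify occurrences of important phrases or words. Take, for example, the reviews of different films and videos on Amazon.

Using information from the Stanford University big data project, you can download movie review data (see Resources). The data includes the score and the helpfulness of the original review (as reported on Amazon), as shown in Listing 3.

Listing 3. Downloaded movie review data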

product/productId: B003AI2VGA
review/userId: A3QYDL5CDNYN66
review/profileName: abra "a devoted reader"
review/helpfulness: 0/0
review/score: 2.0
review/time: 1229040000
review/summary: Pretty pointless fictionalization
review/text: The murders in Juarez are real. This movie is a badly acted
fantasy of revenge and holy intercession. If there is a good movie about
Juarez, I don't know what it is, but it is not this one.

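Note that although the reviewer gave the movie a score of 2 (where 1 is the worst and 5 is the best), the review text describes it as a really bad movie. We need a confidence score so that we can tell whether the score given and the actual review match each other.

Many tools can perform advanced heuristic analysis, but basic processing can be done with a simple index or regular expression. We can then count the positive and negative regular-expression matches to obtain a score for a movie.

Figure 1. Counting positive and negative regular-expression matches to obtain a score for a movie

The figure shows how a movie score is derived from the word scores in the raw data

For the Map part, count the individual words or phrases in the movie review, producing a single count that combines the positive and negative ratings. The Map operation computes a score for the movie from each product review, and the Reduce operation then sums those scores by product ID to give a positive or negative rating. The Map therefore looks like Listing 4.

Listing 4. Map function that provides a single count for positive and negative reviews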

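import java.util.regex.Matcher;
import java.util.regex.Pattern;

// List of positive words/phrases
static String[] pwords = {"good","excellent","brilliant movie"};
// List of negative words/phrases
static String[] nwords = {"poor","bad","unwatchable"};

// INPUT holds the review text; count is the net score for this review
int count = 0;
// Add one for every whole-word match of a positive word or phrase
for (String word : pwords) {
    String REGEX = "\\b" + word + "\\b";
    Pattern p = Pattern.compile(REGEX);
    Matcher m = p.matcher(INPUT);
    while (m.find()) {
        count++;
    }
}
// Subtract one for every whole-word match of a negative word or phrase
for (String word : nwords) {
    String REGEX = "\\b" + word + "\\b";
    Pattern p = Pattern.compile(REGEX);
    Matcher m = p.matcher(INPUT);
    while (m.find()) {
        count--;
    }
}

// Emit the net score keyed by product ID
output.collect(productId, count);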

The Reduce can then be calculated like a traditional sum of the content.

Listing 5. Reduce function that sums positive and negative reviews by product ID

public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key,
            Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

The result is a confidence score for the reviews. You can extend the word lists to include the phrases you want to match.
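
To try the scoring logic on its own, the following sketch pulls the regular-expression counting out of the Map function into plain Java. The class name ReviewScoreSketch and its methods are hypothetical, and two choices go beyond the article's listing and should be read as assumptions: matching is case-insensitive, and each phrase is passed through Pattern.quote so that punctuation in a phrase is not treated as regex syntax.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; it reuses the word lists from Listing 4.
final class ReviewScoreSketch {
    static final String[] POSITIVE = {"good", "excellent", "brilliant movie"};
    static final String[] NEGATIVE = {"poor", "bad", "unwatchable"};

    // Count whole-word, case-insensitive occurrences of a phrase in the text.
    static int occurrences(String phrase, String text) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(phrase) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Net score: +1 per positive match, -1 per negative match.
    static int score(String text) {
        int count = 0;
        for (String phrase : POSITIVE) {
            count += occurrences(phrase, text);
        }
        for (String phrase : NEGATIVE) {
            count -= occurrences(phrase, text);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(score("A brilliant movie with an excellent cast"));
        System.out.println(score("Poor script, bad acting, unwatchable"));
    }
}
```

Applied to the review text in Listing 3, this scorer returns 1: "good" matches in "a good movie", while "badly" does not match the whole word "bad". That is a useful reminder that a simple word list can misread a negative review, so the lists need to be chosen with care.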

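Reading and writing JSON data

JSON has become a practical data-interchange format. Its usefulness comes partly from its simple nature and structure, and from how easily it can be parsed in so many languages and environments.

When parsing incoming JSON data, the most common format is one JSON record per notional input line.

Listing 6. One JSON record per notional input line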

{ "productId" : "B003AI2VGA", "score": 2.0, "text" : "..."}
{ "productId" : "B007BI4DAT", "score": 3.4, "text" : "..."}
{ "productId" : "B006AI2FDH", "score": 4.1, "text" : "..."}

This data can be parsed easily by converting the incoming string into a JSON object with a suitable class, such as GSON. When you use this approach with GSON, you need to deserialize into a predetermined class.

Listing 7. Deserializing into a predetermined class

class amazonRank {
    private String productId;
    private float score;
    private String text;
    amazonRank() {
    }
}

Parse the incoming text as follows.

Listing 8. Parsing the incoming text

 public void map(Object key, Text value, Context context)
throws IOException, InterruptedException {
try {

amazonRank rank = gson.fromJson(value.toString(),amazonRank.class);
...

To write JSON data, do the reverse. Within your MapReduce definition, create an output class that matches the JSON output you want, and then use the GSON class to convert an instance of it into a JSON representation of that structure.

Listing 9. Writing JSON data

class recipeRecord {
    private String recipe;
    private String recipetext;
    private int recipeid;
    private float calories;
    private float fat;
    private float weight;
    recipeRecord() {
    }
}

Now you can populate an instance of the object during output and convert it into a single JSON record.

Listing 10. Populating an instance of the object during output

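recipeRecord recipe = new recipeRecord();
// recipeid is an int, so parse it from the key
recipe.recipeid = Integer.parseInt(key.toString());
recipe.calories = sum;

Gson json = new Gson();
// Emit the populated object as a single JSON record
output.collect(key, new Text(json.toJson(recipe)));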

If you use a third-party library in a Hadoop processing job, make sure you include the library JAR files with your MapReduce code: $ jar -cvf recipenutrition.jar -C recipenutrition/* google-gson/gson.jar.

Although it sits outside the Hadoop MapReduce processor, another alternative is to use Jaql, which parses and processes JSON data directly.

Merging datasets

Three types of merge are typically performed within a MapReduce job:

Combining the content of multiple files that have the same structure.

Combining the content of multiple files that have a similar structure.

Joining data from multiple sources that relates to a specific ID or keyword.

The first option is best handled outside a typical MapReduce job, because it can be done with the Hadoop Distributed File System (HDFS) getmerge operation or something similar. This operation accepts a single directory as input and writes to a specified file. For example, $ hadoop fs -getmerge srcfiles megafile merges all the files in the srcfiles directory into one file: megafile.

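Merging similar files

To merge files that are similar but not identical, the main problem is how to identify the format used on input and how to specify the format of the output. For example, given one file with the fields name, phone, count and a second file with the fields name, email, phone, count, you are responsible for determining which file is which and for performing the Map that generates the structure you need. For more complex records, you might need a more complex merge of fields, with and without null values, during the Map phase to generate the information.

In practice, Hadoop is not ideal for this process unless you also use it as an opportunity to simplify, count, or otherwise reduce the information. That is, you identify the number of incoming records and the likely formats, and perform a Reduce on the fields you want to select.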
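
One way to picture this is a Map step that normalizes both layouts into a single structure, followed by a Reduce that combines records by name. The sketch below is plain Java rather than Hadoop code, and it rests on assumptions that are not in the article: the layout is recognized purely by its number of comma-separated fields, a missing email is represented by an empty field, and the class SimilarFileMergeSketch and the sample records are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: normalize two similar layouts, then reduce by name.
final class SimilarFileMergeSketch {

    // Map step: turn either layout into "name,email,phone,count".
    static String normalize(String line) {
        String[] fields = line.split(",", -1);
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        if (fields.length == 3) {          // name, phone, count
            return String.join(",", fields[0], "", fields[1], fields[2]);
        } else if (fields.length == 4) {   // name, email, phone, count
            return String.join(",", fields);
        }
        throw new IllegalArgumentException("Unrecognized record: " + line);
    }

    // Reduce step: per name, sum the counts and keep the first non-empty email and phone.
    static Map<String, String> reduce(List<String> normalized) {
        Map<String, String[]> byName = new LinkedHashMap<>();
        for (String record : normalized) {
            String[] fields = record.split(",", -1);
            String[] merged = byName.computeIfAbsent(fields[0],
                    name -> new String[] {name, "", "", "0"});
            if (merged[1].isEmpty()) merged[1] = fields[1];
            if (merged[2].isEmpty()) merged[2] = fields[2];
            merged[3] = String.valueOf(
                    Integer.parseInt(merged[3]) + Integer.parseInt(fields[3]));
        }
        Map<String, String> out = new LinkedHashMap<>();
        byName.forEach((name, merged) -> out.put(name, String.join(",", merged)));
        return out;
    }

    public static void main(String[] args) {
        List<String> normalized = Arrays.asList(
                normalize("Alice, 555-0100, 3"),
                normalize("Alice, alice@example.org, 555-0100, 2"));
        System.out.println(reduce(normalized).get("Alice"));
    }
}
```

With the two sample records, the reduced value for Alice is alice@example.org with a combined count of 5. Real data would need a more robust way of telling the formats apart than counting fields.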

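Joins

Although there are some potential solutions for performing joins, they often rely on processing the information in a structured way and then using that structure to decide what to do with the output.

For example, given two different threads of information (such as an email address, the number of emails sent, and the number of emails received), the aim is to merge the data into one output format. These are the input files: email, sent-count and email, received-count. The output should be in this format: email, sent-count, received-count.

Process the incoming files and output the content in different ways so that the files and data can be accessed and generated differently. Then rely on the Reduce function to perform the reduction. In most cases, this is a multistage process:

One stage processes the "sent" emails, outputting the information as email, fake#sent

Note: We use the fake prefix to adjust the ordering, so that the data is collated by the fake prefix rather than by the received prefix. This allows the data to be joined in the fake, implied order.

One stage processes the "received" emails, outputting the information as email, received.

As the Map function reads the files, it generates rows.

Listing 11. Generated rows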

dev@null.org,0#sent
dev@null.org, received

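The Map identifies the input record and outputs a unified version with a key. The Reduce then works through the # structure to decide whether a value carries both counts, which should be merged, or is a plain received value, which is simply summed.

Listing 12. Outputting a unified version with a key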

int sent = 0;
int received = 0;
// Collects the raw values seen for this key
StringBuilder buf = new StringBuilder();
for (Text val : values) {
    String strVal = val.toString();
    buf.append(strVal).append(",");
    if (strVal.contains("#")) {
        String[] tokens = strVal.split("#");
        // If the content contains a hash, assume it's sent and received
        int recvthis = Integer.parseInt(tokens[0]);
        int sentthis = Integer.parseInt(tokens[1]);
        received = received + recvthis;
        sent = sent + sentthis;
    } else {
        // Otherwise, it's just the received value
        received = received + Integer.parseInt(strVal);
    }
}
// Write email -> "sent-count,received-count"; assumes the output value type is Text
context.write(key, new Text(sent + "," + received));

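In this case, we rely on the reduction within Hadoop itself to simplify the output data by key (here, the email address) into the information we need. Because the information is keyed on the email address, the records can easily be merged using the email address as the key.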
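
The whole join can be simulated in plain Java to check the logic before writing a real job. In this sketch the class EmailJoinSketch and its method names are hypothetical, the input lines are assumed to be simple comma-separated pairs, and an in-memory map grouped by email address stands in for Hadoop's shuffle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of the two Map stages and the Reduce.
final class EmailJoinSketch {

    // Map stage for the sent file: "email,sent-count" becomes (email, "0#sent-count").
    static String[] mapSent(String line) {
        String[] fields = line.split(",");
        return new String[] {fields[0].trim(), "0#" + fields[1].trim()};
    }

    // Map stage for the received file: "email,received-count" becomes (email, "received-count").
    static String[] mapReceived(String line) {
        String[] fields = line.split(",");
        return new String[] {fields[0].trim(), fields[1].trim()};
    }

    // Reduce: fold all values for one email into "sent-count,received-count".
    static String reduce(List<String> values) {
        int sent = 0;
        int received = 0;
        for (String value : values) {
            if (value.contains("#")) {
                String[] tokens = value.split("#");
                received += Integer.parseInt(tokens[0]);
                sent += Integer.parseInt(tokens[1]);
            } else {
                received += Integer.parseInt(value);
            }
        }
        return sent + "," + received;
    }

    // Stand-in for the shuffle: group mapped values by email, then reduce each group.
    static Map<String, String> join(List<String> sentLines, List<String> receivedLines) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String line : sentLines) {
            String[] kv = mapSent(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        for (String line : receivedLines) {
            String[] kv = mapReceived(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        Map<String, String> joined = new LinkedHashMap<>();
        grouped.forEach((email, values) -> joined.put(email, reduce(values)));
        return joined;
    }

    public static void main(String[] args) {
        Map<String, String> joined = join(
                Arrays.asList("dev@null.org,5"),
                Arrays.asList("dev@null.org,7"));
        joined.forEach((email, counts) -> System.out.println(email + "," + counts));
    }
}
```

For an address that appears only in the received file, the sent count stays at 0, which mirrors what the 0# placeholder does for sent-only rows.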

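Tricks with keys

Remember that some aspects of the MapReduce process can be used to our advantage. In essence, MapReduce is a two-stage process:

The Map stage accesses the data, picks out the information you need, and then outputs it using a key and the associated information.

The Reduce stage simplifies the data by using the common key to merge, summarize, or count the mapped data into a simpler form.

The key is an important concept because it can be used to format and summarize the data in different ways. For example, if you plan to reduce data about country and city populations, you can output just one key to reduce or summarize the data by country.

Listing 13. Outputting just one key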

France
United Kingdom
USA

To summarize by country and city, the key is a composite of the two.

Listing 14. The key is a composite of country and city

France#Paris
France#Lyon
France#Grenoble
United Kingdom#Birmingham
United Kingdom#London

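This is a basic trick that we can use to our advantage when processing certain types of data (for example, material that shares a common key), because we can use it to simulate pseudo-joins. The trick is also useful when combining blog posts (which have a blogpostid for identification) and blog comments (which have both a blogpostid and a blogcommentid).

To reduce the output (for example, to count the words in the blog posts and comments), we first process the blog posts and the blog comments through Map, but we output a common ID.

Listing 15. Reducing the output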

blogpostid,the,quick,brown,fox
blogpostid#blogcommentid,jumps,over,the,lazy,dog

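This explicitly uses two keys and outputs the information as two different rows. We can also reverse the relationship: by appending the comment ID to each word, we can identify the words from a comment against the blogpostid.

Listing 16. Reversing the relationship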

blogpostid,the,quick,brown,fox,jumps#blogcommentid,over#blogcommentid,
the#blogcommentid,lazy#blogcommentid,dog#blogcommentid

During processing, we can tell from the ID whether a word is attached to the blog post itself or, when it carries a blogcommentid suffix, to a comment.
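
The effect of choosing a simple or a composite key, as in Listings 13 and 14, can be sketched with a small plain-Java aggregator. Everything here is illustrative: the class CompositeKeySketch is hypothetical, the records are assumed to have the form country,city,population, and the population figures are placeholder numbers, not real data.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical sketch: the same records reduced under two different keys.
final class CompositeKeySketch {

    // Sum the population field of each record under the key chosen by keyOf.
    static Map<String, Long> summarize(String[] records, Function<String[], String> keyOf) {
        Map<String, Long> totals = new TreeMap<>();
        for (String record : records) {
            String[] fields = record.split(",");
            totals.merge(keyOf.apply(fields), Long.parseLong(fields[2]), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Placeholder numbers, not real populations.
        String[] records = {"France,Paris,10", "France,Lyon,20", "France,Paris,5", "USA,Boston,7"};

        // Key on the country only: one total per country.
        System.out.println(summarize(records, f -> f[0]));
        // prints {France=35, USA=7}

        // Key on country#city: one total per city within a country.
        System.out.println(summarize(records, f -> f[0] + "#" + f[1]));
        // prints {France#Lyon=20, France#Paris=15, USA#Boston=7}
    }
}
```

Only the key function changes between the two calls; the summing logic is identical, which is why the choice of key does so much of the work in a MapReduce design.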

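Simulating traditional database operations

Hadoop is not really a database in the true sense, partly because we cannot perform updates, deletes, or inserts row by row. Although that is not a problem in many cases (you can dump and load the active data to be processed), there are times when you don't want to export and reload the data.

One trick for avoiding an export and reload is to create a change file that contains a list of differences from the original dump file. For now, we ignore the process of generating that data from SQL or another database. As long as the data has a unique ID, we can use it as a key and take advantage of that key. Consider a source file like the one in Listing 17.

Listing 17. Source file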

1,London
2,Paris
3,New York

Suppose there is a change file like the one in Listing 18.

Listing 18. Change file

1,DELETE
2,UPDATE,Munich
4,INSERT,Tokyo

The end result is the resolved merge of the two files, as shown in Listing 19.

Listing 19. Merge of the source file and the change file

2,Munich
3,New York
4,Tokyo

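How do we achieve such a merge through Hadoop?

One way to achieve this merge with Hadoop is to process the current data and convert it into inserts (because it is all new data to be inserted into the target file), and then convert each UPDATE operation into a DELETE and an INSERT of the new data. In fact, with the change file, this is easier to achieve by modifying it into the content shown in Listing 20.

Listing 20. Achieving the merge through Hadoop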

1,DELETE
2,DELETE
2,INSERT,Munich
4,INSERT,Tokyo

The problem is that we cannot physically merge the two files, but we can process them accordingly. If a row is an original INSERT or DELETE, we output a key with a counter. If it is an UPDATE operation that creates new inserted data, we want a different key that will not be reduced, so we generate an interstitial file like the one in Listing 21.

Listing 21. Generating an interstitial file

1,1,London
2,1,Paris
3,1,New York
1,-1,London
2,-1,Paris
2#NEW,1,Munich
4#NEW,1,Tokyo

During the Reduce, we sum the content of the counter for each unique key, generating Listing 22.

Listing 22. Summing the content of the counter for each unique key

1,0,London
2,0,Paris
3,1,New York
2#NEW,1,Munich
4#NEW,1,Tokyo

We can then run the content through a secondary MapReduce function, using the basic structure shown in Listing 23.

Listing 23. Running the content through a secondary MapReduce function

map:
    if (key contains #NEW):
        emit(row)
    else if (count > 0):
        emit(row)

The secondary MapReduce produces the expected output, as shown in Listing 24.

Listing 24. Expected output of the secondary MapReduce function

3,1,New York
2,1,Munich
4,1,Tokyo

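Figure 2 demonstrates this two-stage process of first formatting and reducing, and then simplifying the output.

Figure 2. Two-stage process of formatting, reducing, and simplifying the output

The raw data is reduced and mapped in the Map and Reduce stages

This process requires more work than in a traditional database, but it provides a solution that needs a much simpler exchange of constantly updated data.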
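
The two stages can be traced end to end with a small plain-Java simulation of Listings 17 to 24. It is a sketch under assumptions rather than a Hadoop job: the class ChangeFileMergeSketch is hypothetical, both files are assumed to be simple comma-separated lines, and the final output drops the counter column and the #NEW suffix so that it can be compared with Listing 19.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of the counter-based merge.
final class ChangeFileMergeSketch {

    // Running state for one key: the summed counter and the last value seen.
    static final class Row {
        int count;
        String value = "";
    }

    // Stage 1 Map and Reduce combined: add delta to the counter for this key.
    static void emit(Map<String, Row> rows, String key, int delta, String value) {
        Row row = rows.computeIfAbsent(key, k -> new Row());
        row.count += delta;
        if (!value.isEmpty()) {
            row.value = value;
        }
    }

    static List<String> merge(List<String> source, List<String> changes) {
        Map<String, Row> rows = new LinkedHashMap<>();
        // Every source row counts as an insert.
        for (String line : source) {
            String[] f = line.split(",");
            emit(rows, f[0], 1, f[1]);
        }
        // An UPDATE becomes a DELETE of the old row plus a #NEW insert.
        for (String line : changes) {
            String[] f = line.split(",");
            switch (f[1]) {
                case "DELETE":
                    emit(rows, f[0], -1, "");
                    break;
                case "UPDATE":
                    emit(rows, f[0], -1, "");
                    emit(rows, f[0] + "#NEW", 1, f[2]);
                    break;
                case "INSERT":
                    emit(rows, f[0] + "#NEW", 1, f[2]);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown operation: " + line);
            }
        }
        // Stage 2: keep #NEW rows and rows whose counter is still positive.
        List<String> out = new ArrayList<>();
        rows.forEach((key, row) -> {
            if (key.contains("#NEW") || row.count > 0) {
                out.add(key.replace("#NEW", "") + "," + row.value);
            }
        });
        return out;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("1,London", "2,Paris", "3,New York");
        List<String> changes = Arrays.asList("1,DELETE", "2,UPDATE,Munich", "4,INSERT,Tokyo");
        merge(source, changes).forEach(System.out::println);
    }
}
```

With the source and change files from Listings 17 and 18, the result contains the rows 3,New York, 2,Munich, and 4,Tokyo, which is the content of Listing 19 in the row order of Listing 24.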

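Conclusion

This article looked at a number of different scenarios for using MapReduce queries. You have seen how powerful these queries are for processing a variety of data, and you should now be able to use these examples in your own MapReduce solutions.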

   