Implementing a Shopping-Mall Recommendation System with Hadoop
 
Author: 潇洒子弦's blog, via 火龙果软件. Published 2014-07-03.
 

1£¬É̳ǣºÊǵ¥É̼ң¬¶àÂò¼ÒµÄÉ̳Çϵͳ¡£Êý¾Ý¿âÊÇmysql£¬ÓïÑÔjava¡£

2. Sqoop 1.99.3: exchanges data between MySQL and Hadoop.

3. Hadoop 2.2.0: pseudo-distributed mode, used here for practice.

4. Deliverables: "people who like this item also like..." and friend recommendations based on shared shopping tastes.

Steps:

1. Use Sqoop to import the "user favorites" table from MySQL into HDFS. (The user-favorites table serves as the business basis for the recommendations; in a real system the basis can be far more complex. Since the point here is the basic principle of a recommender, the input is kept simple.)

2. Implement the recommendation algorithm with MapReduce.

3. Write the recommendation results back to MySQL with Sqoop.

4. The Java mall reads that data to implement the two features: "people who like this item also like..." and friend recommendation by shared shopping tastes (see the JDBC sketch after this list).
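On the mall side, step 4 is ordinary JDBC against the tables Sqoop writes back. Below is a minimal sketch of the "people who like this item also like" lookup; the JDBC URL, table, and column names are assumptions for illustration, not taken from the original project:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class RecommendDao {

    // Hypothetical table written back by Sqoop:
    // product_recommend(product_id, similar_product_id, score)
    public static void printSimilarProducts(int productId) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://localhost:3306/shopxx", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT similar_product_id, score FROM product_recommend "
                 + "WHERE product_id = ? ORDER BY score DESC LIMIT 10")) {
            ps.setInt(1, productId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getInt(1) + "  (co-occurrence " + rs.getInt(2) + ")");
                }
            }
        }
    }
}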

Implementation:

1. The recommender's data source:

On the left are users, on the right products. Every time a user favorites a product, one row like this is generated; both features draw on this one table.
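For illustration (hypothetical rows, since the original figure is not included), the favorites table looks like:

user_id  product_id
1        101
1        102
2        101
2        102
3        101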

Import the data with Sqoop; version 1.99.3 is used here. Documentation for this version is scarce and some errors will come up; if a search turns up nothing, you can mail me at keepmovingzx@163.com.
Create the connection (link) information.

This part is straightforward.

Create the job.

Fill in the fields correctly and it goes through.

To run the import, execute: start job --jid <the ID returned when the job above was created>

The data after a successful import:
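In HDFS the import lands as plain CSV, one line per favorite; the sample rows above would land as:

1,101
1,102
2,101
2,102
3,101

which is exactly what the mapper's split(",") below expects.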

2. Developing the MapReduce programs in Eclipse

ShopxxProductRecommend <people who like this item also like>

The project has two parts: first, group the products by user; second, compute the product co-occurrence matrix.

Part one: taking the data from step 1 (the Sqoop import) as input, group the products by user.

Output:

Part two: taking part one's output as input, compute the product co-occurrence matrix.

Output data:

The first column is the current product, the second a product favorited together with it, and the third the co-occurrence count, which serves as the similarity score (higher means more similar).
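A worked example with the sample rows above: users 1 and 2 both favorite products 101 and 102, while user 3 favorites only 101. Grouping by user gives

1	101,102
2	101,102
3	101

and the pair 101:102 is then emitted once for each user who holds both products, so the matrix rows read 101,102,2 and 102,101,2: products 101 and 102 co-occurred in two users' favorites.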

That is the whole pipeline; the code follows.

package xian.zhang.common;

import java.util.regex.Pattern;

public class Util {
    // Matches the field separators used in the data files: tab or comma.
    public static final Pattern DELIMITER = Pattern.compile("[\t,]");
}

package xian.zhang.core;

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Merges the input rows
 *   userid1,product1
 *   userid1,product2
 *   userid1,product3
 * into a single output line: userid1 <TAB> product1,product2,product3
 * @author zx
 */
public class CombinProductInUser {

    public static class CombinProductMapper extends Mapper<LongWritable, Text, IntWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Each input line is "userid,productid"; key by user so the
            // reducer sees every product that user has favorited.
            String[] items = value.toString().split(",");
            context.write(new IntWritable(Integer.parseInt(items[0])), new Text(items[1]));
        }
    }

    public static class CombinProductReducer extends Reducer<IntWritable, Text, IntWritable, Text> {
        @Override
        protected void reduce(IntWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Join all product ids for this user into one comma-separated list.
            StringBuffer sb = new StringBuffer();
            Iterator<Text> it = values.iterator();
            sb.append(it.next().toString());
            while (it.hasNext()) {
                sb.append(",").append(it.next().toString());
            }
            context.write(key, new Text(sb.toString()));
        }
    }

    @SuppressWarnings("deprecation")
    public static boolean run(Path inPath, Path outPath)
            throws IOException, ClassNotFoundException, InterruptedException {

        Configuration conf = new Configuration();
        Job job = new Job(conf, "CombinProductInUser");

        job.setJarByClass(CombinProductInUser.class);
        job.setMapperClass(CombinProductMapper.class);
        job.setReducerClass(CombinProductReducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, inPath);
        FileOutputFormat.setOutputPath(job, outPath);

        return job.waitForCompletion(true);
    }
}
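One detail worth noting: this mapper splits on a bare comma because the Sqoop export is plain CSV, while the co-occurrence mapper below splits with Util.DELIMITER ("[\t,]") because it must also break apart the key<TAB>value layout this job writes out.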

The second job computes the product co-occurrence matrix; it is structurally identical to the UserCo_occurrenceMatrix class shown later, just run over the product-grouped output.

package xian.zhang.core;

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import xian.zhang.common.Util;

/**
 * Product co-occurrence matrix: counts, for every pair of products,
 * how many users have favorited both.
 * @author zx
 */
public class ProductCo_occurrenceMatrix {

    public static class Co_occurrenceMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        IntWritable one = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Input line: "userid <TAB> product1,product2,..."; index 0 is
            // the userid, so the pair loops start at 1.
            String[] products = Util.DELIMITER.split(value.toString());
            for (int i = 1; i < products.length; i++) {
                for (int j = 1; j < products.length; j++) {
                    if (i != j) {
                        context.write(new Text(products[i] + ":" + products[j]), one);
                    }
                }
            }
        }
    }

    public static class Co_occurrenceReducer extends Reducer<Text, IntWritable, NullWritable, Text> {
        NullWritable nullKey = NullWritable.get();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the 1s emitted for this product pair across all users.
            int sum = 0;
            Iterator<IntWritable> it = values.iterator();
            while (it.hasNext()) {
                sum += it.next().get();
            }
            context.write(nullKey, new Text(key.toString().replace(":", ",") + "," + sum));
        }
    }

    @SuppressWarnings("deprecation")
    public static boolean run(Path inPath, Path outPath)
            throws IOException, ClassNotFoundException, InterruptedException {

        Configuration conf = new Configuration();
        Job job = new Job(conf, "ProductCo_occurrenceMatrix");

        job.setJarByClass(ProductCo_occurrenceMatrix.class);
        job.setMapperClass(Co_occurrenceMapper.class);
        job.setReducerClass(Co_occurrenceReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, inPath);
        FileOutputFormat.setOutputPath(job, outPath);

        return job.waitForCompletion(true);
    }
}
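Since the reducer here only sums counts, a combiner could cut shuffle traffic between map and reduce. The reducer itself cannot double as the combiner (its output types are NullWritable/Text, but a combiner must emit the map output types Text/IntWritable), so a separate summing reducer would be needed; a minimal sketch, not part of the original project:

public static class SumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Pre-aggregate the per-pair counts on the map side.
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

It would be registered in run() with job.setCombinerClass(SumCombiner.class).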

 
package xian.zhang.core;

import java.io.IOException;
import org.apache.hadoop.fs.Path;

public class Main {

    public static void main(String[] args) throws ClassNotFoundException, IOException, InterruptedException {

        if (args.length < 2) {
            throw new IllegalArgumentException("Two arguments are required: the input path and the output path");
        }

        // Job 1 groups products by user; its output feeds job 2.
        Path inPath1 = new Path(args[0]);
        Path outPath1 = new Path(inPath1.getParent() + "/CombinProduct");

        Path inPath2 = outPath1;
        Path outPath2 = new Path(args[1]);

        if (CombinProductInUser.run(inPath1, outPath1)) {
            System.exit(ProductCo_occurrenceMatrix.run(inPath2, outPath2) ? 0 : 1);
        }
    }
}
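With the classes packed into a jar, the whole pipeline runs as a single command (jar name and HDFS paths are hypothetical):

hadoop jar ShopxxProductRecommend.jar xian.zhang.core.Main /user/hadoop/favorite /user/hadoop/productRecommend

The intermediate grouping lands next to the input in /user/hadoop/CombinProduct, since Main derives it from the input path's parent.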

ShopxxUserRecommend <friend recommendation by shared shopping tastes>

The project again has two parts: first, group the users by product; second, compute the user co-occurrence matrix.

The principle is the same as in ShopxxProductRecommend.

The code is below.

package xian.zhang.common;

import java.util.regex.Pattern;

public class Util {
    // Matches the field separators used in the data files: tab or comma.
    public static final Pattern DELIMITER = Pattern.compile("[\t,]");
}

package xian.zhang.core;

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Merges the input rows
 *   userid1,productid1
 *   userid2,productid1
 * into a single output line: productid1 <TAB> user1,user2,user3
 * @author zx
 */
public class CombinUserInProduct {

    public static class CombinUserMapper extends Mapper<LongWritable, Text, IntWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Each input line is "userid,productid"; key by product so the
            // reducer sees every user who favorited it.
            String[] items = value.toString().split(",");
            context.write(new IntWritable(Integer.parseInt(items[1])), new Text(items[0]));
        }
    }

    public static class CombinUserReducer extends Reducer<IntWritable, Text, IntWritable, Text> {
        @Override
        protected void reduce(IntWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Join all user ids for this product into one comma-separated list.
            StringBuffer sb = new StringBuffer();
            Iterator<Text> it = values.iterator();
            sb.append(it.next().toString());
            while (it.hasNext()) {
                sb.append(",").append(it.next().toString());
            }
            context.write(key, new Text(sb.toString()));
        }
    }

    @SuppressWarnings("deprecation")
    public static boolean run(Path inPath, Path outPath)
            throws IOException, ClassNotFoundException, InterruptedException {

        Configuration conf = new Configuration();
        Job job = new Job(conf, "CombinUserInProduct");

        job.setJarByClass(CombinUserInProduct.class);
        job.setMapperClass(CombinUserMapper.class);
        job.setReducerClass(CombinUserReducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, inPath);
        FileOutputFormat.setOutputPath(job, outPath);

        return job.waitForCompletion(true);
    }
}
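With the sample rows from earlier, grouping by product gives 101<TAB>1,2,3 and 102<TAB>1,2: it is the same job as CombinProductInUser with key and value swapped.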

package xian.zhang.core;

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import xian.zhang.common.Util;

/**
 * User co-occurrence matrix: counts, for every pair of users,
 * how many products both have favorited.
 * @author zx
 */
public class UserCo_occurrenceMatrix {

    public static class Co_occurrenceMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        IntWritable one = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Input line: "productid <TAB> user1,user2,..."; index 0 is the
            // productid, so the pair loops start at 1.
            String[] users = Util.DELIMITER.split(value.toString());
            for (int i = 1; i < users.length; i++) {
                for (int j = 1; j < users.length; j++) {
                    if (i != j) {
                        context.write(new Text(users[i] + ":" + users[j]), one);
                    }
                }
            }
        }
    }

    public static class Co_occurrenceReducer extends Reducer<Text, IntWritable, NullWritable, Text> {
        NullWritable nullKey = NullWritable.get();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the 1s emitted for this user pair across all products.
            int sum = 0;
            Iterator<IntWritable> it = values.iterator();
            while (it.hasNext()) {
                sum += it.next().get();
            }
            context.write(nullKey, new Text(key.toString().replace(":", ",") + "," + sum));
        }
    }

    @SuppressWarnings("deprecation")
    public static boolean run(Path inPath, Path outPath)
            throws IOException, ClassNotFoundException, InterruptedException {

        Configuration conf = new Configuration();
        Job job = new Job(conf, "UserCo_occurrenceMatrix");

        job.setJarByClass(UserCo_occurrenceMatrix.class);
        job.setMapperClass(Co_occurrenceMapper.class);
        job.setReducerClass(Co_occurrenceReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, inPath);
        FileOutputFormat.setOutputPath(job, outPath);

        return job.waitForCompletion(true);
    }
}
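Each output row is userA,userB,count, where count is the number of products both users have favorited; with the sample data the row 1,2,2 says users 1 and 2 share two favorites (101 and 102). The mall can then surface, for each user, the other users with the highest counts as recommended friends.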

package xian.zhang.core;

import java.io.IOException;
import org.apache.hadoop.fs.Path;

public class Main {

    public static void main(String[] args) throws ClassNotFoundException, IOException, InterruptedException {

        if (args.length < 2) {
            throw new IllegalArgumentException("Two arguments are required: the input path and the output path");
        }

        // Job 1 groups users by product; its output feeds job 2.
        Path inPath1 = new Path(args[0]);
        Path outPath1 = new Path(inPath1.getParent() + "/CombinUser");

        Path inPath2 = outPath1;
        Path outPath2 = new Path(args[1]);

        if (CombinUserInProduct.run(inPath1, outPath1)) {
            System.exit(UserCo_occurrenceMatrix.run(inPath2, outPath2) ? 0 : 1);
        }
    }
}

The code is available on GitHub:

git@github.com:chaoku/ShopxxProductRecommend.git

   