Implementing the PageRank Algorithm in R

Author: Zhang Dan, Huolongguo Software    Published 2014-08-11
 

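Foreword

Google Search has long been a tool I use every day, and I have marveled countless times at how accurate its results are. I have also been doing SEO for Google to promote my own blog. After a few months of trying, my blog has reached a PR of 2 and has tens of thousands of backlinks. Looking back, I am still amazed by how well PageRank works.

PageRank, the algorithm that changed the world!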

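1. Introduction to the PageRank algorithm

PageRank is Google's proprietary algorithm for measuring how important a given web page is relative to the other pages in the search engine's index. It was invented by Larry Page and Sergey Brin in the late 1990s. PageRank put the notion of link value into practice as a ranking factor.

PageRank lets links "vote"

A page's "vote count" is determined by the importance of all the pages that link to it; a hyperlink to a page counts as one vote for that page. A page's PageRank is computed recursively from the importance of all the pages linking to it (its "inbound pages"). A page with many inbound links gets a higher rank; conversely, a page with no inbound links gets no rank.

In one sentence: a page that is linked to from many high-quality pages is itself considered a high-quality page.

The PageRank computation rests on two basic assumptions:

1. Quantity assumption: the more inbound links a page node receives from other pages, the more important the page is.

2. Quality assumption: the inbound links pointing to page A differ in quality. High-quality pages pass more weight through their links to other pages, so the higher the quality of the pages pointing to A, the more important A is.

Three things matter for raising a page's PageRank:

1. The number of backlinks

2. Whether the backlinks come from pages with a high PageRank

3. The number of links on the pages the backlinks come from (the fewer links a source page has, the more weight each one passes on)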

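2. How the PageRank algorithm works

Initial stage: the pages and the links between them form a directed graph, and every page is assigned the same PageRank value. After a number of rounds of computation, each page arrives at its final PageRank value. With every round, the current PageRank value of each page is updated.

In one round of updating, each page splits its current PageRank value evenly across the outbound links it contains, so every link carries a corresponding weight. Each page then sums the weights carried by all the inbound links pointing to it, which gives its new PageRank score. Once every page has its updated value, one round of the PageRank computation is complete.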
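The update round described above can be sketched directly from a list of links. The edge list is the four-page example used later in this article; `update_round` is an illustrative name, not part of the original code, and no damping is applied.

```r
# Edge list of the four-page example: src[k] links to dst[k].
src <- c(1, 1, 1, 2, 2, 3, 4)
dst <- c(2, 3, 4, 3, 4, 4, 2)
n <- 4

# Hypothetical helper: one synchronous update round, without damping.
update_round <- function(pr) {
  out_degree <- tabulate(src, nbins = n)   # number of outbound links per page
  share <- pr[src] / out_degree[src]       # weight carried by each link
  # Each page sums the weights arriving over its inbound links.
  vapply(seq_len(n), function(i) sum(share[dst == i]), numeric(1))
}

pr0 <- rep(1 / n, n)       # every page starts with the same value
pr1 <- update_round(pr0)
print(pr1)                 # 0, 1/3, 5/24, 11/24
```

The total weight is conserved here (the four values still sum to 1) because every page in this example has at least one outbound link.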

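1). Principle

PageRank is built on the random surfer model. The basic idea is that the importance ranking of pages is determined by the link relationships among them: the algorithm relies on the link structure to assess the rank and importance of each page. A page's PR value takes into account not only the number of pages linking to it, but also the importance of those linking pages themselves.

PageRank has two key properties:

Transitivity of PR value: when page A links to page B, part of A's PR value is passed on to B.

Transitivity of importance: an important page passes on more weight than an unimportant one.

2). Formula: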
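Written in LaTeX, the standard damped PageRank recurrence that matches the symbols defined in this subsection, and the `dProbabilityMatrix` function later in this article, is:

```latex
PR(p_i) = \frac{1-d}{n} + d \sum_{p_j \in M(i)} \frac{PR(p_j)}{L(j)}
```

The first term is the share every page receives from random jumps; the sum collects the weight passed along each inbound link.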

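PR(pi): the PageRank value of page pi

n: the total number of pages

pi: the individual pages p1, p2, p3, ...

M(i): the set of pages that link to pi

L(j): the number of outbound links on page pj

d: the damping factor, i.e. the probability that, at any moment, a user who has reached a page keeps browsing by following a link.

1-d (= 0.15): the probability that the user stops clicking and jumps to a random new URL

Range: 0 < d ≤ 1; Google sets d to 0.85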
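As a worked instance of these definitions, the sketch below evaluates one damped update for a single page. The link sets come from the four-page example introduced in the next subsection; the helper name `pr_update` is made up for illustration.

```r
# In-link sets M(i) and out-link counts L(j) for the four-page example
# (page 1 -> 2,3,4; page 2 -> 3,4; page 3 -> 4; page 4 -> 2).
M <- list(integer(0), c(1, 4), c(1, 2), c(1, 2, 3))
L <- c(3, 2, 1, 1)

# Hypothetical helper: new value of page i given the current values pr.
pr_update <- function(i, pr, d = 0.85) {
  n <- length(pr)
  (1 - d) / n + d * sum(pr[M[[i]]] / L[M[[i]]])
}

pr <- rep(1 / 4, 4)          # equal starting values
print(pr_update(1, pr))      # no inbound links: (1 - d) / n = 0.0375
print(pr_update(3, pr))      # 0.0375 + 0.85 * (0.25/3 + 0.25/2)
```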

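3). Building an example: data for 4 pages

Figure description:

Page ID=1 links to pages 2, 3 and 4, so a user on page 1 jumps to each of 2, 3 and 4 with probability 1/3.

Page ID=2 links to pages 3 and 4, so a user on page 2 jumps to each of 3 and 4 with probability 1/2.

Page ID=3 links to page 4, so a user on page 3 jumps to page 4 with probability 1.

Page ID=4 links to page 2, so a user on page 4 jumps to page 2 with probability 1.

Build the adjacency list:

Link source page     Link target pages
1                    2,3,4
2                    3,4
3                    4
4                    2

Build the adjacency matrix (a square matrix):

Columns: source pages

Rows: target pages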

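     [,1] [,2] [,3] [,4]
[1,]    0    0    0    0
[2,]    1    0    0    1
[3,]    1    1    0    0
[4,]    1    1    1    0

Convert it to a probability matrix (transition matrix) by dividing each column by its number of outbound links:

     [,1] [,2] [,3] [,4]
[1,]    0    0    0    0
[2,]  1/3    0    0    1
[3,]  1/3  1/2    0    0
[4,]  1/3  1/2    1    0

From the link relationships we have thus constructed the "transition matrix"; each of its columns sums to 1.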
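Both matrices can be reproduced in a few lines. This is a vectorized sketch, not the loop-based functions of the next section, and it assumes every page has at least one outbound link, as is the case in this example.

```r
# Edge list of the example: src[k] links to dst[k].
src <- c(1, 1, 1, 2, 2, 3, 4)
dst <- c(2, 3, 4, 3, 4, 4, 2)
n <- max(src, dst)

A <- matrix(0, n, n)
A[cbind(dst, src)] <- 1             # rows = target pages, columns = source pages

G <- sweep(A, 2, colSums(A), "/")   # divide each column by its out-link count
print(A)
print(G)
print(colSums(G))                   # every column sums to 1
```

A page without outbound links would give a zero column sum and a division by zero here; the article's own functions guard against that by replacing zero sums with 1.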

3. Single-machine implementation in R

Create the data file page.csv (one link per line: source page, target page):

1,2
1,3
1,4
2,3
2,4
3,4
4,2
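To follow along, the file can be created from R itself. A small sketch; the temporary location is an assumption made here so that nothing in the working directory is overwritten, and setting `path` to "page.csv" instead matches the session that follows.

```r
# The seven links of the example, one "source,target" pair per line.
edges <- c("1,2", "1,3", "1,4", "2,3", "2,4", "3,4", "4,2")

# Assumption: write to a temporary directory; use "page.csv" to match the article.
path <- file.path(tempdir(), "page.csv")
writeLines(edges, path)

# Read it back with the column names used in the article.
pages <- read.table(path, header = FALSE, sep = ",", col.names = c("src", "dist"))
print(pages)
```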

PageRank is implemented below in 3 ways:

1. Without the damping factor

2. With the damping factor

3. Directly with R's eigen function

1). Without the damping factor

R implementation

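# Build the adjacency matrix: A[target, source] = 1 for every link
adjacencyMatrix<-function(pages){
  n<-max(apply(pages,2,max))      # highest page ID = number of pages
  A <- matrix(0,n,n)
  for(i in 1:nrow(pages)) A[pages[i,]$dist,pages[i,]$src]<-1
  A
}

# Convert to the probability (transition) matrix:
# divide every column by its number of outbound links
probabilityMatrix<-function(G){
  cs <- colSums(G)
  cs[cs==0] <- 1                  # avoid dividing by zero for pages without outbound links
  n <- nrow(G)
  A <- matrix(0,nrow(G),ncol(G))
  for (i in 1:n) A[i,] <- A[i,] + G[i,]/cs
  A
}

# Approximate the dominant eigenvector by repeated multiplication (power iteration)
eigenMatrix<-function(G,iter=100){
  iter<-10                        # NOTE: overrides the argument, so every call runs exactly
                                  # 10 iterations; delete this line to honor iter
  n<-nrow(G)
  x <- rep(1,n)
  for (i in 1:iter) x <- G %*% x
  x/sum(x)                        # normalize so the ranks sum to 1
}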

> pages<-read.table(file="page.csv",header=FALSE,sep=",")
> names(pages)<-c("src","dist");pages
src dist
1 1 2
2 1 3
3 1 4
4 2 3
5 2 4
6 3 4
7 4 2

> A<-adjacencyMatrix(pages);A
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 1 0 0 1
[3,] 1 1 0 0
[4,] 1 1 1 0

> G<-probabilityMatrix(A);G
[,1] [,2] [,3] [,4]
[1,] 0.0000000 0.0 0 0
[2,] 0.3333333 0.0 0 1
[3,] 0.3333333 0.5 0 0
[4,] 0.3333333 0.5 1 0

> q<-eigenMatrix(G,100);q
[,1]
[1,] 0.0000000
[2,] 0.4036458
[3,] 0.1979167
[4,] 0.3984375
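The fixed number of iterations can be replaced by a stopping rule. In the sketch below, the function name `power_iterate`, the tolerance and the iteration cap are illustrative choices, not part of the original code; the matrix is the undamped transition matrix of the example.

```r
# Undamped transition matrix of the example, filled column by column.
G <- matrix(c(0, 1/3, 1/3, 1/3,
              0, 0,   1/2, 1/2,
              0, 0,   0,   1,
              0, 1,   0,   0), nrow = 4)

# Hypothetical helper: iterate until the ranking changes by less than tol.
power_iterate <- function(G, tol = 1e-10, max_iter = 1000) {
  x <- rep(1 / nrow(G), nrow(G))
  for (k in seq_len(max_iter)) {
    x_new <- as.numeric(G %*% x)
    x_new <- x_new / sum(x_new)
    if (sum(abs(x_new - x)) < tol) return(list(rank = x_new, iterations = k))
    x <- x_new
  }
  list(rank = x, iterations = max_iter)
}

res <- power_iterate(G)
print(res$rank)
print(res$iterations)

# For comparison: ten plain multiplications, as in the session above.
x10 <- rep(1, 4)
for (i in 1:10) x10 <- as.numeric(G %*% x10)
print(x10 / sum(x10))
```

On this matrix the ranking settles at 0, 0.4, 0.2, 0.4, while ten multiplications give 0, 0.4036458, 0.1979167, 0.3984375, the values printed in the session.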

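Reading the results:

Page ID=1 has a PR value of 0, because no page links to it.

Page ID=2 has a PR value of about 0.40, marginally the highest after these 10 iterations (at convergence pages 2 and 4 tie at 0.4). Pages 1 and 4 both link to 2; page 4 carries a high weight and has only one link, which points to 2, so none of its weight is diluted.

Page ID=3 has a PR value of about 0.20. Pages 1 and 2 both link to 3, but they also link to other pages, so their weight is split; the PR of page 3 is therefore not high.

Page ID=4 has a PR value of about 0.40, which is high because pages 1, 2 and 3 all link to it.

These results show that page ID=1 has a PR value of 0, so it cannot pass any weight on to other pages, which makes the computation unreasonable. The damping factor d is therefore introduced to correct pages that no link points to, guaranteeing every page a minimum PR value greater than 0.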

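2). With the damping factor

Add the function dProbabilityMatrix

# Convert to the probability matrix, taking the damping factor d into account
dProbabilityMatrix<-function(G,d=0.85){
  cs <- colSums(G)
  cs[cs==0] <- 1                  # avoid dividing by zero for pages without outbound links
  n <- nrow(G)
  delta <- (1-d)/n                # random-jump share given to every page
  A <- matrix(delta,nrow(G),ncol(G))
  for (i in 1:n) A[i,] <- A[i,] + d*G[i,]/cs
  A
}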

> pages<-read.table(file="page.csv",header=FALSE,sep=",")
> names(pages)<-c("src","dist");pages
src dist
1 1 2
2 1 3
3 1 4
4 2 3
5 2 4
6 3 4
7 4 2

> A<-adjacencyMatrix(pages);A
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 1 0 0 1
[3,] 1 1 0 0
[4,] 1 1 1 0

> G<-dProbabilityMatrix(A);G
[,1] [,2] [,3] [,4]
[1,] 0.0375000 0.0375 0.0375 0.0375
[2,] 0.3208333 0.0375 0.0375 0.8875
[3,] 0.3208333 0.4625 0.0375 0.0375
[4,] 0.3208333 0.4625 0.8875 0.0375

> q<-eigenMatrix(G,100);q
[,1]
[1,] 0.0375000
[2,] 0.3738930
[3,] 0.2063759
[4,] 0.3822311

With the damping factor added, page ID=1 now has a value: PR(1) = (1-d)/n = (1-0.85)/4 = 0.0375, which is the minimum value, taken by any page with no inbound links.
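The damped matrix can also be written in closed form as d * G + (1 - d) / n, which makes this floor easy to see: the row of a page with no inbound links is constant at (1 - d) / n. A sketch under the assumption, true for this example, that every page has at least one outbound link:

```r
d <- 0.85
n <- 4

# Undamped transition matrix of the example, filled column by column.
G <- matrix(c(0, 1/3, 1/3, 1/3,
              0, 0,   1/2, 1/2,
              0, 0,   0,   1,
              0, 1,   0,   0), nrow = n)

Gd <- d * G + (1 - d) / n      # same entries as the damped matrix printed above
print(Gd)
print(colSums(Gd))             # still 1 in every column

# Iterate well past convergence; x keeps summing to 1 throughout.
x <- rep(1 / n, n)
for (k in 1:200) x <- as.numeric(Gd %*% x)
print(x)                       # x[1] equals (1 - d) / n
```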

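3). Directly with R's eigen function

Add the function calcEigenMatrix

# Compute the dominant eigenvector directly
calcEigenMatrix<-function(G){
  x <- Re(eigen(G)$vectors[,1])   # eigenvector of the largest-modulus eigenvalue
  x/sum(x)                        # normalize so the ranks sum to 1
}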

> pages<-read.table(file="page.csv",header=FALSE,sep=",")
> names(pages)<-c("src","dist");pages
src dist
1 1 2
2 1 3
3 1 4
4 2 3
5 2 4
6 3 4
7 4 2

> A<-adjacencyMatrix(pages);A
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 1 0 0 1
[3,] 1 1 0 0
[4,] 1 1 1 0

> G<-dProbabilityMatrix(A);G
[,1] [,2] [,3] [,4]
[1,] 0.0375000 0.0375 0.0375 0.0375
[2,] 0.3208333 0.0375 0.0375 0.8875
[3,] 0.3208333 0.4625 0.0375 0.0375
[4,] 0.3208333 0.4625 0.8875 0.0375

> q<-calcEigenMatrix(G);q
[1] 0.0375000 0.3732476 0.2067552 0.3824972

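Computing the eigenvector directly avoids the explicit loop and returns the converged ranking in a single call, which is efficient for a small matrix like this one.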
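One way to check such a result is to confirm that the vector belongs to eigenvalue 1 and is left unchanged by the matrix. The sketch below rebuilds the damped matrix of the example and selects the dominant eigenvalue explicitly with `which.max(Mod(...))`, a defensive choice made here, not something the original function does.

```r
# Damped transition matrix of the example (d = 0.85, n = 4).
G <- matrix(c(0, 1/3, 1/3, 1/3,
              0, 0,   1/2, 1/2,
              0, 0,   0,   1,
              0, 1,   0,   0), nrow = 4)
Gd <- 0.85 * G + 0.15 / 4

e <- eigen(Gd)
k <- which.max(Mod(e$values))      # index of the largest-modulus eigenvalue
lambda <- Re(e$values[k])
x <- Re(e$vectors[, k])
x <- x / sum(x)                    # normalize; this also fixes the sign

print(lambda)                                     # 1 up to rounding error
residual <- max(abs(as.numeric(Gd %*% x) - x))    # how far x is from a fixed point
print(residual)
print(x)
```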

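Once the principle behind PageRank is understood, building a PageRank model in R is very easy. In practice we also prefer to build the model in a simple way first and, once it has been validated, reimplement it in another language for production use.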

   