DBSCAN£¨Density-Based Spatial Clustering
of Applications with Noise£©¾ÛÀàËã·¨£¬ËüÊÇÒ»ÖÖ»ùÓÚ¸ßÃܶÈÁ¬Í¨ÇøÓòµÄ¡¢»ùÓÚÃܶȵľÛÀàËã·¨£¬Äܹ»½«¾ßÓÐ×ã¹»¸ßÃܶȵÄÇøÓò»®·ÖΪ´Ø£¬²¢ÔÚ¾ßÓÐÔëÉùµÄÊý¾ÝÖз¢ÏÖÈÎÒâÐÎ×´µÄ´Ø¡£ÎÒÃÇ×ܽáÒ»ÏÂDBSCAN¾ÛÀàËã·¨ÔÀíµÄ»ù±¾Òªµã£º
DBSCANËã·¨ÐèҪѡÔñÒ»ÖÖ¾àÀë¶ÈÁ¿£¬¶ÔÓÚ´ý¾ÛÀàµÄÊý¾Ý¼¯ÖУ¬ÈÎÒâÁ½¸öµãÖ®¼äµÄ¾àÀ룬·´Ó³Á˵ãÖ®¼äµÄÃܶȣ¬ËµÃ÷Á˵ãÓëµãÊÇ·ñÄܹ»¾Ûµ½Í¬Ò»ÀàÖС£ÓÉÓÚDBSCANËã·¨¶Ô¸ßάÊý¾Ý¶¨ÒåÃܶȺÜÀ§ÄÑ£¬ËùÒÔ¶ÔÓÚ¶þά¿Õ¼äÖеĵ㣬¿ÉÒÔʹÓÃÅ·¼¸ÀïµÂ¾àÀëÀ´½øÐжÈÁ¿¡£
DBSCANËã·¨ÐèÒªÓû§ÊäÈë2¸ö²ÎÊý£ºÒ»¸ö²ÎÊýÊǰ뾶£¨Eps£©£¬±íʾÒÔ¸ø¶¨µãPΪÖÐÐĵÄÔ²ÐÎÁÚÓòµÄ·¶Î§£»ÁíÒ»¸ö²ÎÊýÊÇÒÔµãPΪÖÐÐĵÄÁÚÓòÄÚ×îÉÙµãµÄÊýÁ¿£¨MinPts£©¡£Èç¹ûÂú×㣺ÒÔµãPΪÖÐÐÄ¡¢°ë¾¶ÎªEpsµÄÁÚÓòÄڵĵãµÄ¸öÊý²»ÉÙÓÚMinPts£¬Ôò³ÆµãPΪºËÐĵ㡣
DBSCAN¾ÛÀàʹÓõ½Ò»¸ök-¾àÀëµÄ¸ÅÄk-¾àÀëÊÇÖ¸£º¸ø¶¨Êý¾Ý¼¯P={p(i);
i=0,1,¡n}£¬¶ÔÓÚÈÎÒâµãP(i)£¬¼ÆËãµãP(i)µ½¼¯ºÏDµÄ×Ó¼¯S={p(1), p(2), ¡,
p(i-1), p(i+1), ¡, p(n)}ÖÐËùÓеãÖ®¼äµÄ¾àÀ룬¾àÀë°´ÕÕ´ÓСµ½´óµÄ˳ÐòÅÅÐò£¬¼ÙÉèÅÅÐòºóµÄ¾àÀ뼯ºÏΪD={d(1),
d(2), ¡, d(k-1), d(k), d(k+1), ¡,d(n)}£¬Ôòd(k)¾Í±»³ÆÎªk-¾àÀë¡£Ò²¾ÍÊÇ˵£¬k-¾àÀëÊǵãp(i)µ½ËùÓе㣨³ýÁËp(i)µã£©Ö®¼ä¾àÀëµÚk½üµÄ¾àÀë¡£¶Ô´ý¾ÛÀ༯ºÏÖÐÿ¸öµãp(i)¶¼¼ÆËãk-¾àÀ룬×îºóµÃµ½ËùÓеãµÄk-¾àÀ뼯ºÏE={e(1),
e(2), ¡, e(n)}¡£
¸ù¾Ý¾Ñ鼯Ëã°ë¾¶Eps£º¸ù¾ÝµÃµ½µÄËùÓеãµÄk-¾àÀ뼯ºÏE£¬¶Ô¼¯ºÏE½øÐÐÉýÐòÅÅÐòºóµÃµ½k-¾àÀ뼯ºÏE¡¯£¬ÐèÒªÄâºÏÒ»ÌõÅÅÐòºóµÄE¡¯¼¯ºÏÖÐk-¾àÀëµÄ±ä»¯ÇúÏßͼ£¬È»ºó»æ³öÇúÏߣ¬Í¨¹ý¹Û²ì£¬½«¼±¾ç·¢Éú±ä»¯µÄλÖÃËù¶ÔÓ¦µÄk-¾àÀëµÄÖµ£¬È·¶¨Îª°ë¾¶EpsµÄÖµ¡£
¸ù¾Ý¾Ñ鼯Ëã×îÉÙµãµÄÊýÁ¿MinPts£ºÈ·¶¨MinPtsµÄ´óС£¬Êµ¼ÊÉÏÒ²ÊÇÈ·¶¨k-¾àÀëÖÐkµÄÖµ£¬DBSCANË㷨ȡk=4£¬ÔòMinPts=4¡£
ÁíÍ⣬Èç¹û¾õµÃ¾ÑéÖµ¾ÛÀàµÄ½á¹û²»ÂúÒ⣬¿ÉÒÔÊʵ±µ÷ÕûEpsºÍMinPtsµÄÖµ£¬¾¹ý¶à´Îµü´ú¼ÆËã¶Ô±È£¬Ñ¡Ôñ×îºÏÊʵIJÎÊýÖµ¡£¿ÉÒÔ¿´³ö£¬Èç¹ûMinPts²»±ä£¬EpsÈ¡µÃÖµ¹ý´ó£¬»áµ¼Ö´ó¶àÊýµã¶¼¾Ûµ½Í¬Ò»¸ö´ØÖУ¬Eps¹ýС£¬»áµ¼ÖÂÒ»¸ö´ØµÄ·ÖÁÑ£»Èç¹ûEps²»±ä£¬MinPtsµÄֵȡµÃ¹ý´ó£¬»áµ¼ÖÂͬһ¸ö´ØÖе㱻±ê¼ÇΪÀëȺµã£¬MinPts¹ýС£¬»áµ¼Ö·¢ÏÖ´óÁ¿µÄºËÐĵ㡣
ÎÒÃÇÐèÒªÖªµÀµÄÊÇ£¬DBSCANËã·¨£¬ÐèÒªÊäÈë2¸ö²ÎÊý£¬ÕâÁ½¸ö²ÎÊýµÄ¼ÆËã¶¼À´×Ô¾Ñé֪ʶ¡£°ë¾¶EpsµÄ¼ÆËãÒÀÀµÓÚ¼ÆËãk-¾àÀ룬DBSCANÈ¡k=4£¬Ò²¾ÍÊÇÉèÖÃMinPts=4£¬È»ºóÐèÒª¸ù¾Ýk-¾àÀëÇúÏߣ¬¸ù¾Ý¾Ñé¹Û²ìÕÒµ½ºÏÊʵİ뾶EpsµÄÖµ£¬ÏÂÃæµÄË㷨ʵÏÖ¹ý³ÌÖУ¬ÎÒÃÇ»áÏêϸ˵Ã÷¡£
¶ÔÓÚËã·¨µÄʵÏÖ£¬Ê×ÏÈÎÒÃǸÅÒªµØÃèÊöÒ»ÏÂʵÏֵĹý³Ì£º
½âÎöÑù±¾Êý¾ÝÎļþ
¼ÆËãÿ¸öµãÓëÆäËûËùÓеãÖ®¼äµÄÅ·¼¸ÀïµÂ¾àÀë
¼ÆËãÿ¸öµãµÄk-¾àÀëÖµ£¬²¢¶ÔËùÓеãµÄk-¾àÀ뼯ºÏ½øÐÐÉýÐòÅÅÐò£¬Êä³öµÄÅÅÐòºóµÄk-¾àÀëÖµ
½«ËùÓеãµÄk-¾àÀëÖµ£¬ÔÚExcelÖÐÓÃÉ¢µãͼÏÔʾk-¾àÀë±ä»¯Ç÷ÊÆ
¸ù¾ÝÉ¢µãͼȷ¶¨°ë¾¶EpsµÄÖµ
¸ù¾Ý¸ø¶¨MinPts=4£¬ÒÔ¼°°ë¾¶EpsµÄÖµ£¬¼ÆËãËùÓкËÐĵ㣬²¢½¨Á¢ºËÐĵãÓëµ½ºËÐĵã¾àÀëСÓÚ°ë¾¶EpsµÄµãµÄÓ³Éä
¸ù¾ÝµÃµ½µÄºËÐĵ㼯ºÏ£¬ÒÔ¼°°ë¾¶EpsµÄÖµ£¬¼ÆËãÄܹ»Á¬Í¨µÄºËÐĵ㣬²¢µÃµ½ÀëȺµã
½«Äܹ»Á¬Í¨µÄÿһ×éºËÐĵ㣬ÒÔ¼°µ½ºËÐĵã¾àÀëСÓÚ°ë¾¶EpsµÄµã£¬¶¼·Åµ½Ò»Æð£¬ÐγÉÒ»¸ö´Ø
Ñ¡Ôñ²»Í¬µÄ°ë¾¶Eps£¬Ê¹ÓÃDBSCANËã·¨¾ÛÀàµÃµ½µÄÒ»×é´Ø¼°ÆäÀëȺµã£¬Ê¹ÓÃÉ¢µãͼ¶Ô±È¾ÛÀàЧ¹û
È»ºó£¬ÔÙÏêϸÃèÊö¾ÛÀà¹ý³ÌµÄ¾ßÌåʵÏÖ¡£
¼ÆËãÅ·¼¸ÀïµÂ¾àÀë
ÎÒÃÇʹÓõÄÑù±¾Êý¾Ý£¬À´×ÔÒ»×é¾Î³¶È×ø±êÊý¾Ý£¬Êý¾ÝÎļþ¸ñʽÀàËÆÈçÏÂËùʾ£º
116.389463 39.87194 02 116.389463 39.874577 03 116.312984 39.887419 04 116.382798 39.853576 05 116.496648 39.872999 06 116.436246 39.911165 07 116.622074 40.061037 08 116.599267 40.062485 09 116.441824 39.940168 10 116.599267 40.062485 11 116.402096 39.942057 12 116.37319 39.93428 13 116.327812 39.899396 14 116.374739 39.898751 15 116.287195 39.959335 16 116.513574 39.878222 17 116.474355 39.962825 18 116.400651 40.008559 19 ... ...
|
ÎÒÃÇÐèÒª×öµÄÊ×ÏȾÍÊÇ£¬½âÎöÑù±¾Êý¾ÝÎļþ£¬½«Æäת»»³ÉÎÒÃÇÐèÒªµÄ±íʾÐÎʽ£¬ÎÒÃǶ¨ÒåÁËPoint2DÀ࣬´úÂëÈçÏÂËùʾ£º
01 package org.shirdrn.dm.clustering.common; 02 03 public class Point2D { 04 05 protected final Double x; 06 protected final Double y; 07 08 public Point2D(Double x, Double y) { 09 super(); 10 this.x = x; 11 this.y = y; 12 } 13 14 @Override 15 public int hashCode() { 16 return 31 * x.hashCode() + 31 * y.hashCode(); 17 } 18 19 @Override 20 public boolean equals(Object obj) { 21 Point2D other = (Point2D) obj; 22 return this.x.doubleValue() == other.x.doubleValue() && this.y.doubleValue() == other.y.doubleValue(); 23 } 24 25 public Double getX() { 26 return x; 27 } 28 29 public Double getY() { 30 return y; 31 } 32 33 @Override 34 public String toString() { 35 return "(" + x + ", " + y + ")"; 36 } 37 38 }
|
ÎÒÃÇ¿ÉÒÔ½«½âÎöºóµÄµãµÄ¶ÔÏó·Åµ½Ò»¸öList<Point2D> allPoints¼¯ºÏÀïÃæ£¬ÒÔ±ãºóÐøÊ¹ÓÃʱµü´ú¼¯ºÏ¡£ÔÚ¼ÆËãÁ½µãÖ®¼äµÄÅ·¼¸ÀïµÂ¾àÀëʱ£¬ÐèÒªµü´úÇ°ÃæÉú³ÉµÄPoint2DµÄ¼¯ºÏ£¬¼ÆËãÅ·¼¸ÀïµÂ¾àÀ룬ʵÏÖ·½·¨ÈçÏÂËùʾ£º
public static double euclideanDistance(Point2D p1, Point2D p2) { 2 double sum = 0.0; 3 double diffX = p1.getX() - p2.getX(); 4 double diffY = p1.getY() - p2.getY(); 5 sum += diffX * diffX + diffY * diffY; 6 return Math.sqrt(sum); 7 }
|
Èç¹ûÐèÒª£¬¿ÉÒÔ½«¼ÆËãµÄËùÓеãÖ®¼äµÄ¾àÀ뻺´æ£¬ÒòΪ¼ÆËãk-¾àÀëÐèÒª¶à´Î·ÃÎʵãµÄÅ·¼¸ÀïµÂ¾àÀ룬±ÈÈ磬¿ÉÒÔʹÓÃGuava¿âÖеÄCache¹¤¾ß£º
Cache<Set<Point2D>, Double> distanceCache = CacheBuilder.newBuilder().maximumSize(Integer.MAX_VALUE).build();
|
ÉÏÃæ´úÂëÖУ¬ÉèÖûº´æÈÝÄÉ×ã¹»¶à£¨Integer.MAX_VALUE£©µÄ¶ÔÏ󣬽«¼ÆËã³öµÄÈ«²¿µÄÅ·¼¸ÀïµÂ¾àÀë·ÅÔÚÄÚ´æÖУ¬±ãÓÚºóÐøµü´úÊ±ÖØÓá£
¼ÆËãk-¾àÀë
ÿ¸öµã¶¼Òª¼ÆËãk-¾àÀ룬ÔÚ¼ÆËãÒ»¸öµãµÄk-¾àÀëµÄʱºò£¬Ê×ÏÈÒª¼ÆËã¸Ãµãµ½ÆäËûËùÓеãµÄÅ·¼¸ÀïµÂ¾àÀ룬°´ÕÕ¾àÀëÉýÐòÅÅÐòºó£¬Ñ¡ÔñµÚkСµÄ¾àÀë×÷Ϊk-¾àÀëµÄÖµ£¬ÊµÏÖ´úÂëÈçÏÂËùʾ£º
Task task = q.poll(); // ´Ó¶ÓÁÐqÖÐÈ¡³öÒ»¸öTask£¬¾ÍÊǼÆËãÒ»¸öµãµÄk-¾àÀëµÄÈÎÎñ 02 KPoint2D p1 = (KPoint2D) task.p; 03 final TreeSet<Double> sortedDistances = Sets.newTreeSet(new Comparator<Double>() { // ´´½¨Ò»¸ö½µÐòÅÅÐòTreeSet 04 05 @Override 06 public int compare(Double o1, Double o2) { 07 double diff = o1 - o2; 08 if(diff > 0) { 09 return -1; 10 } 11 if(diff < 0) { 12 return 1; 13 } 14 return 0; 15 } 16 17 }); 18 for (int i = 0; i < allPoints.size(); i++) { // ¼ÆËãµãp1ÓëallPoints¼¯ºÏÖÐÿ¸öµãµÄk-¾àÀë 19 if(task.pos != i) { // µãp1ÓëËü×Ô¼ºµÄÅ·¼¸ÀïµÂ¾àÀëû±ØÒª¼ÆËã 20 KPoint2D p2 = (KPoint2D) allPoints.get(i); 21 Set<Point2D> set = Sets.newHashSet((Point2D) p1, (Point2D) p2); 22 Double distance = distanceCache.getIfPresent(set); // ´Ó»º´æÖÐÈ¡³öÅ·¼¸ÀïµÂ¾àÀ루¿ÉÄܲ»´æÔÚ£© 23 if(distance == null) { 24 distance = MetricUtils.euclideanDistance(p1, p2); 25 distanceCache.put(set, distance); // ²»´æÔÚÔò¼ÓÈ뻺´æÖÐ 26 } 27 if(!sortedDistances.contains(distance)) { 28 sortedDistances.add(distance); 29 } 30 if(sortedDistances.size() > k) { // TreeSetÖÐÖ»×î¶à±£Áôk¸öÅ·¼¸ÀïµÂ¾àÀëÖµ 31 Iterator<Double> iter = sortedDistances.iterator(); 32 iter.next(); 33 // remove (k+1)th minimum distance 34 iter.remove(); // ½«k+1¸ö¾àÀëÖµÖÐ×î´óµÄɾ³ý£¬Ê£Óàk¸öÊÇ×îСµÄ 35 } 36 } 37 } 38 39 // collect k-distance 40 p1.kDistance = sortedDistances.iterator().next(); // ´Ëʱ£¬TreeSetÖÐ×î´óµÄ£¬¾ÍÊǵÚk×îСµÄ¾àÀë
|
ÉÏÊö´úÂëÖУ¬KPoint2DÀàÊÇPoint2DµÄ×ÓÀ࣬²»¹ý±È»ùÀàPoint2D¶àÁËÒ»¸ök-¾àÀëµÄÊôÐÔ£¬´úÂëÈçÏÂËùʾ£º
private class KPoint2D extends Point2D { 02 03 private Double kDistance = 0.0; 04 05 public KPoint2D(Double x, Double y) { 06 super(x, y); 07 } 08 09 @Override 10 public int hashCode() { 11 return super.hashCode(); 12 } 13 14 @Override 15 public boolean equals(Object obj) { 16 return super.equals(obj); 17 } 18 19 }
|
´úÂë±È½ÏÈÝÒ×£¬¿ÉÒԲ鿴Ïà¹Ø×¢ÊÍÐÅÏ¢¡£
»æÖÆk-¾àÀëÇúÏߣ¬Ñ°ÕÒ°ë¾¶Eps
xÖá×ø±êµãÎÒÃÇÖ±½ÓʹÓõÝÔöµÄ×ÔÈ»ÊýÐòÁУ¬Ã¿¸öµã¶ÔÓ¦Ò»¸ö×ÔÈ»Êý£¬yÖá¾ÍÊÇËùÓеãµÄk-¾àÀëµÄ´óС£¬ÎÒÃÇÔÚ´úÂëÖпÉÒÔ½øÐд¦Àí£¬ÊµÏÖÈçÏ£º
for(int i=0; i<allPoints.size(); i++) { 2 KPoint2D kp = (KPoint2D) allPoints.get(i); 3 System.out.println(i + "\t" + kp.kDistance); 4 }
|
×îÖÕÉú³ÉµÄÊý¾Ý£¬Êä³ö¸ñʽÀàËÆÈçÏ£º
0 6.795895820371257E-4 02 1 8.305064719800753E-4 03 2 8.692537028985709E-4 04 3 8.81717074805001E-4 05 4 9.38043175973106E-4 06 5 0.0010181414440047032 07 6 0.0011109837982601679 08 7 0.0011109837982601679 09 8 0.0011414013316968501 10 9 0.0011533646431092647 11 10 0.0011540277293025107 12 11 0.0011712783614491256 13 12 0.001171973122556046 14 13 0.001171973122556046 15 14 0.0012320292204251713 16 15 0.0012371273176228466
|
ÎÒÃǰÑÊä³öÊý¾Ý¸´ÖƵ½Excel±í¸ñÖУ¬Ê¹ÓÃÉÏÊöÊý¾ÝÉú³ÉÉ¢µãͼ£¬»ùÓÚx×ø±êÈ¡ÁË4¸ö²»Í¬µÄ·¶Î§£¬¹Û²ìÇúÏߵı仯Çé¿ö£¬0~2600¡¢0~2000¡¢2001~2630¡¢0~2500¸÷¸öx×ø±ê·¶Î§Äڵĵ㣬¶ÔÓ¦µÄÉ¢µãͼ·Ö±ðÈçÏÂËùʾ£º

ͨ¹ýÉÏͼ¿ÉÒÔ¿´³ö£º
×óÉÏͼ£¨0~2600£©£ºÓÉÓÚx=2500Ö®ºóµêµÄk-¾àÀë±ä»¯Ì«¿ì£¨¿ÉÒÔºöÂÔ£©£¬µ¼ÖÂÇ°ÃæµãµÄk-¾àÀëµÄ±ä»¯Ç÷ÊÆÎÞ·¨¹Û²ì³öÀ´¡£
ÓÒÉÏͼ£¨0~2000£©£ºÈ¥µôβ²¿µÄһЩµãµÄk-¾àÀ룬¿ÉÒÔ¿´³öÇúÏߵı仯Ç÷ÊÆ¡£
×óÏÂͼ£¨2001~2630£©£ºx×ø±êÖáºó°ë²¿·ÖµÄ¾àÀëµÄ±ä»¯Ç÷ÊÆ¡£
ÓÒÏÂͼ£¨0~2500£©£ºÈ¥µôβ²¿Ò»Ð©k-¾àÀëµã£¬Õ¹Ê¾´ó²¿·Ök-¾àÀëµãµÄ±ä»¯Ç÷ÊÆ¡£
×ÛºÏÉÏÃæ4¸öͼ£¬¿ÉÒÔÑ¡ÔñµÃµ½°ë¾¶EpsµÄ·¶Î§´óÖÂÔÚ0.002~0.006Ö®¼ä£¬ÒÑÖªMinPts=4£¬¾ßÌåÎÒÃÇ¿ÉÒÔÑ¡ÔñÏÂÃæ3¸ök-¾àÀëµÄÖµ×÷Ϊ°ë¾¶Eps£º
0.0025094814205335555 2 0.004417483559674606 3 0.006147849217403014
|
ͨ¹ýÏÂÒ»²½½øÐоÛÀ࣬¿´Ò»ÏÂʹÓÃÑ¡ÔñµÄEpsµÄÕ⼸¸öÖµ½øÐоÛÀàµÄЧ¹û¶Ô±È¡£ÁíÍ⣬¶Ô°ë¾¶Eps=8Ò²½øÐоÛÀ࣬Ö÷ÒªÊÇΪÁË¿´Ò»Ï°뾶±ä»¯¶Ô¾ÛÀàЧ¹ûµÄÓ°Ïì¡£
¼ÆËãºËÐĵã
¾ÛÀà¹ý³ÌÐèÒªÖªµÀ°ë¾¶Eps£¬°ë¾¶ÒÑÖª£¬Ê×ÏÈÐèÒª¼ÆËãÓÐÄÄЩºËÐĵ㣬¸ø¶¨µã£¬Èç¹ûÒԸõãΪÖÐÐİ뾶ΪEpsµÄÁÚÓòÄ򵀮äËûµãµÄÊýÁ¿´óÓÚµÈÓÚMinPts£¬Ôò¸Ãµã¾ÍΪºËÐĵ㡣¼ÆËãºËÐĵãµÄʵÏÖ´úÂëÈçÏÂËùʾ£º
Point2D p1 = taskQueue.poll(); 02 if(p1 != null) { 03 ++processedPoints; 04 Set<Point2D> set = Sets.newHashSet(); 05 Iterator<Point2D> iter = epsEstimator.allPointIterator(); 06 while(iter.hasNext()) { 07 Point2D p2 = iter.next(); 08 if(!p2.equals(p1)) { 09 double distance = epsEstimator.getDistance(Sets.newHashSet(p1, p2)); 10 // collect a point belonging to the point p1 11 if(distance <= eps) { // ÊÕ¼¯Ã¿¸öµãÓëÆäÁÚÓòÄڵĵãÖ®¼ä¾àÀëСÓÚµÈÓÚEpsµÄµã 12 set.add(p2); 13 } 14 } 15 } 16 // decide whether p1 is core point 17 if(set.size() >= minPts) { // ¼ÆËãÊÕ¼¯µ½µÄÁÚÓòÄڵĵãµÄ¸öÊý£¬
СÓÚµÈÓÚMinPts£¬Ôò¼ÓÈëµ½Ó³Éä±íMap<Point2D, Set<Point2D>> corePointWithNeighboursÖÐ 18 corePointWithNeighbours.put(p1, set); 19 LOG.debug("Decide core point: point" + p1 + ", set=" + set); 20 } else { 21 // here, perhaps a point was wrongly put outliers set 22 // afterwards we should remedy outliers set 23 if(!outliers.contains(p1)) { // ÕâÀ»á°ÑһЩµã´íÎ󵨼ÓÈëµ½ÀëȺµã¼¯ºÏoutliersÖУ¬ºóÃæ»á½øÐÐÐÞÕý 24 outliers.add(p1); 25 } 26 } 27 }
|
ÄÇЩ±»´íÎó·ÅÈëÀëȺµã¼¯ºÏµÄµã£¬ÐèÒªÔÚ¼ÆËãºËÐĵãÍê³ÉÖ®ºó£¬±éÀúÀëȺµã¼¯ºÏ£¬ÓëºËÐĵ㼯ºÏ£¨¼°Æä¶ÔÓ¦µÄÓ³Éäµã¼¯ºÏ£©½øÐбȶԣ¬´úÂëÈçÏÂËùʾ£º
// process outliers 02 Iterator<Point2D> iter = outliers.iterator(); 03 while(iter.hasNext()) { 04 Point2D np = iter.next(); 05 if(corePointWithNeighbours.containsKey(np)) { 06 iter.remove(); 07 } else { 08 for(Set<Point2D> set : corePointWithNeighbours.values()) { 09 if(set.contains(np)) { 10 iter.remove(); 11 break; 12 } 13 } 14 } 15 }
|
ÕâÑù£¬ÓÐЩ·ÇÀëȺµã¾Í´ÓÀëȺµã¼¯ºÏÖб»ÅųýÁË¡£
Á¬Í¨ºËÐĵãÉú³É´Ø
ºËÐĵãÄܹ»Á¬Í¨£¨ÓÐЩÊé¼®ÖгÆÎª£º¡°Ãܶȿɴ£©£¬ËüÃǹ¹³ÉµÄÒÔEps³¤¶ÈΪ°ë¾¶µÄÔ²ÐÎÁÚÓòÏ໥Á¬½Ó»òÖØµþ£¬ÕâЩÁ¬Í¨µÄºËÐĵ㼰ÆäËù´¦µÄÁÚÓòÄÚµÄÈ«²¿µã¹¹³ÉÒ»¸ö´Ø¡£¼ÙÉèMinPts=4£¬ÔòÁ¬Í¨µÄºËÐĵãʾÀý£¬ÈçÏÂͼËùʾ£º

¼ÙÉèMinPts=4£¬ÉÏͼÖдæÔÚÁ½¸ö´Ø£¬Ã¿¸ö´Ø¶¼ÊÇͨ¹ýºËÐĵãÁ¬Í¨ÔÚÒ»ÆðµÄ£¬Ã¿¸ö´ØÊÇÓÉÁ¬Í¨µÄºËÐĵ㼰ÆäºËÐĵã°ë¾¶EpsÔ²ÐÎÁÚÓòÄڵĵã×é³ÉµÄ£¬ÔÚÕâÁ½¸ö´ØËù¸²¸ÇµÄ·¶Î§ÍⲿµÄµã£¬¶¼ÊÇÀëȺµã£¨Outliers£©¡£
¼ÆËãÁ¬Í¨µÄºËÐĵãµÄ˼·ÊÇ£¬»ùÓÚ¹ã¶È±éÀúÓëÉî¶È±éÀú¼¯ºÏµÄ·½Ê½£º´ÓºËÐĵ㼯ºÏSÖÐÈ¡³öÒ»¸öµãp£¬¼ÆËãµãpÓëS¼¯ºÏÖÐÿ¸öµã£¨³ýÁËpµã£©ÊÇ·ñÁ¬Í¨£¬¿ÉÄÜ»áµÃµ½Ò»¸öÁ¬Í¨ºËÐĵãµÄ¼¯ºÏC1£¬È»ºó´Ó¼¯ºÏSÖÐɾ³ýµãpºÍC1¼¯ºÏÖеĵ㣬µÃµ½ºËÐĵ㼯ºÏS1£»ÔÙ´ÓS1ÖÐÈ¡³öÒ»¸öµãp1£¬¼ÆËãp1ÓëºËÐĵ㼯ºÏS1¼¯ÖÐÿ¸öµã£¨³ýÁËp1µã£©ÊÇ·ñÁ¬Í¨£¬¿ÉÄܵõ½Ò»¸öÁ¬Í¨ºËÐĵ㼯ºÏC2£¬ÔÙ´Ó¼¯ºÏS1ÖÐɾ³ýµãp1ºÍC2¼¯ºÏÖÐËùÓе㣬µÃµ½ºËÐĵ㼯ºÏS2£¬¡¡×îºóµÃµ½p¡¢p1¡¢p2¡¢¡¡£¬ÒÔ¼°C1¡¢C2¡¢¡¡¾Í¹¹³ÉÒ»¸ö´ØµÄºËÐĵ㡣×îÖÕ½«ºËÐĵ㼯ºÏSÖеĵ㶼±éÀúÍê³É£¬µÃµ½ËùÓеĴء£¾ßÌå´úÂëʵÏÖ£¬ÈçÏÂËùʾ£º
// join connected core points 02 LOG.info("Joining connected points ..."); 03 Set<Point2D> corePoints = Sets.newHashSet(corePointWithNeighbours.keySet()); 04 while(true) { 05 Set<Point2D> set = Sets.newHashSet(); 06 Iterator<Point2D> iter = corePoints.iterator(); 07 if(iter.hasNext()) { 08 Point2D p = iter.next(); 09 iter.remove(); 10 Set<Point2D> connectedPoints = joinConnectedCorePoints(p, corePoints); 11 set.addAll(connectedPoints); 12 while(!connectedPoints.isEmpty()) { 13 connectedPoints = joinConnectedCorePoints(connectedPoints, corePoints); 14 set.addAll(connectedPoints); 15 } 16 clusteredPoints.put(p, set); 17 } else { 18 break; 19 } 20 }
|
ÉÏÃæµ÷ÓÃÁËÖØÔØµÄÁ½¸ö·½·¨joinConnectedCorePoints£¬·Ö±ð¸ù¾Ý²ÎÊý²»Í¬¼ÆËãÁ¬Í¨µÄºËÐĵ㼯ºÏ£º¼ÆËãºËÐĵ㼯ºÏÖÐÒ»¸öµã£¬Óë¸ÃºËÐĵ㼯ºÏÖÐÆäËüµÄËùÓкËÐĵãÊÇ·ñÁ¬Í¨£¬µ÷ÓÃÈçÏ·½·¨£º
private Set<Point2D> joinConnectedCorePoints(Point2D p1, Set<Point2D> leftCorePoints) { 02 Set<Point2D> set = Sets.newHashSet(); 03 for(Point2D p2 : leftCorePoints) { 04 double distance = epsEstimator.getDistance(Sets.newHashSet(p1, p2)); 05 if(distance <= eps) { 06 // join 2 core points to the same cluster 07 set.add(p2); 08 } 09 } 10 // remove connected points 11 leftCorePoints.removeAll(set); // ɾ³ýÒѾȷ¶¨ÎªÓëp1Á¬Í¨µÄºËÐĵã 12 return set; 13 }
|
»¹ÓÐÒ»¸ö·½·¨ÊÇ£¬ÉÏÃæµÚÒ»¸ö²ÎÊý±äΪһ¸ö¼¯ºÏ£¬Ëüµ÷ÓÃÉÏÃæµÄ·½·¨´¦Àíÿһ¸öµã£¬·½·¨´úÂëʵÏÖÈçÏÂËùʾ£º
private Set<Point2D> joinConnectedCorePoints(Set<Point2D> connectedPoints, Set<Point2D> leftCorePoints) { 2 Set<Point2D> set = Sets.newHashSet(); 3 for(Point2D p1 : connectedPoints) { 4 set.addAll(joinConnectedCorePoints(p1, leftCorePoints)); 5 } 6 return set; 7 }
|
×îºó£¬¼¯ºÏclusteredPoints´æ´¢µÄ¾ÍÊǾÛÀàºóµÄºËÐĵãµÄÐÅÏ¢£¬ÔÙ¸ù¾ÝºËÐĵ㵽ÆäÁÚÓòÄڰ뾶СÓÚµÈÓÚEpsµÄµãµÄ¼¯ºÏµÄÓ³É䣬¾ÍÄܽ«ËùÓеĵãÉú³É¾ÛÀàÁË¡£
¾ÛÀàЧ¹û±È½Ï
Ñ¡Ôñ²»Í¬µÄ°ë¾¶Eps½øÐоÛÀà·ÖÎö£¬ ¾ÛÀàµÄ½á¹ûÒ²²»Ïàͬ£¬ÓÐЩÇé¿öϲîÒìºÜ´ó£¬ÓÐЩÇé¿ö²îÒì½ÏС¡£±ÈÈ磬ÔÚMinPts=4µÄÇé¿öÏ£¬ÈçºÎ»æÖÆk-¾àÀëÇúÏߣ¬ÒѾÔÚÇ°ÃæÏêϸ˵Ã÷ÁË´¦Àí¹ý³Ì£¬ÎÒÃǸù¾Ýk-¾àÀëÇ÷ÊÆÔö³¤ÇúÏߣ¬Ñ¡ÔñÁËÒ»×é°ë¾¶EpsµÄÖµ£¬Ö´ÐÐDBSCAN¾ÛÀàËã·¨£¬ÏÂÃæÎÒÃDZȽÏÔÚMinPts=4ºÍMinPts=8µÄÇé¿öÏ£¬ÔÙ·Ö±ðÑ¡Ôñ²»Í¬µÄ°ë¾¶Eps£¬Ö´ÐÐDBSCAN¾ÛÀàËã·¨Éú³É´Ø£¬¶Ô·Ö²¼Çé¿ö½øÐжԱȡ£
²ÎÊý£ºMinPts=4
Ñ¡Ôñ°ë¾¶EpsµÄÖµ·Ö±ðΪÈçÏÂ3¸ö¹Û²ìÖµ£º
0.0025094814205335555 2 0.004417483559674606 3 0.006147849217403014
|
×îÖյõ½µÄ¾ÛÀàЧ¹ûͼÔÚÏÂÃæ¿ÉÒÔ¿´µ½£¬ÆäÖУ¬×ó²àΪûÓÐÀëȺµãµÄÇé¿ö¸÷¸ö´ØµÄ·Ö²¼Çé¿ö£¬ÓÒ²àÊǽ«ÀëȺµãÈ«²¿¼ÓÈ뵽ͼÉÏÏÔʾ£¬Í¼ÖÐͼÀýÖеÄ9999±íʾÀëȺµã£¬ÆäËûµÄ¶¼ÊǾÛÀàÉú³ÉµÄ´Ø£¬ÈçÏÂͼËùʾ£º

ͨ¹ýÉÏͼ¿ÉÒÔ¿´³ö£¬°ë¾¶EpsÑ¡ÔñµÄ½ÏСʱ£¬»á²úÉú´óÁ¿µÄÀëȺµã£¬ÆäʵÎÒÃÇÏëһϣ¬°ë¾¶Ð¡ÁË£¬×ÔÈ»ÂäÔÚij¸öµãµÄÁÚÓòÄڵĵã¼õÉٵĿÉÄÜÐÔ¾ÍÔö¼ÓÁË£¬µ¼Öºܶàµã¿ÉÄܾÍÎÞ·¨³ÉΪºËÐĵ㣬×ÔȻҲ¾ÍÎÞ·¨³ÉΪij¸ö´ØµÄµã£¬¶øÇÒºÜÉú³ÉµÄ´Ø°üº¬µÄµãµÄÊýÁ¿¶¼±È½ÏÉÙ£¬Ä³Ð©±¾À´ºÜ½Ó½üµÄµãÓ¦¸Ã¿ÉÒÔ³ÉΪͬһ¸ö´Ø£¬µ«ÊDZ»·ÖÁÑÁË¡£
µ±°ë¾¶±È½Ï´óһЩʱ£¬Éú³ÉµÄÒ»¸ö´Ø°üº¬Á˱Ƚ϶àµÄµã£¬¶øÇÒÕâ¸ö´ØµÄÐÎ×´Öмä¿ÉÄܳöÏÖһЩ¡°¿Õ¶´¡±£¬ÒòΪµãÖ®¼äµÄ¾àÀë´óһЩҲÄÜÂú×ã³ÉΪºËÐĵãµÄÒªÇó£¬ËùÒÔÕâЩµã¾Ûµ½Ò»´ØÖпÉÄÜȷʵ±È½ÏºÏÀí¡£
²ÎÊý£ºMinPts=8
Ñ¡Ôñ°ë¾¶EpsµÄÖµ·Ö±ðΪÈçÏÂ3¸ö¹Û²ìÖµ£º
0.004900098978598581 2 0.009566439044911 3 0.013621050253196359
|
ʹÓÃÎÒÃÇʵÏÖµÄDBSCAN¾ÛÀàËã·¨½øÐзÖÎö´¦Àí£¬µÃµ½µÄ¾ÛÀà½á¹û£¬ÈçÏÂͼËùʾ£º

×ܽá
ÒòΪDBSCAN¾ÛÀàËã·¨£¬ÊÇ»ùÓÚÃܶȵľÛÀàËã·¨£¬ËùÒÔ¶ÔÓÚÃܶȷֱ𲻾ù£¬¸÷¸ö´ØµÄÃܶȱ仯½Ï´óʱ£¬¿ÉÄܻᵼÖÂһЩÎÊÌ⣺±ÈÈç°ë¾¶Eps½Ï´óʱ£¬±¾À´²»ÊôÓÚͬһ¸öµÄ´ØµÄµã±»¾Ûµ½Ò»¸ö´ØÖУ»±ÈÈç°ë¾¶Eps±È½ÏСʱ£¬»á³öÏÖ´óÁ¿±È½ÏСµÄ´Ø£¨¼´Ã¿¸ö´ØÖк¬Óеĵ㲻½ÏÉÙ£¬µ«ÊÇÕâЩµã×é³ÉµÄС´ØÃܶÈȷʵºÜ´ó£©£¬Í¬Ê±³öÏÖÁË´óÁ¿µÄµã²»Âú×ã³ÉΪºËÐĵãµÄÒªÇó£¬MinPtsÔ½´óÔ½ÈÝÒ׳öÏÖÕâÖÖÇé¿ö¡£
DBSCAN¾ÛÀàËã·¨µÄ˼Ïë·Ç³£Ã÷È·Ò×¶®£¬ËäÈ»ÐèÒªÓû§ÊäÈë2¸ö²ÎÊý£¬µ«ÊÇÕýÊÇÊäÈë²ÎÊýµÄÁé»îÐÔ£¬ÎÒÃÇ¿ÉÒÔ¸ù¾Ý×Ô¼ºÊµ¼ÊÓ¦ÓõÄÐèÒª£¬Êʵ±µ÷Õû²ÎÊýÖµ£¬ÔÚÌØ¶¨Ó¦Ó󡾰ϽøÐоÛÀà·ÖÎö£¬µÃµ½ºÏÊʵĴػ®·Ö¡£Í¨¹ýÉÏÃæÑ¡Ôñ²»Í¬²ÎÊý½øÐоÛÀàµÄ½á¹û¶Ô±È£¬Èç¹ûÏ£Íû¾Ö²¿±È½ÏÃܼ¯µÄµã¶¼Äܹ»Éú³É´Ø£¬ÄÇôÔڹ̶¨MinPtsµÄÖµµÄÌõ¼þÏ£¬°ë¾¶Epsͨ¹ýµ÷Õû±äС£¬¿ÉÄÜ´ïµ½ÕâÒ»ÐèÒª£¬²úÉúµÄ´ØµÄÊýÁ¿±È½Ï¶à£¬µ«ÊÇͬʱҲ²úÉúÁË´óÁ¿·ÖÉ¢µÄÀëȺµã£¬Êµ¼ÊÉÏÓ¦¸Ã¿ÉÒÔ½øÐжþ´Î¾ÛÀ࣬½«ÀëȺµãҲͨ¹ýºÏÊʵķ½Ê½¹éµ½Ö¸¶¨µÄ´ØÖУ»Èç¹ûÏ£ÍûÉú³ÉÈ«¾Ö±È½Ï´óµÄ´Ø£¬¿ÉÒÔÊʵ±µ÷Õû°ë¾¶Eps±ä´ó£¬Éú³ÉµÄ´ØµÄÊýÁ¿½ÏÉÙ£¬ÀëȺµãµÄÊýÁ¿Ò²Ïà¶Ô½ÏÉÙ¡£
|