| Editor's pick: |
This article comes from the arleyzhang blog. It summarizes and compares the two-stage algorithms used in object detection and introduces a new two-stage detection model,
Light-Head RCNN. We hope it helps with your studies.
|
|
R-CNN
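The R-CNN method proposed by rbg (Ross Girshick) works as follows:
1. Selective search is run on the image to generate 1K~2K candidate regions (region proposals). The regions produced by this step differ in size, so each one is warped, that is, rescaled to the same size, because the fully connected layers after the CNN require a fixed input size.
2. A CNN extracts features from each warped region, and the extracted features are stored on disk.
3. The features are read back and fed to a per-class SVM classifier, which decides whether the region belongs to that class.
4. Finally, a position regressor refines the box location.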
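The box refinement in step 4 can be sketched as follows. This is a minimal illustration, assuming the usual center/size parameterization for box regression (offsets scaled by the proposal size, log-ratios for width and height). The function names `encode_box` and `decode_box` and the sample boxes are hypothetical and not taken from the original post.

```python
import math


def encode_box(proposal, ground_truth):
    """Return targets (tx, ty, tw, th) that move `proposal` onto `ground_truth`.

    Both boxes are (center_x, center_y, width, height).
    """
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))


def decode_box(proposal, deltas):
    """Apply predicted offsets to a proposal to obtain the refined box."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = deltas
    return (px + pw * tx, py + ph * ty,
            pw * math.exp(tw), ph * math.exp(th))


# Hypothetical example boxes, for illustration only.
proposal = (50.0, 50.0, 20.0, 40.0)
ground_truth = (55.0, 48.0, 30.0, 40.0)
targets = encode_box(proposal, ground_truth)
refined = decode_box(proposal, targets)
```

A regressor trained to predict `targets` from the region's CNN features can then be applied at test time through `decode_box`.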
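Regression method:
Regression function:
Loss function:
Training method:
Advantages:
1. The structure is simple and easy to understand.
2. The CNN extracts features automatically, which removes the complexity of hand-designed features and the dependence on experience and luck.
3. Selective search generates the candidate regions. It acts as a weak detector, so compared with an exhaustive sliding-window search it greatly reduces the number of candidates and raises their quality (they are more likely to contain an object).
4. Prediction accuracy improved by 30%.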
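Disadvantages:
1. During feature extraction the CNN has to run once on every candidate region. The overlap between candidate regions means the same features are extracted repeatedly, which creates a severe speed bottleneck and lowers computational efficiency.
2. Rescaling candidate regions directly to a fixed size distorts the object's aspect ratio and can lose local detail.
3. R-CNN is not an end-to-end model. Training is a cumbersome multi-stage pipeline: pre-training, fine-tuning, and storing the features extracted by the CNN; then training the SVMs; then regression. Going from fine-tuning to SVM training cannot be done in one pass and has to be split into two steps.
4. Training the SVMs requires all previously extracted CNN features to be stored on disk. Disk reads and writes are slow and the features take a lot of space (200 GB on Pascal).
5. Generating candidate regions with the separate selective search algorithm is also slow.
6. Prediction is very slow: 49 s per image.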
SPPNet
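SPPNet targets two shortcomings of R-CNN:
1. Candidate regions are generated first and convolution is then applied to each region. The overlap between regions means features are extracted repeatedly, which creates a severe speed bottleneck and lowers computational efficiency.
2. Rescaling candidate regions directly to a fixed size distorts the object's aspect ratio and can lose local detail.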
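Kaiming He proposed the following improvements:
Change the order of the steps. Instead of generating candidate regions and then extracting features, extract features first and then apply the candidate regions, so that feature extraction is computed once and shared, which greatly reduces computation. Candidate regions are still generated with the selective search algorithm: they are produced on the original image and then mapped onto the feature map.
Use SPP pooling (spatial pyramid pooling). Conventional pooling takes a known input size and a fixed kernel size and derives the output size, so the output size necessarily changes with the input size, which forces the input image to have a fixed size. SPP pooling instead takes a known input size and a fixed output size and derives the kernel size. The SPP layer applies pooling windows of several sizes to the convolutional feature map; the window size and stride are computed dynamically from the feature map size, and the results are concatenated into a feature output of fixed dimension. The input can be a whole feature map (classification) or a window (detection).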
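The way SPP pooling derives its window from the input size can be sketched as follows. This is a minimal single-channel illustration, assuming max pooling, pyramid levels of 1x1, 2x2, and 4x4 bins, and floor/ceil bin boundaries. The names `spp_pool` and `make_map` are hypothetical and not from the original post.

```python
import math


def spp_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map into a fixed-length vector.

    For each pyramid level n the map is split into n x n bins whose
    boundaries are derived from the input size, so the output length
    (sum of n*n over the levels) does not depend on the input size.
    """
    height, width = len(feature_map), len(feature_map[0])
    output = []
    for n in levels:
        for i in range(n):
            row_start = (i * height) // n
            row_end = math.ceil((i + 1) * height / n)
            for j in range(n):
                col_start = (j * width) // n
                col_end = math.ceil((j + 1) * width / n)
                output.append(max(feature_map[r][c]
                                  for r in range(row_start, row_end)
                                  for c in range(col_start, col_end)))
    return output


def make_map(height, width):
    """Build a toy feature map whose values increase row by row."""
    return [[r * width + c for c in range(width)] for r in range(height)]


small = spp_pool(make_map(6, 8))     # 6 x 8 input
large = spp_pool(make_map(13, 10))   # 13 x 10 input
```

Both calls return 1 + 4 + 16 = 21 values even though the inputs differ in size, which is what allows the following fully connected layers to keep a fixed input dimension.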
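Training process:
Advantages:
For all candidate regions in an image, SPP-net runs the convolutional layers only once, which avoids repeated computation and markedly improves efficiency. It is 24~102 times faster than R-CNN.
The SPP layer lets the detection network handle images of arbitrary size, so the network can be trained on multi-scale images, which makes it robust to object scale.
Disadvantages:
Training SPP-net is even more complicated: pre-training, storing SPP features, fine-tuning the fully connected layers on the SPP features, and storing the features extracted by the CNN; then training the SVMs; then regression.
Storing the CNN features costs more space and time.
During fine-tuning, SPP-net can only update the fully connected layers after the spatial pyramid pooling layer and cannot update the convolutional layers (seemingly because the gradient is discontinuous), which limits the gain in detection performance.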
Fast R-CNN
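Why are SPPnet and R-CNN slow to train?
There are two main reasons:
1. With an SVM as the classifier, features must first be stored on disk, and disk I/O is slow.
2. Training consists of many separate steps that cannot be trained jointly.
To address these two problems in R-CNN and SPPNet, rbg proposed the Fast R-CNN algorithm, which can be trained jointly end to end. It works as follows: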
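First, regions of interest (RoIs) are extracted from the image, still with the selective search algorithm. The candidate regions are called RoIs here, and they are mapped onto the feature map.
Then, in a way similar to SPP-net, each image goes through the convolutional layers only once. Each RoI is projected onto the feature map output by the last convolutional layer to obtain that RoI's feature map, which is fed to the RoI pooling layer (equivalent to a single-level SPP layer; it brings feature maps of different sizes to the same size).
Finally, fully connected layers produce two output vectors, one for Softmax classification and the other for bounding-box regression.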
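Improvements:
Serial structure changed to parallel: the original R-CNN first classifies each candidate box to decide whether it contains an object and, if it does, refines the bounding box by regression. This is a serial task and is bound to be slower than a parallel one, so rbg changed the structure to run in parallel, regressing the bbox at the same time as classifying.
RoI pooling: in this model an RoI is a region of interest, that is, the candidate region of the earlier models. RoI pooling is a modification of SPP pooling, equivalent to SPP pooling with a single output size.
Softmax classification replaces SVM classification, which removes the need to store features.
A multi-task loss (classification + regression) is used, which enables end-to-end training.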
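The multi-task loss named in the last item can be sketched for a single RoI as follows. This is a minimal illustration, assuming a cross-entropy classification term plus a smooth L1 box-regression term that is applied only to non-background RoIs. The function names, the convention that class 0 is background, and the weight `lam` are assumptions for illustration.

```python
import math


def smooth_l1(x):
    """Quadratic near zero, linear elsewhere, so large errors do not dominate."""
    absolute = abs(x)
    return 0.5 * x * x if absolute < 1.0 else absolute - 0.5


def multi_task_loss(class_probs, true_class, pred_deltas, target_deltas, lam=1.0):
    """Classification loss plus box-regression loss for one RoI.

    class_probs: softmax probabilities over classes (index 0 = background).
    pred_deltas / target_deltas: (tx, ty, tw, th) for the true class.
    """
    cls_loss = -math.log(class_probs[true_class])
    if true_class == 0:
        # Background RoIs have no ground-truth box, so only classification counts.
        return cls_loss
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(pred_deltas, target_deltas))
    return cls_loss + lam * loc_loss
```

Because both terms are computed from the same shared features and summed into one scalar, a single backward pass updates the classifier and the regressor together.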
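Training method:
Multi-task loss function:
New challenges and solutions:
RoI pooling:
In one sentence: it downsamples feature maps or RoIs of different input sizes to a fixed-size output feature map, which is then fed to the fully connected layers.
Procedure:
Map each RoI in the image onto the corresponding region of the feature map. These regions differ in size (the known input).
Divide each mapped region into equal-sized sections, where the number of sections equals the output dimension (the fixed output size). In this process the number of pixels in each section differs from RoI to RoI (the pooling kernel size and stride are computed dynamically from the input and output sizes), and the results are combined into a feature output of fixed dimension.
Apply max pooling to each section.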
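The procedure above can be sketched as follows. This is a minimal single-channel illustration, assuming the RoI has already been mapped to integer feature-map cells and that section boundaries use floor/ceil, so neighbouring sections may share a row or column when the RoI size is not divisible by the output size. The name `roi_max_pool` and the toy feature map are hypothetical.

```python
import math


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one RoI to a fixed out_h x out_w grid.

    roi = (row_start, col_start, row_end, col_end) in feature-map cells,
    with the end indices exclusive.
    """
    row0, col0, row1, col1 = roi
    height, width = row1 - row0, col1 - col0
    pooled = []
    for i in range(out_h):
        rs = row0 + (i * height) // out_h
        re = row0 + math.ceil((i + 1) * height / out_h)
        row = []
        for j in range(out_w):
            cs = col0 + (j * width) // out_w
            ce = col0 + math.ceil((j + 1) * width / out_w)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


# Toy 8 x 8 feature map with values 0..63, for illustration only.
feature_map = [[r * 8 + c for c in range(8)] for r in range(8)]
square_roi = roi_max_pool(feature_map, (0, 3, 5, 8))   # 5 x 5 region
tall_roi = roi_max_pool(feature_map, (1, 0, 8, 3))     # 7 x 3 region
```

Both RoIs come out as 2 x 2 grids despite their different shapes, so the same fully connected layers can consume either one.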
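Reference: Region of interest pooling explained
Advantages:
1. Accuracy improves.
2. With the multi-task loss, training is end-to-end and single-stage: apart from pre-training, everything is done in one go.
3. The classification and regression tasks share convolutional features and reinforce each other.
4. Fast R-CNN trains Softmax classification and bounding-box regression together, which eliminates feature storage and makes better use of space and time. Compared with R-CNN, when training a VGG network Fast R-CNN is 9 times faster in training and 213 times faster at test time; compared with SPP-net, it is 3 times faster in training and 10 times faster at test time.
Disadvantages:
Fast R-CNN still has a speed bottleneck: the candidate region generation step takes up a large share of the total detection time.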
Faster R-CNN
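To solve the problem that candidate region generation consumes a lot of computation and makes detection too slow, Shaoqing Ren, Kaiming He, and rbg jointly proposed the region proposal network (RPN) and merged the RPN and Fast R-CNN into one unified network (called Faster R-CNN) in which the two share convolutional features. It works as follows:
The RPN takes a whole image as input and outputs a set of rectangular candidate regions. It is a fully convolutional network. A small network slides (as a sliding window) over the feature map output by the last convolutional layer shared with Fast R-CNN. This small network is fully connected to a small window of the feature map. Each sliding window is mapped to a low-dimensional feature vector, which is fed to two sibling fully connected layers, the classification layer (cls layer) and the bounding-box regression layer (reg layer). Because the network operates in sliding-window fashion, the parameters of the fully connected layers are shared across all spatial positions.
In implementation, the RPN is actually a fully convolutional network.
The RPN is a weak detector. Its output is a set of candidate boxes that may contain objects (region proposals, also called regions of interest, RoIs). These RoIs are passed to Fast R-CNN for the final detection.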
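Training method:
The role of anchors
By the convention of two-stage detection, preliminary RoIs come first and the final classification and regression come after. In the original Fast R-CNN the RoIs are provided by the selective search algorithm, whereas in Faster R-CNN the RoIs for the Fast R-CNN part are produced by the RPN.
Since the RPN is itself a weak detector, where do its RoIs, or region proposals, come from? The answer is: from anchors.
Referring to the figure above, after the CNN has extracted features the RPN extracts features from the last feature map in sliding-window fashion. The center of each sliding window is associated with k boxes, called anchors or anchor boxes. These boxes can be mapped back onto the original image, and the corresponding regions on the original image are the region proposals, all of which share the same center point. In other words, the (fixed-size) window used while sliding is generated from these region proposals of different sizes and aspect ratios on the original image (similar in function to RoI pooling).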
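The k boxes attached to each sliding-window center can be sketched as follows. This is a minimal illustration, assuming k = 9 anchors from 3 scales (128, 256, 512 pixels) and 3 aspect ratios (1:2, 1:1, 2:1), and a feature stride of 16. Placing each center at the middle of its feature-map cell is a further assumption, and the function names are hypothetical.

```python
import math


def make_anchors(center_x, center_y, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return k = len(scales) * len(ratios) boxes (x1, y1, x2, y2) sharing one center.

    ratio is height / width; every anchor of a given scale keeps area scale**2.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)
            anchors.append((center_x - width / 2, center_y - height / 2,
                            center_x + width / 2, center_y + height / 2))
    return anchors


def anchors_for_feature_map(feat_h, feat_w, stride=16):
    """Anchors for every sliding-window position, in original-image coordinates."""
    all_anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Assumed mapping: the center of each feature-map cell.
            center_x = (col + 0.5) * stride
            center_y = (row + 0.5) * stride
            all_anchors.extend(make_anchors(center_x, center_y))
    return all_anchors


one_position = make_anchors(100.0, 60.0)
whole_map = anchors_for_feature_map(2, 3)
```

A W x H feature map therefore yields W * H * k anchors, all produced from fixed rules rather than from image content.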
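In effect, each window plays part of the role of a region proposal during sliding. But because the sliding window has a fixed size, it cannot on its own make predictions at multiple scales and sizes. That is why k anchor boxes of different sizes and aspect ratios are associated with each window; combining the two provides multi-scale, multi-size prediction. See the figure below:
The benefit of anchors is that the RPN's final sliding-window step can be implemented as a convolution (because the window size is fixed), which keeps the network very simple.
Because it is implemented with convolutional layers, the subsequent 256-d vector input also changes from (1,1,256) to (W,H,256).
But we know that the subsequent input is a fixed-size window, so how do classification and regression reflect region proposals of different sizes and aspect ratios? The answer is: through the labels and the loss function.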
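The following is quoted from [Reference 1]. In that blog the author's initial understanding was right, but his later addendum changed it to something wrong. It is not a serious error: the content is correct and only the cause and effect are reversed, so only the content is reproduced here:
The 256-d feature extracted from the n x n window is shared by the k regions. When the cls layer and reg layer compute the loss, this shared 256-d feature is combined with the anchors to derive the coordinates and the foreground/background labels of the k regions, so the loss for all k regions can be computed at once.
The cls layer and reg layer simultaneously predict the foreground/background probabilities of the k regions (2 scores per region, so 2k scores) and their bounding boxes (4 coordinates per region, so 4k coordinates). Specifically:
The cls layer outputs 2 values per predicted region, the foreground and background probabilities pa and pb, trained with a softmax loss (cross-entropy loss). (I originally thought it was a sigmoid, in which case predicting pa alone would be enough?) The supervision it needs is Y = 0 or 1, indicating whether the region is ground truth.
The reg layer outputs 4 values per predicted region, x, y, w, h, trained with a smooth L1 loss. The supervision it needs is the anchor's region coordinates {xa,ya,wa,ha} and the ground-truth region coordinates {x*,y*,w*,h*}.
Clearly, the supervision above, Y, {xa,ya,wa,ha} (k of them), and {x*,y*,w*,h*} (1 of them), is produced by the anchor mechanism. How these values are assigned (for example, how Y is obtained for the k anchor regions) follows the sample-generation rules in the paper, which many blogs also describe.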
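How the label Y might be assigned to each anchor can be sketched as follows. This is a simplified illustration, assuming labels come from the intersection-over-union (IoU) between an anchor and the ground-truth boxes, with thresholds of 0.7 for foreground and 0.3 for background and anything in between ignored. It leaves out the additional rule that the best-matching anchor for each ground-truth box is also marked foreground. The function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return intersection / (area_a + area_b - intersection)


def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return one label per anchor: 1 = foreground, 0 = background, -1 = ignored."""
    labels = []
    for anchor in anchors:
        best_overlap = max(iou(anchor, gt) for gt in gt_boxes)
        if best_overlap >= pos_thresh:
            labels.append(1)
        elif best_overlap < neg_thresh:
            labels.append(0)
        else:
            labels.append(-1)  # neither clearly object nor clearly background
    return labels


# Hypothetical anchors and one ground-truth box, for illustration only.
anchors = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 20.0), (100.0, 100.0, 110.0, 110.0)]
labels = label_anchors(anchors, [(0.0, 0.0, 10.0, 10.0)])
```

Only anchors labelled 1 contribute to the reg layer's loss, while anchors labelled 1 or 0 contribute to the cls layer's loss.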
References:
Understanding the RPN in faster-rcnn
The RPN anchors, sliding windows, and proposals in faster rcnn?
R-FCN
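Setting aside the part that generates RoIs (for example the RPN, Region Proposal Network), a two-stage detection model can be split into two subnetworks:
The first is the fully convolutional base subnetwork whose computation is shared (the base, also called the body or trunk). It is independent of the RoIs and is mainly used for feature extraction.
The second is the head subnetwork, whose computation is not shared. Every generated RoI has to pass through the head, which is mainly used for classification.
Faster R-CNN shares a lot of computation: feature extraction is shared across RoIs, and RoI generation shares computation with the base. But passing each RoI through the head is not shared.
R-FCN builds on FCN to share the computation of the head as well. However, because directly converting the head of Faster R-CNN, that is, its fully connected layers, into fully convolutional layers and then using ...
R-FCN is a new region-based fully convolutional detection method. To introduce translation variance into the network, it builds a position-sensitive pooling scheme (Position sensitive pooling) that encodes the relative spatial position of the region of interest. The network solves the time cost that Faster R-CNN incurs by recomputing the fully connected layers, so that all computation in the whole network can be shared.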
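Position sensitive pooling:
Each grid cell in RoI pooling comes from the feature maps of a different channel group in the preceding position-sensitive score maps. This is somewhat like grouped convolution and could be called grouped RoI pooling. It is a selective form of RoI pooling whose main purpose is to increase sensitivity to position.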
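The grouped pooling described above can be sketched as follows. This is a minimal illustration, assuming k x k bins per RoI, one score map per (bin, class) pair, average pooling inside each bin, and a final average over the bins as the class score. The channel ordering `(i * k + j) * num_classes + cls` and the name `ps_roi_pool` are assumptions made for this sketch.

```python
import math


def ps_roi_pool(score_maps, roi, k, num_classes):
    """Position-sensitive RoI pooling for one RoI.

    score_maps: list of k*k*num_classes 2-D maps (num_classes includes background).
    roi: (row_start, col_start, row_end, col_end) in cells, end exclusive.
    Returns one score per class.
    """
    row0, col0, row1, col1 = roi
    height, width = row1 - row0, col1 - col0
    scores = []
    for cls in range(num_classes):
        bin_means = []
        for i in range(k):
            rs = row0 + (i * height) // k
            re = row0 + math.ceil((i + 1) * height / k)
            for j in range(k):
                cs = col0 + (j * width) // k
                ce = col0 + math.ceil((j + 1) * width / k)
                # Bin (i, j) reads only from its own dedicated channel.
                channel = score_maps[(i * k + j) * num_classes + cls]
                cells = [channel[r][c] for r in range(rs, re) for c in range(cs, ce)]
                bin_means.append(sum(cells) / len(cells))
        scores.append(sum(bin_means) / len(bin_means))
    return scores


# Toy input: 2 x 2 bins, 2 classes, each 4 x 4 map filled with its channel index.
toy_maps = [[[float(ch)] * 4 for _ in range(4)] for ch in range(2 * 2 * 2)]
toy_scores = ps_roi_pool(toy_maps, (0, 0, 4, 4), k=2, num_classes=2)
```

Since each bin looks at a different channel, a map can learn to respond to one part of an object (for example its top-left corner), and the per-RoI work reduces to pooling and averaging with no learned layers.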
Light-head RCNN
Both Faster R-CNN and R-FCN carry a heavy computational load before or after RoI (Region of Interest) generation. For example, the head of Faster R-CNN contains two fully connected layers for RoI classification, and fully connected layers are very expensive. R-FCN is much faster than Faster R-CNN, but because it generates so many score maps, reaching real time (30 FPS, frames per second) is still hard. In other words, these models are slow because of their computationally heavy head (heavy-head design); even if the base is slimmed down and compressed, the overall cost cannot be reduced by much.
The paper proposes a new two-stage detection model, Light-Head RCNN, to solve the heavy-head problem common to current two-stage detectors. Its design uses a thin feature map and a cheap R-CNN subnet (pooling and a single fully-connected layer), and it is an improvement on R-FCN.
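Improvements:
Borrowing the idea of separable convolution from Inception V3, the authors use a large separable convolution to generate a feature map with fewer channels (reduced from 3969 to 490).
An FC layer replaces the global average pool in R-FCN, which avoids losing spatial information.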
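The channel counts quoted above can be checked with a little arithmetic. The sketch below assumes that 3969 corresponds to 7 x 7 pooling bins times 81 classes (80 COCO classes plus background) and 490 to 7 x 7 bins times 10 channels. The parameter comparison further assumes a 2048-channel input, a kernel size of 15, and 256 intermediate channels, with two branches (k x 1 then 1 x k, and 1 x k then k x 1) and biases ignored. All of these values and the function names are assumptions for illustration.

```python
def score_map_channels(pool_size, channels_per_bin):
    """Channels needed when every pooling bin has its own group of maps."""
    return pool_size * pool_size * channels_per_bin


def conv_params(c_in, c_out, kernel_h, kernel_w):
    """Weight count of a convolution layer, ignoring biases."""
    return c_in * c_out * kernel_h * kernel_w


def large_separable_conv_params(c_in, c_mid, c_out, k):
    """Two branches, each a k x 1 conv followed by a 1 x k conv (or the reverse)."""
    one_branch = conv_params(c_in, c_mid, k, 1) + conv_params(c_mid, c_out, 1, k)
    return 2 * one_branch


rfcn_channels = score_map_channels(7, 81)         # assumed: 80 classes + background
light_head_channels = score_map_channels(7, 10)   # assumed: 10 channels per bin

# Assumed layer sizes for the comparison.
separable = large_separable_conv_params(c_in=2048, c_mid=256, c_out=490, k=15)
dense = conv_params(2048, 490, 15, 15)
```

Under these assumptions the separable form needs far fewer weights than a dense 15 x 15 convolution with the same input and output channels, which is why a large kernel remains affordable.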
References:
https://zhuanlan.zhihu.com/p/33158548
Mask R-CNN
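Mask R-CNN adds a segmentation branch for each RoI on top of Faster R-CNN. The branch predicts the class of each element within the RoI, using an FCN for the prediction.
Steps: the RPN from Faster R-CNN produces region proposals (RoIs), and each RoI goes through two branches: (1) the Faster R-CNN operations, where it passes through RoI pooling into fc layers for classification and regression; (2) the mask operations, where RoIAlign corrects the same-sized RoIs obtained from RoI pooling and an FCN then makes the prediction (segmentation).
Why RoIAlign was introduced: RoI pooling maps an RoI on the original image onto the feature map and then pools it to a fixed size. Mapping the RoI onto the feature map involves normalization or quantization (that is, rounding). Because there are several quantization steps (convolution stride, pooling), the RoI and the extracted features end up misaligned.
That is, when the RoI on the feature map is mapped back to the original image, it no longer lines up with the original RoI. Classification is fairly robust to translation, so the effect there is small, but it has a very large negative effect when predicting pixel-accurate masks. The authors therefore proposed RoIAlign, a layer that aligns the extracted features with the input.
The idea of RoIAlign is simple: remove the quantization and use bilinear interpolation to obtain values at points with floating-point coordinates, turning the whole feature aggregation process into a continuous operation. Note that in the actual algorithm RoIAlign does not simply fill in the coordinate points on the candidate region's boundary and then pool them. Instead it uses a redesigned, rather elegant procedure, as shown in the figure:
Iterate over every candidate region, keeping its floating-point boundaries unquantized.
Divide the candidate region into k x k cells, and leave the boundary of each cell unquantized as well.
In each cell, compute four fixed sampling positions, compute the values at those four positions by bilinear interpolation, and then apply max pooling.
A note on the third step: the fixed positions are positions determined by a fixed rule within each rectangular cell (bin). For example, if the number of sampling points is 1, it is the center of the cell. If the number of sampling points is 4, the cell is divided evenly into four smaller squares and the points are their centers. The coordinates of these sampling points are usually floating-point numbers, so interpolation is needed to obtain their values.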
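The three steps and the sampling rule can be sketched as follows. This is a minimal single-channel illustration, assuming that `feature_map[r][c]` holds the value at the point (y = r, x = c), that sample points are clamped to the map, and that each bin is max pooled as the text describes. The function names and the toy feature map are hypothetical.

```python
import math


def bilinear(feature_map, y, x):
    """Interpolate the feature map at a floating-point position (y, x)."""
    height, width = len(feature_map), len(feature_map[0])
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    dy, dx = y - y0, x - x0
    top = feature_map[y0][x0] * (1 - dx) + feature_map[y0][x1] * dx
    bottom = feature_map[y1][x0] * (1 - dx) + feature_map[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature_map, roi, k=2, samples=2):
    """Pool one RoI to k x k values without rounding any coordinate.

    roi = (y_start, x_start, y_end, x_end) as floats. Each bin takes
    samples x samples points: the centers of its equal sub-cells.
    """
    y_start, x_start, y_end, x_end = roi
    bin_h = (y_end - y_start) / k
    bin_w = (x_end - x_start) / k
    output = []
    for i in range(k):
        row = []
        for j in range(k):
            values = []
            for sy in range(samples):
                for sx in range(samples):
                    y = y_start + (i + (sy + 0.5) / samples) * bin_h
                    x = x_start + (j + (sx + 0.5) / samples) * bin_w
                    values.append(bilinear(feature_map, y, x))
            row.append(max(values))
        output.append(row)
    return output


# Toy 5 x 5 map where the value at (r, c) is 10*r + c, for illustration only.
feature_map = [[10.0 * r + c for c in range(5)] for r in range(5)]
aligned = roi_align(feature_map, (0.5, 0.5, 2.5, 3.5))
```

With `samples=1` each bin is read at its center, and with `samples=2` at the centers of its four quarters, matching the two cases in the note above.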
|