Editor's recommendation:
This article comes from csdn. It covers the overall implementation framework of Mask RCNN, the correspondence between FPN and RPN, and classification and bbox regression.
References
To understand Mask RCNN properly, first read through the RCNN series of papers to follow how the idea developed, then consult a code implementation for the details.
RCNN
FAST-RCNN
FASTER-RCNN
FPN
MASK-RCNN
This article is based on matterport's implementation. There is an official blog post that covers some of the implementation details and is recommended reading.
Overall architecture
The figure below shows the overall implementation framework of Mask RCNN.

Differences between training and inference
As the figure shows, the training and inference pipelines of Mask RCNN differ slightly.
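1) During training, the region proposals fed to the classifier are computed from both the ground truth and the RPN output; during inference, the RPN output is used directly.
2) During training, the classifier and the mask generator run in parallel; during inference they run in series: classification and bbox regression come first, and their results are then used to generate the masks.
3) Although the two pipelines differ, the parts that differ (the detection target layer and the detection layer) are fixed procedures with no parameters and nothing learnable. The main networks that need to be trained are the same in both.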
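Multi-task training
The backbone usually starts from a pretrained model such as ResNet or VGGNet. The RPN, the classification and bbox regression network, and the mask generation network each have their own loss, so the modules can be trained at the same time, and joint training reportedly gives better results.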
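As a minimal sketch of what training the modules together can look like, the per-module losses can be combined into one weighted sum that a single optimizer minimizes. The loss names and the equal weights below are assumptions chosen for illustration, not values taken from the article.

```python
# Hypothetical loss names and weights, one entry per trainable module output.
LOSS_WEIGHTS = {
    "rpn_class_loss": 1.0,    # RPN: object vs. background
    "rpn_bbox_loss": 1.0,     # RPN: anchor box refinement
    "mrcnn_class_loss": 1.0,  # head: class of each ROI
    "mrcnn_bbox_loss": 1.0,   # head: bbox regression
    "mrcnn_mask_loss": 1.0,   # head: per-class mask
}


def total_loss(losses, weights=LOSS_WEIGHTS):
    """Return the weighted sum of the per-module losses."""
    missing = set(weights) - set(losses)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * losses[name] for name in weights)
```

Because every term feeds the same scalar, one backward pass updates all modules at once; changing a weight is a simple way to emphasize or mute one task.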
Correspondence between FPN and RPN
Specifically, every FPN feature level is fed to the same RPN, but each level corresponds to a different anchor box size. The link between feature level and anchor box size is implicit. For example, for a 512*512 input image, a 128*128 feature map corresponds to 8*8 anchor boxes. This mapping is configurable (RPN_ANCHOR_SCALES, BACKBONE_STRIDES) and other mappings are possible; if you change it, pay attention to the receptive field, and make sure the ground truth bboxes are constructed to match.
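The example above (512*512 input, 128*128 feature map, 8*8 anchors) can be sketched as follows. This is an illustration under assumptions: the function name is hypothetical, the feature map is taken to be square, and each ratio is treated as width/height with the anchor area held constant.

```python
import math


def level_anchors(image_size, stride, scale, ratios=(0.5, 1.0, 2.0)):
    """Return (y1, x1, y2, x2) anchors for one square feature level.

    One anchor per ratio is centered on every feature-map cell; cell
    (row, col) maps back to image position (row * stride, col * stride).
    """
    feat = image_size // stride  # e.g. 512 // 4 = 128
    anchors = []
    for row in range(feat):
        for col in range(feat):
            cy, cx = row * stride, col * stride
            for ratio in ratios:
                h = scale / math.sqrt(ratio)
                w = scale * math.sqrt(ratio)
                anchors.append((cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2))
    return anchors
```

With `image_size=512`, `stride=4` and `scale=8`, the level has 128*128 cells and three anchors per cell; the ratio-1 anchor at each cell is an 8*8 box.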
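How FPN is used in classification, bbox regression, and mask generation
Each bbox produced by the RPN is assigned to a feature level according to its size. This mapping is hard-coded in matterport's code: a 224*224 ROI maps to P4 of the FPN.
One open question here is whether different input image sizes should use a different mapping.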
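The FPN feature levels are never actually combined and used together. The RPN uses different feature levels for different anchor box sizes, while classification, bbox regression, and mask generation each take one selected feature level as the network input.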
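The size-based assignment described above (a 224*224 ROI goes to P4) can be sketched with the level-assignment rule from the FPN paper. Assumptions in this sketch: ROI sizes are in pixels, the result is rounded to the nearest level, and levels are clamped to P2..P5; the function name and defaults are illustrative.

```python
import math


def roi_to_fpn_level(height, width, k0=4, canonical=224.0, k_min=2, k_max=5):
    """Pick the pyramid level for an ROI of the given pixel size.

    An ROI whose side is `canonical` lands on level k0; every halving or
    doubling of the side moves it one level down or up.
    """
    k = k0 + round(math.log2(math.sqrt(height * width) / canonical))
    return max(k_min, min(k_max, k))
```

Smaller ROIs read from higher-resolution levels (P2, P3), larger ones from coarser levels (P5), and only one level is used per ROI.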
FPN

FPNÉϲãupsampleÖ®ºóºÍϲãÖ±½ÓÏà¼Ó£¬channelÊý²»±ä¡£ÕâÀïºÍUnet²»Ò»Ñù£¬UnetÓÃÁ¬½Ó(concatenation)µÄ·½Ê½ºÏ²¢ÉÏϲãfeature,µÃµ½µÄchannelÊý»á±ä¶à¡£
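The difference in channel count can be seen in a small sketch. Feature maps are plain nested lists in [channel][row][column] order, upsampling is nearest-neighbour 2x, and the function names are hypothetical; a real implementation would use tensor operations instead.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] nested list."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def merge_add(top, lateral):
    """FPN-style merge: element-wise sum, channel count unchanged."""
    up = upsample2x(top)
    return [
        [[a + b for a, b in zip(row_u, row_l)] for row_u, row_l in zip(ch_u, ch_l)]
        for ch_u, ch_l in zip(up, lateral)
    ]


def merge_concat(top, lateral):
    """Unet-style merge: channels of both inputs are stacked."""
    return upsample2x(top) + lateral
```

For two inputs with C channels each, `merge_add` returns C channels while `merge_concat` returns 2C.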
RPN


In the paper, the RPN takes a 3*3 region of the feature layer as its input; in the implementation this is simply a 3*3 convolution that produces an output at every position.
Correspondence between RPN and the Proposal Layer

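Each FPN feature level is fed into the RPN network to produce a set of RPN results, and these results are then concatenated and passed to the ProposalLayer. The ordering matters: which box scale, which box ratio, and which position in the original image a given RPN result corresponds to are all fixed, and they have to be matched against the ground truth later when the loss is computed. After the ProposalLayer this correspondence is no longer needed, because each bbox records its own position.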
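To illustrate why the ordering is fixed, the sketch below maps an index in the concatenated RPN output back to its scale, ratio, and image position. It assumes results are ordered level by level, then row, column, and ratio; the strides, scales, and function name are hypothetical values picked to match the 512*512 example.

```python
def describe_anchor(index, image_size=512,
                    strides=(4, 8, 16, 32, 64),
                    scales=(8, 16, 32, 64, 128),
                    ratios=(0.5, 1.0, 2.0)):
    """Recover (scale, ratio, center) for one entry of the merged RPN output."""
    offset = index
    for stride, scale in zip(strides, scales):
        feat = image_size // stride
        count = feat * feat * len(ratios)  # results contributed by this level
        if offset < count:
            cell, ratio_idx = divmod(offset, len(ratios))
            row, col = divmod(cell, feat)
            return {
                "scale": scale,
                "ratio": ratios[ratio_idx],
                "center": (row * stride, col * stride),
            }
        offset -= count
    raise IndexError("anchor index out of range")
```

The same fixed ordering is what lets a ground truth box be matched to a specific RPN output when the RPN loss is computed.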
Classification and bbox regression

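The network outputs the classification result (top) and the bbox regression (bottom). There is one result per class (background not included); in the figure above the number of classes is 2.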
Mask generation network

By default it generates 28*28 masks, one per class. At inference time, post-processing resizes the mask to the size of the bbox and pads it with zeros up to the original image size (utils.unmold_mask).
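The post-processing step can be sketched in plain Python. This is not the code of utils.unmold_mask: the function name is hypothetical, the resize is nearest-neighbour for brevity, and the 0.5 threshold is an assumption.

```python
def paste_mask(mask, bbox, image_shape, threshold=0.5):
    """Resize a small soft mask to its bbox and place it in a full-size image.

    mask:        2D list of probabilities (e.g. 28*28)
    bbox:        (y1, x1, y2, x2) in integer pixel coordinates
    image_shape: (height, width) of the original image
    """
    y1, x1, y2, x2 = bbox
    box_h, box_w = y2 - y1, x2 - x1
    src_h, src_w = len(mask), len(mask[0])
    # Start from an all-zero image; everything outside the bbox stays 0.
    full = [[0] * image_shape[1] for _ in range(image_shape[0])]
    for dy in range(box_h):
        sy = min(src_h - 1, dy * src_h // box_h)      # nearest source row
        for dx in range(box_w):
            sx = min(src_w - 1, dx * src_w // box_w)  # nearest source column
            full[y1 + dy][x1 + dx] = 1 if mask[sy][sx] >= threshold else 0
    return full
```

The output is a binary mask with the same height and width as the original image.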
Gradient flow
The PyramidROIAlign layer stops gradients from flowing to the ROI proposals, but it does pass them on to the FPN. In other words, back-propagation from the heads does not affect the RPN network. See models.PyramidROIAlign in the code.
Other modules
The detection target layer, the proposal layer, and the detection layer used during inference are all ordinary, non-learned procedures.
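The proposal layer keeps the 6000 anchor boxes with the highest scores, applies some post-processing, and removes duplicates with NMS. The result is the input to the later stages. The high-resolution FPN levels are large: a 128*128 level produces 128*128*NUM_bbox_ratio results, which for the three box ratios 0.5, 1, 2 comes to 128*128*3=49152, and many of them may overlap heavily. Feeding them to the later networks unfiltered would use a large amount of memory.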
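The top-k selection followed by NMS can be sketched as below. The 6000 limit comes from the text; the IoU threshold, the cap on kept proposals, and the function names are assumptions made for this sketch.

```python
def iou(a, b):
    """Intersection over union of two (y1, x1, y2, x2) boxes."""
    y1, x1 = max(a[0], b[0]), max(a[1], b[1])
    y2, x2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, y2 - y1) * max(0.0, x2 - x1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def propose(boxes, scores, pre_nms_limit=6000, nms_threshold=0.7,
            max_proposals=1000):
    """Keep the top-scoring boxes, then drop near-duplicates with greedy NMS."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    order = order[:pre_nms_limit]
    keep = []
    for i in order:
        # A box survives only if it does not overlap a kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(i)
            if len(keep) == max_proposals:
                break
    return [boxes[i] for i in keep]
```

Trimming to the top scores before NMS keeps the pairwise overlap checks affordable even when one level alone contributes tens of thousands of anchors.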
The detection target layer processes the proposal layer's output further, producing suitable candidate ROIs for the later networks and preparing what is needed to compute the loss.
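The detection layer uses the classification and bbox regression results to select suitable ROIs (dropping background, dropping low-probability boxes, and deduplicating with NMS) to feed into the mask generation network.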
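The first two filtering steps of the detection layer (dropping background and dropping low-probability boxes) can be sketched as follows; the NMS step is left out here. The assumptions are that class 0 is background and that 0.7 is the minimum confidence; the function name is hypothetical.

```python
def filter_detections(class_probs, min_confidence=0.7):
    """Return (roi_index, class_id, score) for ROIs worth passing on.

    class_probs holds one list of per-class probabilities per ROI,
    with index 0 assumed to be the background class.
    """
    kept = []
    for roi_idx, probs in enumerate(class_probs):
        class_id = max(range(len(probs)), key=probs.__getitem__)
        score = probs[class_id]
        if class_id != 0 and score >= min_confidence:
            kept.append((roi_idx, class_id, score))
    return kept
```

Only the surviving ROIs need a mask, which is why inference can run the mask network after classification rather than in parallel with it.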
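ROIAlign: in the paper this step uses interpolation to transform the features under each bbox to a 7*7 size. matterport's implementation simply uses tf's crop-and-resize operation (tf.image.crop_and_resize).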
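A minimal sketch of interpolation-based resizing to 7*7 is shown below, for one 2D feature channel. It takes a single bilinear sample per output cell, spaced evenly from one box corner to the other; this is a simplification and an assumption, not the exact sampling scheme of the paper or of the TensorFlow op, and the function names are hypothetical.

```python
def bilinear(fmap, y, x):
    """Bilinearly interpolate a 2D list at fractional position (y, x)."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp to the feature map
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
    bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def crop_and_resize(fmap, box, pool=7):
    """Resample the region box=(y1, x1, y2, x2) to a pool*pool grid (pool > 1)."""
    y1, x1, y2, x2 = box
    step_y = (y2 - y1) / (pool - 1)
    step_x = (x2 - x1) / (pool - 1)
    return [
        [bilinear(fmap, y1 + i * step_y, x1 + j * step_x) for j in range(pool)]
        for i in range(pool)
    ]
```

Because the box coordinates stay fractional, no rounding to whole feature cells takes place, which is the property ROIAlign relies on.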