Keyword [FPN]
Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 2117-2125.
1. Overview
1.1. Motivation
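Feature pyramids are a basic component of detection systems for handling objects at different scales, but they are expensive in compute and memory. The paper proposes the Feature Pyramid Network (FPN), which builds a feature pyramid at marginal extra cost.
- Faster R-CNN based on FPN runs at 6 fps on a GPU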
1.2. Forms
1.2.1. Pyramidal Feature Hierarchy
- High-resolution maps carry only low-level features, which harms their representational capacity for detection.
- SSD builds its pyramid starting from high up in the network, so it does not reuse the higher-resolution maps
1.2.2. Feature Pyramid Network
- combines
- low-resolution, semantically strong features
- high-resolution, semantically weak features
1.3. Related Work
1.3.1. Hand-Engineered
- SIFT
- HOG
1.3.2. Deep ConvNet
- OverFeat
- R-CNN
- SPPnet
1.3.3. Using Multiple Layers
- FCN: sums partial scores over multiple scales
- Hypercolumns, HyperNet, ParseNet, ION: concatenate features from multiple layers
- SSD, MS-CNN: predict at multiple layers without combining features
- U-Net, SharpMask, Recombinator networks, Hourglass networks, Laplacian pyramid: lateral/skip connections
1.4. FPN
- Two pathways
- bottom-up pathway
- top-down pathway and lateral connections
The bottom-up map has lower-level semantics, but its activations are more accurately localized because it has been subsampled fewer times.
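The top-down merge can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses single-channel maps stored as nested lists, assumes the lateral maps have already been passed through the 1x1 convolution that matches channel dimensions, and omits the 3x3 convolution the paper applies after each merge. The function names (`upsample2x`, `add_maps`, `top_down`) are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D map given as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def add_maps(a, b):
    """Element-wise sum of two maps with the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def top_down(laterals):
    """Merge lateral maps ordered fine -> coarse (e.g. from C2 ... C5).

    Starts from the coarsest map, then repeatedly upsamples the running
    result by 2x and adds the next finer lateral map.
    Returns the merged maps in the same fine -> coarse order.
    """
    merged = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        merged.append(add_maps(upsample2x(merged[-1]), lateral))
    return merged[::-1]


# Toy pyramid: 4x4, 2x2 and 1x1 maps with constant values.
fine = [[1] * 4 for _ in range(4)]
mid = [[10] * 2 for _ in range(2)]
coarse = [[100]]
p_fine, p_mid, p_coarse = top_down([fine, mid, coarse])
```

In the toy pyramid, every merged level accumulates the values of all coarser levels, which is how semantically strong features reach the high-resolution maps.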
2. Application
2.1. RPN
- Head
- binary classification
- bounding box regression
- Attach the shared head to each level of the FPN (P2, P3, P4, P5, P6). Each level uses:
- a single anchor scale
- 3 aspect ratios {1:2, 1:1, 2:1}
Sharing the head across levels is analogous to applying a common head classifier to every scale of a featurized image pyramid.
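As a rough sketch of the anchor layout, the snippet below enumerates anchor shapes per level. The per-level areas (32², 64², 128², 256², 512² for P2 to P6) follow the paper. Interpreting each ratio as height divided by width is an assumption made here, and the function names are hypothetical.

```python
import math

# One anchor scale (area in pixels) per pyramid level, as in the paper.
ANCHOR_AREAS = {"P2": 32**2, "P3": 64**2, "P4": 128**2, "P5": 256**2, "P6": 512**2}

# Aspect ratios 1:2, 1:1, 2:1, read here as height / width (assumed convention).
ASPECT_RATIOS = (0.5, 1.0, 2.0)


def anchor_shapes(area, ratios=ASPECT_RATIOS):
    """Return (width, height) pairs that keep `area` fixed for each ratio."""
    shapes = []
    for r in ratios:
        w = math.sqrt(area / r)  # from w * h = area and h = r * w
        h = w * r
        shapes.append((w, h))
    return shapes


def pyramid_anchor_shapes():
    """Anchor shapes for every level: 5 levels x 3 ratios = 15 shapes."""
    return {level: anchor_shapes(area) for level, area in ANCHOR_AREAS.items()}


shapes = pyramid_anchor_shapes()
```

Because each level covers only one scale, scale variation is handled by the pyramid itself rather than by multi-scale anchors on a single map.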
2.2. Fast R-CNN
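No RPN here; RoIs are pooled directly from the pyramid levels.
- view the feature pyramid as if it were produced from an image pyramid
- assign an RoI of width w and height h to level Pk of the FPN, with k = floor(k0 + log2(sqrt(w*h) / 224))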
- 224 is the canonical ImageNet pre-training size
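- k0 is the level to which a 224x224 RoI is mapped
- the head is shared across all levels
Analogous to the ResNet-based Faster R-CNN, which uses C4 as its single-scale feature map, set k0 = 4. A 112x112 RoI then maps to k = 3 (k0 + log2(112/224) = 4 - 1).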
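The level assignment k = floor(k0 + log2(sqrt(w*h) / 224)) can be checked with a small helper. This is a sketch under stated assumptions: the function name `roi_to_level` is hypothetical, and clamping the result to the range P2 to P5 is an assumption added here so that very small or very large RoIs still land on an existing level.

```python
import math


def roi_to_level(w, h, k0=4, canonical=224, k_min=2, k_max=5):
    """Map an RoI of size w x h (in input-image pixels) to a pyramid level k.

    k0 is the level assigned to a canonical 224 x 224 RoI.
    The clamp to [k_min, k_max] is an assumption, not part of the formula.
    """
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical))
    return max(k_min, min(k_max, k))


level_224 = roi_to_level(224, 224)  # canonical size -> k0
level_112 = roi_to_level(112, 112)  # half the canonical side -> one level finer
level_448 = roi_to_level(448, 448)  # double the canonical side -> one level coarser
```

Smaller RoIs are sent to finer, higher-resolution levels, which mirrors how an image pyramid would present a small object at a larger scale.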
3. Experiments
3.1. Ablation Study
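Using only P2 produces more anchors because of its high spatial resolution, but a larger number of anchors alone is not enough to improve accuracy.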