Keyword [FPN]
Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 2117-2125.
1. Overview
1.1. Motivation
Feature pyramids are a basic component of detection systems for recognizing objects at different scales, but they are expensive in compute and memory. This paper proposes the Feature Pyramid Network (FPN).
- Faster R-CNN based on FPN runs at 6 FPS on a GPU
 
1.2. Forms

1.2.1. Pyramidal Feature Hierarchy
- High-resolution maps carry only low-level features, which harms their representational capacity for detection.
- SSD builds its pyramid starting from high up in the network, so it does not reuse the higher-resolution maps.
 
1.2.2. Feature Pyramid Network
- combine
  - low-resolution (strong semantic)
  - high-resolution (weak semantic)
 
 

1.3. Related Work
1.3.1. Hand-Engineered
- SIFT
- HOG
 
1.3.2. Deep ConvNet
- OverFeat
- R-CNN
- SPPnet
 
1.3.3. Using Multiple Layers
- FCN: sums partial scores over multiple scales
- Hypercolumns, HyperNet, ParseNet, ION: concatenate features of multiple layers
- SSD, MS-CNN: predict at multiple layers without combining features
- U-Net, SharpMask, Recombinator networks, Hourglass networks, Laplacian pyramid: lateral/skip connections
 
1.4. FPN

- Two pathways
  - bottom-up pathway
  - top-down pathway and lateral connections
The bottom-up map has lower-level semantics, but its activations are more accurately localized because it has been subsampled fewer times.
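As a minimal sketch of how the two pathways meet, the snippet below merges one coarser top-down map with the bottom-up map one level finer: the coarser map is upsampled by a factor of 2 (nearest neighbor) and added element-wise to the lateral map. It assumes single-channel maps stored as nested lists and leaves out the 1x1 lateral convolution and the 3x3 convolution that the paper applies around this step. The function names are illustrative and do not come from the paper.

```python
def upsample2x_nearest(fmap):
    """Nearest-neighbor 2x upsampling of a 2-D map given as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def merge_top_down(coarser, lateral):
    """Add the upsampled coarser (top-down) map to the finer lateral map."""
    up = upsample2x_nearest(coarser)
    if len(up) != len(lateral) or len(up[0]) != len(lateral[0]):
        raise ValueError("lateral map must be exactly 2x the coarser map")
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]


# Toy example: a 2x2 top-down map merged with a 4x4 bottom-up map.
coarser = [[1.0, 2.0],
           [3.0, 4.0]]
lateral = [[0.5] * 4 for _ in range(4)]
merged = merge_top_down(coarser, lateral)
```

Applying the same merge repeatedly from the coarsest level downward would give the full top-down pathway; a real implementation would operate on multi-channel tensors.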
 
2. Application
2.1. RPN
- Head
  - binary classification
  - bounding box regression

- Attach the (shared) head to each level (P2, P3, P4, P5, P6) of the FPN. Each level has
  - single-scale anchors
  - 3 aspect ratios {1:2, 1:1, 2:1}
Sharing the head across levels is analogous to applying a common head classifier at every scale of an image pyramid.
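The per-level anchor layout can be sketched as follows. The anchor areas (32^2 to 512^2 pixels on P2 to P6) are the ones reported in the paper; reading a ratio as height/width and the helper name `anchor_shapes` are assumptions made for this illustration.

```python
import math

# Anchor area per pyramid level, as reported in the paper (P2..P6).
ANCHOR_AREAS = {"P2": 32 ** 2, "P3": 64 ** 2, "P4": 128 ** 2,
                "P5": 256 ** 2, "P6": 512 ** 2}
# Assumption: an aspect ratio r is interpreted as height / width.
ASPECT_RATIOS = (0.5, 1.0, 2.0)


def anchor_shapes(area, ratios=ASPECT_RATIOS):
    """Return (width, height) pairs that keep `area` fixed for each ratio."""
    shapes = []
    for r in ratios:
        w = math.sqrt(area / r)
        h = w * r
        shapes.append((w, h))
    return shapes


# One scale per level, three ratios each: 5 levels x 3 ratios in total.
anchors = {level: anchor_shapes(area) for level, area in ANCHOR_AREAS.items()}
num_anchors = sum(len(shapes) for shapes in anchors.values())
```

Because scale is handled by the choice of level, each level needs only one anchor size, which is why the list above varies only the aspect ratio.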
 
2.2. Fast R-CNN
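Fast R-CNN has no RPN; it only applies RoI pooling to the given proposals.
- view the feature pyramid as if it were produced from an image pyramid
- assign an RoI of width w and height h (on the input image) to level Pk of the FPN:

  k = floor(k0 + log2(sqrt(w*h) / 224))

- 224 is the canonical ImageNet pre-training size
- k0 is the level to which a 224x224 RoI is mapped
- Shared head across all levels
Analogous to the ResNet-based Faster R-CNN, which uses C4 as its single-scale feature map, set k0 = 4. A 112x112 RoI then maps to k = 3 (k0 + log2(112/224) = 4 - 1).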
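The assignment rule from the paper, k = floor(k0 + log2(sqrt(w*h) / 224)), can be checked with a few lines of Python. This is a minimal sketch: the function name `roi_to_level` is hypothetical, and clamping the result to P2..P5 is an assumption added here so that very small or very large RoIs still land on an existing level.

```python
import math


def roi_to_level(w, h, k0=4, canonical=224, k_min=2, k_max=5):
    """Map an RoI of size (w, h) in image pixels to a pyramid level index k."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical))
    # Assumption: clamp to the levels that exist for RoI pooling (P2..P5).
    return max(k_min, min(k_max, k))


level_224 = roi_to_level(224, 224)  # canonical size maps to k0
level_112 = roi_to_level(112, 112)  # half the canonical side: one level finer
level_448 = roi_to_level(448, 448)  # double the canonical side: one level coarser
level_tiny = roi_to_level(32, 32)   # falls below P2 and is clamped
```

Smaller RoIs are sent to finer-resolution levels, mirroring how a smaller object would be handled at a larger scale of an image pyramid.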
3. Experiments
3.1. Ablation Study
Using P2 produces more proposals because of its large spatial resolution.




3.2. Comparison

3.3. Extension

