(NIPS 2016) Hierarchical Object Detection with Deep Reinforcement Learning

Keyword [DQN]

Bellver M, Giró-i-Nieto X, Marqués F, et al. Hierarchical Object Detection with Deep Reinforcement Learning[J]. Advances in Parallel Computing, 2016

1. Overview

This paper trains an intelligent agent to decide where to look in an image in order to localize objects.

  • proposal strategy. subregions with and without overlap
  • feature extraction strategy. zoom and crop-pool
  • overlap + zoom works best
  • the problem is cast as a Markov Decision Process (MDP)
  • RL has previously been applied to classification, captioning, and activity recognition
  • Region proposal is expensive (R-CNN)
  • AttentionNet. casts detection as iterative classification
  • SSD
  • Active Object Localization (agent).

2. Methods

2.1. MDP Formulation

  • state. descriptor of the current region + memory vector (last 4 actions, 6 actions × 4 = 24 dimensions)
  • action. 5 movement actions + 1 terminal action
  • reward.

  • τ=0.5 (IoU threshold used by the terminal reward)
  • η=3 (magnitude of the terminal reward)
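The reward formula itself is not reproduced in these notes. The sketch below shows one way the MDP pieces could fit together, assuming the movement reward is the sign of the change in IoU with the ground truth and the terminal reward is +η when IoU ≥ τ and -η otherwise. The function names, the box format (x1, y1, x2, y2), and the layout of the 24-dimensional memory vector are illustrative assumptions, not taken from the paper.

```python
NUM_ACTIONS = 6    # 5 movement actions + 1 terminal action
HISTORY_LEN = 4    # the memory vector keeps the last 4 actions
TAU = 0.5          # IoU threshold for a successful terminal action
ETA = 3.0          # magnitude of the terminal reward


def encode_history(last_actions):
    """One-hot encode up to HISTORY_LEN past actions into a 24-dim vector.

    Assumed layout: slot 0 holds the most recent action.
    """
    vec = [0.0] * (NUM_ACTIONS * HISTORY_LEN)
    recent = list(reversed(last_actions[-HISTORY_LEN:]))
    for slot, action in enumerate(recent):
        vec[slot * NUM_ACTIONS + action] = 1.0
    return vec


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def movement_reward(prev_box, new_box, gt_box):
    """Sign of the IoU change caused by a movement action (assumed form)."""
    delta = iou(new_box, gt_box) - iou(prev_box, gt_box)
    return float((delta > 0) - (delta < 0))


def terminal_reward(box, gt_box):
    """+ETA if the final box overlaps the ground truth enough, else -ETA."""
    return ETA if iou(box, gt_box) >= TAU else -ETA
```

The full state would concatenate the region descriptor with `encode_history(...)`; the descriptor is left out because it depends on the feature extractor.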

2.2. Q-learning

2.3. Proposal Strategy

  • overlap. each subregion is 0.75 of the size of its parent region
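A minimal sketch of how the candidate subregions for the 5 movement actions might be generated, assuming four corner-anchored boxes plus one centered box, each with sides scaled by 0.75 relative to the parent. The function name and the box format (x1, y1, x2, y2) are assumptions for illustration.

```python
def subregions(box, scale=0.75):
    """Return five child boxes of a parent box: four corners and the centre.

    Each child has width and height equal to `scale` times the parent's.
    """
    x1, y1, x2, y2 = box
    w = (x2 - x1) * scale
    h = (y2 - y1) * scale
    cx = (x1 + x2) / 2.0
    cy = (y1 + y2) / 2.0
    return [
        (x1, y1, x1 + w, y1 + h),                          # top-left
        (x2 - w, y1, x2, y1 + h),                          # top-right
        (x1, y2 - h, x1 + w, y2),                          # bottom-left
        (x2 - w, y2 - h, x2, y2),                          # bottom-right
        (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2),  # centre
    ]
```

With `scale=0.5` the four corner boxes would tile the parent without overlapping each other, which is one way to read the "without overlap" variant.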

2.4. Model

2.5. Exploration-Exploitation

  • ε-greedy policy. start with ε=1 and decrease it to ε=0.1 in steps of 0.1
  • the agent starts with random actions, and at each epoch it relies more on the policy it has already learnt
  • to help the agent learn the terminal action, it is forced to take it whenever the current region has an IoU > 0.5
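The exploration scheme above can be sketched as follows. This is an illustration under assumptions: ε is decreased once per epoch, the terminal action has index 5, and the names `epsilon_for_epoch` and `select_action` are hypothetical.

```python
import random

NUM_ACTIONS = 6
TERMINAL = 5  # assumed index of the terminal action


def epsilon_for_epoch(epoch, start=1.0, end=0.1, step=0.1):
    """Linear decay from `start` to `end`, one `step` per epoch."""
    # round() avoids floating-point drift such as 0.7000000000000001
    return max(end, round(start - step * epoch, 10))


def select_action(q_values, epsilon, current_iou, training=True, rng=random):
    """Epsilon-greedy choice with a forced terminal action during training."""
    # Forced terminal: teaches the agent when to stop.
    if training and current_iou > 0.5:
        return TERMINAL
    # Explore with probability epsilon.
    if rng.random() < epsilon:
        return rng.randrange(NUM_ACTIONS)
    # Otherwise exploit the current Q estimates.
    return max(range(NUM_ACTIONS), key=lambda a: q_values[a])
```

The forced terminal action needs the ground-truth IoU, so it only applies at training time; at test time the agent has to rely on its learnt Q-values.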

2.6. Training Parameters

  • DQN
  • Adam optimizer, learning rate 1e-6
  • 50 epochs
  • discount factor. γ=0.9
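For reference, the standard Q-learning target that a DQN regresses towards, with the discount factor listed above. This is the textbook form rather than code from the paper, and `q_target` is a hypothetical helper name.

```python
GAMMA = 0.9  # discount factor from the training parameters


def q_target(reward, next_q_values, done, gamma=GAMMA):
    """Bellman target: r if the episode ended, else r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

For example, a movement step with reward 1 and a best next-state Q-value of 2.0 gives a target of 1 + 0.9 × 2.0 = 2.8, while a terminal step uses the reward alone.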

2.7. Experience Replay

(s, a, r, s’)

  • consecutive experiences in this setting are highly correlated, which leads to inefficient and unstable learning
  • to address this, training samples random minibatches from a replay memory. 1000 experiences, batch size 100
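A minimal replay memory sketch using the sizes above. The class name and the oldest-first eviction policy are assumptions; the notes only give the memory size and batch size.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size store of (s, a, r, s') tuples with uniform random sampling."""

    def __init__(self, capacity=1000):
        # deque with maxlen drops the oldest experience once full (assumed policy)
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size=100, rng=random):
        """Draw a minibatch without replacement, breaking temporal correlation."""
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)
```

Sampling uniformly from the memory means a minibatch mixes transitions from many different episodes and steps instead of a run of consecutive ones.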

3. Experiments

3.1. Region Proposal

  • in fewer than 3 steps, the agent reaches almost all of the objects it is able to detect

3.2. Zoom vs Crop-Pool

  • zoom performs better than crop-pool