Liu W, Luo W, Lian D, et al. Future frame prediction for anomaly detection–a new baseline[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 6536-6545.
1. Overview
1.1. Motivation
- In video anomaly detection, almost all existing methods minimize the reconstruction error on training data, which cannot guarantee a large reconstruction error for an abnormal event
- Deep neural networks have high capacity, so abnormal events may also be reconstructed well and do not necessarily produce larger reconstruction errors
- Abnormal events are unbounded: they cannot all be enumerated in advance
The paper proposes to tackle anomaly detection within a video prediction framework
- first work to leverage the difference between the predicted future frame and its ground truth (GT)
- first work to introduce a temporal constraint into the video prediction task
- in addition to spatial constraints (intensity and gradient), a temporal (motion) constraint is introduced
1.2. Related Work
- Hand-crafted features: HOG, HOF
- Deep learning: ConvLSTM-AE
- Video Frame Prediction
- Least Squares GAN
2. Architecture
2.1. Overview
2.2. Constraints
- Intensity
- Gradient
- Motion (f: a pre-trained FlowNet used as the optical flow estimator)
- Adversarial (Least Squares GAN)
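The four constraints can be sketched in NumPy as follows. This is a minimal sketch, not the authors' implementation: it assumes a squared L2 distance for intensity, an L1 distance between absolute image gradients, an L1 distance between optical flow fields, and the standard Least Squares GAN objectives. The function names are hypothetical, and the flow estimator f (FlowNet) is not implemented; `motion_loss` simply takes two precomputed flow fields.

```python
import numpy as np

def intensity_loss(pred, gt):
    """Squared L2 distance between predicted and ground-truth frames."""
    return float(np.sum((pred - gt) ** 2))

def gradient_loss(pred, gt):
    """L1 distance between absolute image gradients along both spatial axes."""
    def abs_grads(img):
        dy = np.abs(img[1:, :] - img[:-1, :])  # vertical neighbours
        dx = np.abs(img[:, 1:] - img[:, :-1])  # horizontal neighbours
        return dy, dx
    pdy, pdx = abs_grads(pred)
    gdy, gdx = abs_grads(gt)
    return float(np.sum(np.abs(pdy - gdy)) + np.sum(np.abs(pdx - gdx)))

def motion_loss(flow_pred, flow_gt):
    """L1 distance between two optical flow fields.

    Assumption: flow_pred = f(predicted frame, last input frame) and
    flow_gt = f(true frame, last input frame), computed elsewhere by f.
    """
    return float(np.sum(np.abs(flow_pred - flow_gt)))

def lsgan_d_loss(d_real, d_fake):
    """LSGAN discriminator loss: push D(real) towards 1 and D(fake) towards 0."""
    return float(0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2))

def lsgan_g_loss(d_fake):
    """LSGAN generator loss: push D(fake) towards 1."""
    return float(0.5 * np.mean((d_fake - 1.0) ** 2))
```

For example, a single wrong pixel in an otherwise blank 3x3 frame contributes to both the intensity term and the gradient term, since it also changes the differences to its four neighbours.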
2.3. Loss Function
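- pixel intensities of each frame are normalized to [-1, 1]
- frames are resized to 256x256
- t=4; each training sample is a random clip of 5 sequential frames
- batch size 4
- loss weights (int, gd, op, adv, i.e. intensity, gradient, optical flow, adversarial): 1.0, 1.0, 2.0, 0.05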
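The training settings above can be illustrated with a short sketch. The helper names are hypothetical, the input frames are assumed to be 8-bit (values in [0, 255]), and the 5-frame clip is assumed to split into t=4 input frames followed by 1 target frame; resizing is left out.

```python
import random

# Loss weights as listed in the notes: intensity, gradient, optical flow, adversarial.
LOSS_WEIGHTS = {"int": 1.0, "gd": 1.0, "op": 2.0, "adv": 0.05}

def normalize_pixel(value):
    """Map an 8-bit intensity in [0, 255] to [-1, 1] (8-bit input is an assumption)."""
    return value / 127.5 - 1.0

def sample_clip(num_frames, t=4, rng=random):
    """Pick t + 1 sequential frame indices: t inputs followed by 1 target."""
    start = rng.randrange(num_frames - t)  # last valid start is num_frames - t - 1
    indices = list(range(start, start + t + 1))
    return indices[:t], indices[t]

def generator_objective(losses, weights=LOSS_WEIGHTS):
    """Weighted sum of the individual loss terms, keyed like LOSS_WEIGHTS."""
    return sum(weights[name] * losses[name] for name in weights)
```

A call such as `sample_clip(100)` returns four consecutive input indices and the index of the frame that follows them, which is the frame to be predicted.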
2.4. Anomaly Detection on Testing Data
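Mathieu et al. show that Peak Signal to Noise Ratio (PSNR) is a better measure for image quality assessment; a higher PSNR between the predicted frame and the GT means the frame is more likely normal
Normalize the PSNR of all frames in each testing video to the range [0, 1] by min-max normalization, and use the result as the regular score of each frame: S(t) = (PSNR_t - min PSNR) / (max PSNR - min PSNR), with min and max taken over the frames of that video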
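The scoring step can be sketched in plain Python. This is an illustrative sketch under assumptions: frames are passed as flat sequences of pixel values, `max_value` (the peak intensity used in the PSNR) is a hypothetical parameter, and the handling of identical frames and of a constant PSNR sequence is a choice made here, not something the notes specify.

```python
import math

def psnr(pred, gt, max_value=1.0):
    """PSNR in dB between two equally sized flat pixel sequences."""
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(gt)
    if mse == 0:
        return math.inf  # degenerate case: identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)

def regular_scores(psnrs):
    """Min-max normalize the per-frame PSNRs of one video to [0, 1]."""
    lo, hi = min(psnrs), max(psnrs)
    if hi == lo:
        # Assumption: with a constant PSNR every frame gets the same score.
        return [1.0] * len(psnrs)
    return [(p - lo) / (hi - lo) for p in psnrs]
```

Because the normalization is done per video, a score near 0 marks the worst-predicted frames of that particular video rather than an absolute PSNR level.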
3. Experiments
3.1. Dataset
3.1.1. CUHK Avenue Dataset
- 16 training videos and 21 testing videos
- 47 abnormal events
3.1.2. UCSD Dataset
- UCSD Pedestrian 1. 34 training videos and 36 testing videos with 40 irregular events
- UCSD Pedestrian 2. 16 training videos and 12 testing videos with 12 abnormal events
3.1.3. ShanghaiTech Dataset
- 330 training videos and 107 testing videos
- 130 abnormal events
3.1.4. Toy Dataset
- 210 frames for training
- 1242 frames for testing
3.2. Comparison
3.3. Ablation Study
- Delta: the gap between the average score of normal frames and that of abnormal frames (a larger gap means the two are easier to separate)
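The Delta metric can be computed as below. This is a small sketch with a hypothetical function name; it assumes per-frame labels where 0 marks a normal frame and 1 an abnormal one.

```python
from statistics import mean

def score_gap(scores, labels):
    """Average score of normal frames minus average score of abnormal frames."""
    normal = [s for s, y in zip(scores, labels) if y == 0]
    abnormal = [s for s, y in zip(scores, labels) if y == 1]
    return mean(normal) - mean(abnormal)
```

With regular scores in [0, 1], the gap lies in [-1, 1]; a value close to 1 means normal frames score consistently higher than abnormal ones.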
3.4. Toy Experiments
- Once the pedestrian has been observed for a while and has made his or her choice, the motion becomes predictable and the PSNR goes up