Keyword [CLEVR-Ref+] [IEP-Ref] [IEP] [Neural Module Networks]
Liu R, Liu C, Bai Y, et al. CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions[J]. 2019.
1. Overview
1.1. Motivation
1) Current benchmark datasets suffer from bias
2) Current SOTA models cannot be easily evaluated on their intermediate reasoning process
This paper builds CLEVR-Ref+ (converted from the CLEVR VQA dataset) and proposes IEP-Ref:
- control over dataset bias
- the segmentation module at the end of IEP-Ref can be attached to any intermediate module to reveal the entire reasoning process
- IEP-Ref can correctly predict no foreground for false-premise expressions, even though every training example refers to at least one object
1.2. Contribution
1) construct CLEVR-Ref+ dataset
2) test several SOTA models on CLEVR-Ref+, including IEP-Ref
3) show that the segmentation module trained in IEP-Ref can be trivially plugged into all intermediate steps
2. CLEVR-Ref+
3. Experiments
3.1. Step-By-Step Inspection of Visual Reasoning
At test time, simply attach the trained Segment module to the output of every intermediate module; each resulting mask visualizes which objects that step currently refers to.
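The inspection idea above can be sketched with a toy, set-based stand-in for the real conv features (all names and the tiny scene are assumptions for illustration, not the paper's code): run the program module by module and apply the trained Segment head after each step to obtain a per-step mask.

```python
# Toy sketch of step-by-step inspection in IEP-Ref (assumed names/data).
# Features are modeled as sets of object ids instead of conv feature maps.

SCENE = {0, 1, 2, 3}                       # object ids in a toy image
SHAPES = {0: "cube", 1: "sphere", 2: "cube", 3: "cylinder"}

def filter_shape(shape, feat):
    """Unary module: keep only objects of the given shape."""
    return {o for o in feat if SHAPES[o] == shape}

def segment(feat):
    """Stand-in for the trained Segment module: feature -> object mask."""
    return [1 if o in feat else 0 for o in sorted(SCENE)]

# A tiny two-step program: Scene -> Filter_shape[cube].
program = [
    ("scene", lambda f: f),
    ("filter_shape[cube]", lambda f: filter_shape("cube", f)),
]

feat, trace = SCENE, []
for name, module in program:
    feat = module(feat)
    trace.append((name, segment(feat)))    # attach Segment at THIS step

for name, mask in trace:
    print(name, mask)
```

Here the mask after `scene` covers all four objects, while the mask after `filter_shape[cube]` keeps only objects 0 and 2, exposing the intermediate reasoning exactly as the paper visualizes it.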
3.2. False-Premise Referring Expressions
IEP-Ref is robust enough to return zero foreground when the expression refers to no object, despite never seeing such cases in training.
4. IEP-Ref
1) Preprocess. Extracts image features; its output is the input to the Scene module.
2) Unary. Transforms one feature into another (Scene, Filter X, Unique, Relate, Same X modules)
3) Binary. Transforms two features into one (And, Or)
4) Postprocess. The Segment module, which maps the final feature to the referred-object mask
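The four module categories above can be sketched with the same toy set-based representation (a hypothetical mini-scene and function names, not the paper's implementation), to show how unary and binary modules compose into a program:

```python
# Toy sketch of IEP-Ref's module taxonomy (assumed names/data).

SCENE_OBJS = {0, 1, 2, 3}                  # all objects in a toy image
COLORS = {0: "red", 1: "red", 2: "blue", 3: "green"}

# 1) Preprocess: produces the input to the Scene module.
def preprocess(image):
    return set(image)

# 2) Unary modules: one feature in, one feature out.
def scene(feat):
    return feat

def filter_color(color):
    def module(feat):
        return {o for o in feat if COLORS[o] == color}
    return module

# 3) Binary modules: two features in, one feature out.
def and_module(a, b):
    return a & b

def or_module(a, b):
    return a | b

# 4) Postprocess: Segment maps a feature to a mask over objects.
def segment(feat):
    return [1 if o in feat else 0 for o in sorted(SCENE_OBJS)]

feat = scene(preprocess(SCENE_OBJS))
red = filter_color("red")(feat)
blue = filter_color("blue")(feat)
print(segment(or_module(red, blue)))       # "red or blue" objects
```

Because every module consumes and emits the same feature type, Segment can be attached after any of them, which is what enables the step-by-step inspection in Section 3.1.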