Keyword [DGA]

Yang S, Li G, Yu Y. Dynamic Graph Attention for Referring Expression Comprehension[J]. arXiv preprint arXiv:1909.08164, 2019.

1. Overview

1.1. Motivation

Existing methods treat objects in isolation, or explore only the direct relationships between objects without aligning them with the expression.

The paper proposes the Dynamic Graph Attention Network (DGA):
1) multi-step reasoning
2) differential analyzer module
3) static graph attention module
4) dynamic graph attention module
5) matching module

1.2. Dataset

RefCOCO
RefCOCO+
RefCOCOg

2. DGA

2.1. Reasoning Structure Analyzer

Output the weight of each word $r_l^{(t)}$ at each time step $t$.

$q$. expression feature
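
A minimal sketch of how per-step word weights could be produced, assuming they are a softmax over per-word scores so that the weights at each step sum to 1. The scores, the softmax form, and the function name `word_weights` are assumptions for illustration; these notes do not record how the analyzer scores each word.

```python
import math

def word_weights(scores):
    """Softmax over per-word scores for one reasoning step (assumed form)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for a 4-word expression over 2 reasoning steps.
scores_per_step = [
    [2.0, 0.1, 0.1, 0.5],  # step 1 favors word 0
    [0.1, 0.2, 2.5, 0.3],  # step 2 favors word 2
]

# r[t][l] plays the role of r_l^{(t)}.
r = [word_weights(s) for s in scores_per_step]
```

Each row of `r` is a distribution over words, so different steps can focus on different parts of the expression.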

2.2. Static Graph Attention Module

Output:
1) $α_{k,l}$. weight between node $k$ and word $l$
2) $c_k$. expression feature for node $k$
3) $β_{n,l}$. weight between edge $n$ and word $l$

Graph construction:
1) edge $e_{ij} \in \{0, 1, …, 11\}$. relation label between nodes $i$ and $j$
0 = 'no relation'; 1 = 'inside'; …; 11 = 'bottom right'

2) $x_k^I = [x_k^o; p_k]$. node of $G_I$. visual feature + spatial feature
3) node of $G_M$. node of $G_I$ + $c_k$
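
A minimal sketch of how the node features above could be assembled, under two assumptions that these notes do not confirm: $c_k$ is taken as the $α_{k,l}$-weighted sum of word features, and the "+" in "node of $G_I$ + $c_k$" is read as concatenation. All feature values, dimensions, and helper names are hypothetical.

```python
def expression_feature(alpha_k, word_feats):
    """c_k as a weighted sum of word features (assumed form)."""
    dim = len(word_feats[0])
    return [sum(a * w[d] for a, w in zip(alpha_k, word_feats)) for d in range(dim)]

def concat(*parts):
    """Concatenate feature vectors, i.e. the [a; b] notation."""
    out = []
    for part in parts:
        out.extend(part)
    return out

# Toy inputs: 3 words with 2-dim features, and attention weights for node k.
word_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alpha_k = [0.5, 0.25, 0.25]
c_k = expression_feature(alpha_k, word_feats)

x_o = [0.2, 0.9, 0.4]      # toy visual feature x_k^o
p = [0.1, 0.1, 0.5, 0.6]   # toy spatial feature p_k
x_I = concat(x_o, p)       # node of G_I: [x_k^o; p_k]
x_M = concat(x_I, c_k)     # node of G_M: node of G_I joined with c_k
```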

2.3. Dynamic Graph Attention Module

Output the feature $m_k^{(t)}$ of node $k$ at each time step $t$.

1) update $α_{k,l}$ and $β_{n,l}$ based on $r_l^{(t)}$
2) update node feature $m_k^{(t)}$ based on $m_k^{(t-1)}$ and the features $m_j^{(t-1)}$ of its connected nodes
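
A minimal sketch of one such update step, assuming a simple additive rule: each node keeps its previous feature and adds a weighted sum of its neighbors' previous features. The rule, the `edge_weight` values (standing in for weights derived from $β_{n,l}$ and $r_l^{(t)}$), and the function name `update_nodes` are all hypothetical; the paper's actual update may differ.

```python
def update_nodes(m_prev, neighbors, edge_weight):
    """One reasoning step: mix each node's previous feature with its neighbors'."""
    m_next = []
    for k, m_k in enumerate(m_prev):
        agg = [0.0] * len(m_k)
        for j in neighbors[k]:
            w = edge_weight[(k, j)]
            for d, v in enumerate(m_prev[j]):  # always read step t-1 features
                agg[d] += w * v
        m_next.append([a + b for a, b in zip(m_k, agg)])
    return m_next

# Toy graph with 3 nodes and 2-dim features at step t-1.
m0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: []}
edge_weight = {(0, 1): 0.5, (1, 0): 0.5, (1, 2): 0.25}

m1 = update_nodes(m0, neighbors, edge_weight)  # features at step t
```

The update reads only from `m_prev`, so every node at step $t$ sees its neighbors as they were at step $t-1$, regardless of the order in which nodes are processed.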