```
pip install -r requirements.txt
```
- Download questions/answers for VQAv2 and VQA-CPv2 by executing
```
bash tools/download.sh
```
- Preprocess the data with
```
bash tools/process.sh
```
- The parameters are stored at https://drive.google.com/drive/folders/1eW07eGIHukT3YKMFtG-Hh1MJr6hIOPDg?usp=sharing
Run
```
CUDA_VISIBLE_DEVICES=0 python main.py -dataset cpv2 -mode base -scale sin -output base
```
- Set `mode` as `gld_iter` or `gld_joint` for our model with iterative or joint training; `base` for the baseline model; `gld_reg` for the version with the regularization term.
- Set `dataset` as `v2` for the general VQA task; `cpv2` for the VQA task that emphasizes the language prior.
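The flags above can be wired together with a standard `argparse` parser. A minimal sketch, assuming only the flag names and values listed in this README (the actual parser in `main.py` may accept more options):

```python
import argparse

def build_parser():
    # Hypothetical reconstruction of the CLI described above.
    parser = argparse.ArgumentParser(description="GLD VQA training")
    parser.add_argument("-dataset", choices=["v2", "cpv2"], default="cpv2",
                        help="v2: general VQA; cpv2: VQA-CP, stressing the language prior")
    parser.add_argument("-mode", default="base",
                        choices=["base", "gld_iter", "gld_joint", "gld_reg"],
                        help="base: baseline; gld_iter/gld_joint: iterative/joint "
                             "training; gld_reg: with the regularization term")
    parser.add_argument("-scale", default="sin")
    parser.add_argument("-output", default="base", help="output directory name")
    parser.add_argument("-visual", type=lambda s: s == "True", default=False,
                        help="pass 'True' to dump visualizations")
    parser.add_argument("-qid", type=int, default=None,
                        help="question id to visualize")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args(
        "-dataset cpv2 -mode base -scale sin -output base".split())
    print(args)
```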
- Run
```
CUDA_VISIBLE_DEVICES=0 python gld_iter_ce.py
CUDA_VISIBLE_DEVICES=0 python gld_joint_ce.py
```
to see the difference with cross-entropy as the loss.
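For context on the `_ce` variants: VQA classifiers are commonly trained with binary cross-entropy over soft answer scores, whereas these scripts swap in a plain (softmax) cross-entropy. A pure-Python sketch of the two losses, independent of this repository's code:

```python
import math

def softmax_ce(logits, target_idx):
    """Plain cross-entropy: -log softmax(logits)[target_idx]."""
    m = max(logits)  # subtract max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target_idx]

def sigmoid_bce(logits, soft_targets):
    """Binary cross-entropy with soft targets, summed over answers
    (the usual soft-score VQA objective)."""
    total = 0.0
    for x, t in zip(logits, soft_targets):
        p = 1.0 / (1.0 + math.exp(-x))
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total

logits = [2.0, 0.5, -1.0]
print(softmax_ce(logits, 0))                  # one hard ground-truth label
print(sigmoid_bce(logits, [0.9, 0.3, 0.0]))   # soft scores from multiple annotators
```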
To see the visualization, set `visual` to `True`:
```
CUDA_VISIBLE_DEVICES=0 python main.py -dataset cpv2 -mode gld_reg -scale sin -visual True -qid 140 -output vis
```
Change `qid` to see different question-image pairs, and change `mode` to see the visualization results under different settings.
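To render visualizations for several question ids, the command above can be scripted. A small helper (the function name and defaults are ours, mirroring the example command, not something defined by the repository):

```python
def vis_command(qid, mode="gld_reg", dataset="cpv2", output="vis", gpu=0):
    # Hypothetical helper: rebuilds the visualization command shown above
    # for a given question id and mode.
    return (f"CUDA_VISIBLE_DEVICES={gpu} python main.py "
            f"-dataset {dataset} -mode {mode} -scale sin "
            f"-visual True -qid {qid} -output {output}")

for qid in (140, 141, 142):  # example question ids
    print(vis_command(qid))
```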