osc.data.clevr_with_masks.prepare_test_vqa
- prepare_test_vqa(example, img_size: Tuple[int, int], crop_size: Tuple[int, int], mean: Tuple[float, float, float], std: Tuple[float, float, float])
Prepare a test example for VQA (center crop and normalization).
The VQA target is a binary vector with one entry per possible question of the form “is there at least one (size, color, material, shape) object in the scene?”. There are 2 sizes, 8 colors, 2 materials and 3 shapes, so 2 × 8 × 2 × 3 = 96 binary values.
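As an illustration of how such a 96-value target could be built, here is a minimal sketch. The attribute names are the standard CLEVR vocabularies; their ordering, the resulting index layout, and the helper `vqa_target` are assumptions made for this example and may not match what the library does.

```python
from itertools import product

# Attribute vocabularies from the CLEVR dataset. The ordering is an
# assumption for this sketch and may differ from the library's.
SIZES = ["small", "large"]
COLORS = ["gray", "red", "blue", "green", "brown", "purple", "cyan", "yellow"]
MATERIALS = ["rubber", "metal"]
SHAPES = ["cube", "sphere", "cylinder"]

# One question per (size, color, material, shape) combination: 2*8*2*3 = 96.
QUESTIONS = list(product(SIZES, COLORS, MATERIALS, SHAPES))
QUESTION_INDEX = {q: i for i, q in enumerate(QUESTIONS)}


def vqa_target(objects):
    """Hypothetical helper: build the binary answer vector for a scene.

    `objects` is an iterable of (size, color, material, shape) tuples.
    Entry i is 1 if at least one object matches question i, else 0.
    """
    target = [0] * len(QUESTIONS)
    for obj in objects:
        target[QUESTION_INDEX[tuple(obj)]] = 1
    return target


# A scene with three objects, two of which share the same attributes.
scene = [
    ("small", "red", "metal", "cube"),
    ("large", "blue", "rubber", "sphere"),
    ("small", "red", "metal", "cube"),
]
target = vqa_target(scene)
```

Because each entry answers an "at least one" question, duplicate objects set the same entry, and a scene with several distinct objects yields several ones in the vector.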
- Parameters
- Returns
A dict containing the image [3 H W] and the VQA target [V].
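The center crop and normalization named in the summary can be sketched with NumPy as follows. This is not the library's implementation: the helper `center_crop_and_normalize` is hypothetical, the input is assumed to be a float image of shape [H W 3] with values in [0, 1], and the resizing step presumably controlled by `img_size` is omitted.

```python
import numpy as np


def center_crop_and_normalize(image, crop_size, mean, std):
    """Hypothetical helper: center crop, normalize per channel, return [3 H W].

    `image` is assumed to be a float array of shape [H, W, 3] in [0, 1].
    """
    h, w = image.shape[:2]
    ch, cw = crop_size
    # Offsets that place the crop window in the middle of the image.
    top = (h - ch) // 2
    left = (w - cw) // 2
    cropped = image[top:top + ch, left:left + cw, :]
    # Per-channel normalization, broadcast over the last (channel) axis.
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    normalized = (cropped - mean) / std
    # Move channels first to match the [3 H W] layout of the returned image.
    return np.transpose(normalized, (2, 0, 1))


image = np.full((240, 320, 3), 0.5, dtype=np.float32)
out = center_crop_and_normalize(
    image, crop_size=(192, 192), mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)
)
```

Unlike a random crop, a center crop is deterministic, so the same test example always produces the same output.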