osc.data.clevr_with_masks.prepare_test_vqa

prepare_test_vqa(example, img_size: Tuple[int, int], crop_size: Tuple[int, int], mean: Tuple[float, float, float], std: Tuple[float, float, float])

Prepare a test example for VQA (center crop and normalization).

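The VQA target is a binary vector with one entry per possible question of the form “is there at least one (size, color, material, shape) object in the scene?”. Several entries can be 1 at once, one for each distinct attribute combination present in the scene. There are 2 sizes, 8 colors, 2 materials and 3 shapes, so the target has 2 × 8 × 2 × 3 = 96 binary values.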
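As an illustration of the encoding described above, here is a minimal NumPy sketch. It assumes each object is given as a tuple of integer attribute indices and that the flat index follows row-major order over (size, color, material, shape); the ordering actually used by the library is not stated here, and the helper name `vqa_target` is hypothetical.

```python
import numpy as np

# Attribute counts taken from the description above.
NUM_SIZES, NUM_COLORS, NUM_MATERIALS, NUM_SHAPES = 2, 8, 2, 3
ATTR_DIMS = (NUM_SIZES, NUM_COLORS, NUM_MATERIALS, NUM_SHAPES)


def vqa_target(objects):
    """Build a 96-dim binary target from (size, color, material, shape) index tuples.

    Hypothetical helper: the row-major index layout is an assumption,
    not necessarily the layout used by prepare_test_vqa.
    """
    target = np.zeros(ATTR_DIMS, dtype=np.float32)
    for size, color, material, shape in objects:
        # "At least one" semantics: duplicates of a combination still give 1.
        target[size, color, material, shape] = 1.0
    return target.reshape(-1)


# A scene with three objects, two of which share the same attributes.
scene = [(0, 3, 1, 2), (1, 0, 0, 0), (0, 3, 1, 2)]
target = vqa_target(scene)
```

Because each question asks whether at least one matching object exists, repeated attribute combinations do not increase the count, and the vector stays binary.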

Parameters
Returns

A dict containing the image [3 H W] and the VQA target [V], where V = 96 is the number of binary values described above.
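The center crop and normalization step can be sketched in NumPy as follows. This is not the library's implementation: it assumes the input is a uint8 image in [H W 3] layout with values in 0..255, it skips any resizing to `img_size`, and the helper name `center_crop_and_normalize` is hypothetical.

```python
import numpy as np


def center_crop_and_normalize(image_hwc, crop_size, mean, std):
    """Center-crop an [H W 3] uint8 image, normalize per channel, return [3 H W].

    Hypothetical helper; input layout and value range are assumptions.
    """
    height, width, _ = image_hwc.shape
    crop_h, crop_w = crop_size
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    cropped = image_hwc[top:top + crop_h, left:left + crop_w, :]

    # Scale to [0, 1], then apply per-channel mean/std normalization.
    scaled = cropped.astype(np.float32) / 255.0
    mean_arr = np.asarray(mean, dtype=np.float32)
    std_arr = np.asarray(std, dtype=np.float32)
    normalized = (scaled - mean_arr) / std_arr

    # Move channels first to match the documented [3 H W] layout.
    return np.transpose(normalized, (2, 0, 1))


# Dummy all-white image used only to exercise the function.
image = np.full((240, 320, 3), 255, dtype=np.uint8)
out = center_crop_and_normalize(
    image, crop_size=(192, 192), mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)
)
```

With a mean and std of 0.5 per channel, pixel values in [0, 1] map to [-1, 1], so the all-white dummy image becomes an array of ones.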