osc.data.clevr_with_masks.prepare_test_vqa

prepare_test_vqa(example, img_size: Tuple[int, int], crop_size: Tuple[int, int], mean: Tuple[float, float, float], std: Tuple[float, float, float])

Prepare a test example for VQA (center crop and normalization).

The VQA target is a binary vector with one entry per possible question of the form “is there at least one (size, color, material, shape) object in the scene?”. There are 2 sizes, 8 colors, 2 materials and 3 shapes, so the target has 2 × 8 × 2 × 3 = 96 binary values.
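To make the target encoding concrete, here is a minimal NumPy sketch of how such a 96-value vector could be built. It is not the library's implementation: the helper `vqa_target`, the input format (a list of integer attribute index tuples), and the row-major ordering of the flattened vector are all assumptions made for illustration.

```python
import numpy as np

# Attribute counts from the description: 2 sizes, 8 colors, 2 materials, 3 shapes.
DIMS = (2, 8, 2, 3)


def vqa_target(objects):
    """Build a binary VQA target for one scene (hypothetical helper).

    `objects` is an assumed input format: a list of
    (size, color, material, shape) integer index tuples, one per object.
    """
    target = np.zeros(DIMS, dtype=np.float32)
    for size, color, material, shape in objects:
        # "At least one" semantics: duplicates set the same entry to 1 again.
        target[size, color, material, shape] = 1.0
    # Flatten in row-major order; the real entry ordering is an assumption.
    return target.reshape(-1)


# Three objects, two of which share the same attribute combination.
scene = [(0, 3, 1, 2), (1, 0, 0, 0), (0, 3, 1, 2)]
target = vqa_target(scene)
```

Because several attribute combinations can be present in one scene, more than one entry can be set at the same time.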

Parameters

example –
img_size (Tuple[int, int]) – image size (H, W)
crop_size (Tuple[int, int]) – crop size (H, W)
mean (Tuple[float, float, float]) – image mean for normalization
std (Tuple[float, float, float]) – image standard deviation for normalization

Returns

A dict containing the image [3 H W] and the VQA target [V], where V is the number of possible questions (96 for the attribute counts given above).
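The image preprocessing named in the summary (center crop followed by per-channel normalization) can be sketched as follows. This is a sketch under stated assumptions, not the function's actual code: the helper `center_crop_and_normalize` is hypothetical, the input is assumed to be a float array of shape [H, W, 3] that already has size `img_size`, and the output is transposed to the [3 H W] layout mentioned under Returns.

```python
import numpy as np


def center_crop_and_normalize(image, crop_size, mean, std):
    """Center-crop and normalize one image (hypothetical helper).

    image: float array of shape [H, W, 3] (assumed input layout).
    Returns a float array of shape [3, crop_h, crop_w].
    """
    h, w = image.shape[:2]
    crop_h, crop_w = crop_size
    # Offsets that place the crop window in the middle of the image.
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    cropped = image[top:top + crop_h, left:left + crop_w, :]
    # Per-channel normalization, broadcast over the last (channel) axis.
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    normalized = (cropped - mean) / std
    # Move channels first to match the [3 H W] layout.
    return np.transpose(normalized, (2, 0, 1))


# Example values chosen for illustration only.
rng = np.random.default_rng(0)
image = rng.random((128, 128, 3), dtype=np.float32)
out = center_crop_and_normalize(
    image, crop_size=(112, 112), mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)
)
```

With a 128 × 128 input and a 112 × 112 crop, the window starts 8 pixels in from the top and left edges.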