Benchmarking VLMs’ Reasoning About Persuasive Atypical Images

Published in WACV, 2025

In this paper, we introduce three novel tasks that evaluate VLMs’ complex reasoning about atypical images.
We also introduce a new adversarial dataset on PittAd to assess VLMs’ semantic understanding in comparison with LLMs.
We show that although VLMs can generate acceptable and useful descriptions that let LLMs understand the images, they perform worse than LLMs in reasoning about the images and in understanding them semantically.