Abstract: Knowledge-based Visual Question Answering (KB-VQA) aims to answer the image-aware question via the external knowledge, which requires an agent to not only understand images but also ...
Try it now — load your own PDF or use the sample: ...