CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

David Romero*👁️, Chenyang Lyu*👁️, Haryo Akbarianto Wibowo👁️, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, Aishik Mandal, Alina Dragonetti, Artem Abzaliev, Atnafu Lambebo Tonja, Bontu Fufa Balcha, Chenxi Whitehouse, Christian Salamea, Dan John Velasco, David Ifeoluwa Adelani, David Le Meur, Emilio Villa-Cueva, Fajri Koto, Fauzan Farooqui, Frederico Belcavello, Ganzorig Batnasan, Gisela Vallejo, Grainne Caulfield, Guido Ivetta, Haiyue Song, Henok Biadglign Ademtew, Hernán Maina, Holy Lovenia, Israel Abebe Azime, Jan Christian Blaise Cruz, Jay Gala, Jesus-German Ortiz-Barajas, Jiahui Geng, Jinheon Baek, Jocelyn Dunstan Escudero, Kumaranage Ravindu Yasas Nagasinghe, Laura Alonso Alemany, Luciana Benotti, Luis Fernando D'Haro, Marcelo Viridiano, Marcos Estecha-Garitagoitia, Maria Camila Buitrago Cabrera, Mario Rodríguez-Cantelar, Mélanie Jouitteau, Mihail Mihaylov, Mohamed Fazli Mohamed Imam, Muhammad Farid Adilazuarda, Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Naome Etori, Olivier Niyomugisha, Paula Mónica Silva, Pranjal Chitale, Raj Dabre, Rendi Chevi, Ruochen Zhang, Ryandito Diandaru, Samuel Cahyawijaya, Santiago Góngora, Soyeong Jeong, Sukannya Purkayastha, Tatsuki Kuribayashi, Thanmay Jayakumar, Tiago Timponi Torrent, Toqeer Ehsan, Vladimir Araujo, Yova Kementchedjhieva, Zara Burzo, Zheng Wei Lim, Zheng-Xin Yong, Oana Ignat, Joan Nwatu, Rada Mihalcea, Thamar Solorio👁️, Alham Fikri Aji👁️

*Indicates Equal Contribution
👁️ MBZUAI Core Team


We introduce CVQA: a novel, multilingual, culturally nuanced VQA benchmark that includes a diverse set of languages, many of them underrepresented and understudied in NLP.

  • A culturally diverse multilingual VQA benchmark: Our data consists of 9k questions across 28 countries, covering 26 languages. We also sub-categorize CVQA by Country-Language pair, resulting in 33 distinct pairs. CVQA is written in both English and the local languages, enabling us to benchmark both multilingual Multimodal Large Language Models (MLLMs) and English-only MLLMs (see the loading sketch after this list).
  • Baseline evaluation: We provide an initial set of evaluations on this benchmark to serve as a baseline for future research on culturally aware vision-language models.
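If you want to explore the data programmatically, here is a minimal loading sketch using the Hugging Face `datasets` library. The dataset ID (`afaji/cvqa`), the split name, and the column names are assumptions; check the dataset card on the Hub for the actual schema.

```python
from collections import Counter

from datasets import load_dataset

# Assumed dataset ID and split name; verify against the dataset card.
ds = load_dataset("afaji/cvqa", split="test")

# Inspect the available fields of one example.
print(ds[0].keys())

# Count questions per Country-Language pair
# (the column name "Subset" is an assumption).
print(Counter(ds["Subset"]).most_common(5))
```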

CVQA Data Statistics

Below are the statistics of our data.

[Figure: CVQA data statistics]

Experiment Results

This section shows the performance of several MLLMs evaluated on CVQA.

[Figure: MLLM performance on CVQA]
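As a rough illustration of how such results can be computed, the sketch below scores multiple-choice predictions as accuracy, both overall and per Country-Language pair. The field names (`ID`, `Subset`, `Label`) are assumptions and should be adapted to the actual dataset schema.

```python
from collections import defaultdict

def score(examples, predictions):
    """Compute overall and per-pair accuracy.

    `examples` is an iterable of dataset rows; `predictions` maps a
    question ID to a predicted option index. Field names are assumed.
    """
    correct, total = defaultdict(int), defaultdict(int)
    for ex in examples:
        pair = ex["Subset"]  # assumed Country-Language pair field
        total[pair] += 1
        if predictions.get(ex["ID"]) == ex["Label"]:  # assumed fields
            correct[pair] += 1
    per_pair = {p: correct[p] / total[p] for p in total}
    overall = sum(correct.values()) / sum(total.values())
    return overall, per_pair
```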

Test your system!

Eager to show that your system understands culture? Prove it by submitting your predictions to the leaderboard.

You can visit the website here: Leaderboard
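The exact submission format is defined by the leaderboard, so treat the following only as a hypothetical illustration of serializing predictions to a JSON file before uploading:

```python
import json

# Hypothetical mapping from question ID to predicted option index.
predictions = {"cvqa_0001": 2, "cvqa_0002": 0}

# Hypothetical record layout; consult the leaderboard for the real format.
with open("cvqa_predictions.json", "w") as f:
    json.dump(
        [{"id": qid, "answer": ans} for qid, ans in predictions.items()],
        f,
        indent=2,
    )
```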

BibTeX

@misc{romero2024cvqa,
      title={CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark}, 
      author={David Romero and Chenyang Lyu and Haryo Akbarianto Wibowo and Teresa Lynn and Injy Hamed and Aditya Nanda Kishore and Aishik Mandal and Alina Dragonetti and Artem Abzaliev and Atnafu Lambebo Tonja and Bontu Fufa Balcha and Chenxi Whitehouse and Christian Salamea and Dan John Velasco and David Ifeoluwa Adelani and David Le Meur and Emilio Villa-Cueva and Fajri Koto and Fauzan Farooqui and Frederico Belcavello and Ganzorig Batnasan and Gisela Vallejo and Grainne Caulfield and Guido Ivetta and Haiyue Song and Henok Biadglign Ademtew and Hernán Maina and Holy Lovenia and Israel Abebe Azime and Jan Christian Blaise Cruz and Jay Gala and Jiahui Geng and Jesus-German Ortiz-Barajas and Jinheon Baek and Jocelyn Dunstan and Laura Alonso Alemany and Kumaranage Ravindu Yasas Nagasinghe and Luciana Benotti and Luis Fernando D'Haro and Marcelo Viridiano and Marcos Estecha-Garitagoitia and Maria Camila Buitrago Cabrera and Mario Rodríguez-Cantelar and Mélanie Jouitteau and Mihail Mihaylov and Mohamed Fazli Mohamed Imam and Muhammad Farid Adilazuarda and Munkhjargal Gochoo and Munkh-Erdene Otgonbold and Naome Etori and Olivier Niyomugisha and Paula Mónica Silva and Pranjal Chitale and Raj Dabre and Rendi Chevi and Ruochen Zhang and Ryandito Diandaru and Samuel Cahyawijaya and Santiago Góngora and Soyeong Jeong and Sukannya Purkayastha and Tatsuki Kuribayashi and Thanmay Jayakumar and Tiago Timponi Torrent and Toqeer Ehsan and Vladimir Araujo and Yova Kementchedjhieva and Zara Burzo and Zheng Wei Lim and Zheng Xin Yong and Oana Ignat and Joan Nwatu and Rada Mihalcea and Thamar Solorio and Alham Fikri Aji},
      year={2024},
      eprint={2406.05967},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}