Harnessing dual strengths in computer vision and natural language processing

At the forefront of vision and language research, we are a multimodal group that uses interdisciplinary expertise to solve big challenges and make a lasting contribution to society.

VL researchers working with a drone

Our expertise at a glance

  • Joint visual and language learning, e.g., image/video captioning, visual language navigation, visual question answering
  • Multilingual natural language understanding and generation, e.g., machine translation, biomedical NLP, dialogue and chatbots
  • Knowledge graph representation, reasoning, question answering and network representation learning
  • 3D visual computing and analytics

Learn more about the group

For any additional questions or inquiries, feel free to reach out to Professor Jianfei Cai, Discipline Lead, Vision and Language.