I am a third-year Ph.D. candidate at Shanghai Jiao Tong University, supervised by Prof. Jifeng Dai. I received my bachelor’s degree from Beihang University in 2022, where I worked with Prof. Si Liu. I also hold a second bachelor’s degree (double major) in economics from Peking University. I am currently an intern at OpenGVLab, Shanghai AI Laboratory. Previously, I interned at SenseTime and Sea AI Lab.
Ph.D. (Joint Program with Shanghai AI Lab), 2022-present
Department of EE, Shanghai Jiao Tong University
B.A. in Economics (Double Major), 2019-2022
National School of Development, Peking University
B.Eng. in Computer Science, 2018-2022
Shenyuan Honors College, Beihang University
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Gen Luo*, Xue Yang*, Wenhan Dou*, Zhaokai Wang*, Jiawen Liu, Jifeng Dai, Yu Qiao, Xizhou Zhu
Preprint
Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Composite Spatial Reasoning
Yihong Tang*, Ao Qu*, Zhaokai Wang*, Dingyi Zhuang*, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao
Preprint
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding
Hao Li, Changyao Tian, Jie Shao, Xizhou Zhu, Zhaokai Wang, Jinguo Zhu, Wenhan Dou, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai
Preprint
ITINERA: Integrating Spatial Optimization with Large Language Models for Open-domain Urban Itinerary Planning
Yihong Tang*, Zhaokai Wang*, Ao Qu*, Yihao Yan*, Zhaofeng Wu, Dingyi Zhuang, Jushi Kai, Kebing Hou, Xiaotong Guo, Jinhua Zhao, Zhan Zhao, Wei Ma
EMNLP 2024
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft
Hao Li*, Xue Yang*, Zhaokai Wang*, Xizhou Zhu, Jie Zhou, Yu Qiao, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai
CVPR 2024
Conference Reviewer:
Teaching Assistant