I am a third-year Ph.D. candidate at Shanghai Jiao Tong University, supervised by Prof. Jifeng Dai. I obtained my bachelor’s degree from Beihang University in 2022, where I worked with Prof. Si Liu. I also have a double bachelor’s degree in economics from Peking University. Currently, I am an intern at OpenGVLab of Shanghai AI Laboratory. Previously I interned at SenseTime and Sea AI Lab.
Ph.D. (Joint Program with Shanghai AI Lab), 2022-
Department of EE, Shanghai Jiao Tong University
B.A. in Economics (Double Major), 2019-2022
National School of Development, Peking University
B.Eng. in Computer Science, 2018-2022
Shenyuan Honors College, Beihang University
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Gen Luo*, Xue Yang*, Wenhan Dou*, Zhaokai Wang*, Jifeng Dai, Yu Qiao, Xizhou Zhu
Preprint
Sparkle: Mastering Basic Spatial Capabilities in Vision Language Models Elicits Generalization to Composite Spatial Reasoning
Yihong Tang*, Ao Qu*, Zhaokai Wang*, Dingyi Zhuang*, Zhaofeng Wu, Wei Ma, Shenhao Wang, Yunhan Zheng, Zhan Zhao, Jinhua Zhao
Preprint
Parameter-Inverted Image Pyramid Networks
Xizhou Zhu*, Xue Yang*, Zhaokai Wang*, Hao Li, Wenhan Dou, Junqi Ge, Lewei Lu, Yu Qiao, Jifeng Dai
ITINERA: Integrating Spatial Optimization with Large Language Models for Open-domain Urban Itinerary Planning
Yihong Tang*, Zhaokai Wang*, Ao Qu*, Yihao Yan*, Zhaofeng Wu, Dingyi Zhuang, Jushi Kai, Kebing Hou, Xiaotong Guo, Jinhua Zhao, Zhan Zhao, Wei Ma
EMNLP 2024
Synergizing Spatial Optimization with Large Language Models for Open-Domain Urban Itinerary Planning Yihong Tang*, Zhaokai Wang*, Ao Qu*, Yihao Yan*, Kebing Hou, Dingyi Zhuang, Xiaotong Guo, Jinhua Zhao, Zhan Zhao, Wei Ma
KDD UrbComp 2024 (Best Paper Award)
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft
Hao Li*, Xue Yang*, Zhaokai Wang*, Xizhou Zhu, Jie Zhou, Yu Qiao, Xiaogang Wang, Hongsheng Li, Lewei Lu, Jifeng Dai
CVPR 2024
Video Background Music Generation: Dataset, Method and Evaluation Le Zhuo*, Zhaokai Wang*, Baisen Wang*, Yue Liao, Chenxi Bao, Stanley Peng, Songhao Han, Aixi Zhang, Fei Fang, Si Liu
ICCV 2023
Video Background Music Generation with Controllable Music Transformer
Shangzhe Di, Zeren Jiang, Si Liu, Zhaokai Wang, Leyan Zhu, Zexin He, Hongming Liu, Shuicheng Yan
ACM MM 2021 (Best Paper Award)
Confidence-aware Non-repetitive Multimodal Transformers for TextCaps
Zhaokai Wang, Renda Bao, Qi Wu, Si Liu
AAAI 2021
Conference Reviewer:
Teaching Assistant: