I am Yiwei Zhang, a senior undergraduate researcher in the Department of Computer Science and Technology, Tsinghua University, working with Dr. Chuang Gan on multimodal machine learning and computer vision.
Here is my Curriculum Vitae.
In general, my research interests lie in Machine Learning, in particular multimodal machine learning and its applications.
From a theoretical perspective, I am interested in understanding the computational and statistical principles of multimodal learning and in building systems that can process and relate information from multiple modalities.
From an application perspective, I apply these principles to problems in Computer Vision, Natural Language Processing, and Speech, such as multimodal language modeling, multimodal learning from videos, and audio-visual embodied indoor navigation.
Watch, Reason and Code: Learning to Represent Videos Using Program.
Xuguang Duan, Qi Wu, Chuang Gan, Yiwei Zhang, Wenbing Huang, Anton van den Hengel, and Wenwu Zhu.
ACM International Conference on Multimedia (ACM MM), October 2019
Look, Listen, and Act: Towards Audio-Visual Embodied Navigation
Chuang Gan*, Yiwei Zhang*, Jiajun Wu, Boqing Gong, and Joshua Tenenbaum.
Factorized Multimodal Transformer for Multimodal Sequential Learning
Amir Zadeh, Chengfeng Mao, Kelly Shi, Yiwei Zhang, Paul Pu Liang, Soujanya Poria, and Louis-Philippe Morency.
Under review at Information Fusion
Language Technologies Institute, Carnegie Mellon University
Jul. 2019 - Dec. 2019
- Advised by Prof. Louis-Philippe Morency
- Research on multimodal machine learning and Transformer models.
Department of Computer Science and Technology, Tsinghua University
Jul. 2018 - Jun. 2019
- Advised by Dr. Chuang Gan
- Research on multimodal machine learning and computer vision.