Selected Publications

Updated by Yanwei Li on March 29, 2024
News: We release the paper, code, and models for Mini-Gemini.
News: LLaMA-VID is accepted by ECCV 2024, Milan.
News: DQTrack is accepted by ICCV 2023, Paris.
News: We release the code for the GPT4Tools project.
News: UVTR is accepted by NeurIPS 2022, New Orleans.

Preprints

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Yanwei Li*, Yuechen Zhang*, Chengyao Wang*, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya Jia
arXiv Preprint, 2024
[paper] [code] [project]
LLaVA-OneVision: Easy Visual Task Transfer
Bo Li, Yuanhan Zhang, Dong Guo, Renrui Zhang, Feng Li, Hao Zhang, Kaichen Zhang, Yanwei Li, Ziwei Liu, Chunyuan Li
arXiv Preprint, 2024
[paper] [code] [project]

Journal Papers

Fully Convolutional Networks for Panoptic Segmentation with Point-based Supervision
Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Yukang Chen, Lu Qi, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
[paper] [code]
Scale-aware Automatic Augmentations for Object Detection with Dynamic Training
Yukang Chen, Peizhen Zhang, Tao Kong, Yanwei Li, Xiangyu Zhang, Lu Qi, Jian Sun, Jiaya Jia
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
[paper]

Conference Papers

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Yanwei Li*, Chengyao Wang*, Jiaya Jia
European Conference on Computer Vision (ECCV), 2024
[paper] [code] [project]
LISA: Reasoning Segmentation via Large Language Model
Xin Lai, Zhuotao Tian, Yukang Chen, Yanwei Li, Yuhui Yuan, Shu Liu, Jiaya Jia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[paper] [code]
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
Rui Yang*, Lin Song*, Yanwei Li, Sijie Zhao, Yixiao Ge, Xiu Li, Ying Shan
Advances in Neural Information Processing Systems (NeurIPS), 2023
[paper] [code]
End-to-end 3D Tracking with Decoupled Queries
Yanwei Li, Zhiding Yu, Jonah Philion, Animashree Anandkumar, Sanja Fidler, Jiaya Jia, Jose Alvarez
International Conference on Computer Vision (ICCV), 2023
[paper] [code]
Unifying Voxel-based Representation with Transformer for 3D Object Detection
Yanwei Li, Yilun Chen, Xiaojuan Qi, Zeming Li, Jian Sun, Jiaya Jia
Advances in Neural Information Processing Systems (NeurIPS), 2022
[paper] [code]
Voxel Field Fusion for 3D Object Detection
Yanwei Li, Xiaojuan Qi, Yukang Chen, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[paper] [code]
Focal Sparse Convolutional Networks for 3D Object Detection
Yukang Chen, Yanwei Li, Xiangyu Zhang, Jian Sun, Jiaya Jia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 (Oral)
[paper] [code]
Fully Convolutional Networks for Panoptic Segmentation
Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 (Oral)
[paper] [code] [slides]
Scale-aware Automatic Augmentation for Object Detection
Yukang Chen*, Yanwei Li*, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
[paper] [code]
Fine-Grained Dynamic Head for Object Detection
Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng
Advances in Neural Information Processing Systems (NeurIPS), 2020
[paper] [code]
Rethinking Learnable Tree Filter for Generic Feature Transform
Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Xiangyu Zhang, Hongbin Sun, Jian Sun, Nanning Zheng
Advances in Neural Information Processing Systems (NeurIPS), 2020
[paper] [code]
Learning Dynamic Routing for Semantic Segmentation
Yanwei Li, Lin Song, Yukang Chen, Zeming Li, Xiangyu Zhang, Xingang Wang, Jian Sun
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 (Oral)
[paper] [code] [video] [slides]
Learnable Tree Filter for Structure-preserving Feature Transform
Lin Song*, Yanwei Li*, Zeming Li, Gang Yu, Hongbin Sun, Jian Sun, Nanning Zheng
Advances in Neural Information Processing Systems (NeurIPS), 2019
[paper] [code]
Attention-guided Unified Network for Panoptic Segmentation
Yanwei Li, Xinze Chen, Zheng Zhu, Lingxi Xie, Guan Huang, Dalong Du, Xingang Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
[paper]
Identity-Enhanced Network for Facial Expression Recognition
Yanwei Li, Xingang Wang, Shilei Zhang, Lingxi Xie, Wenqi Wu, Hongyuan Yu, Zheng Zhu
Asian Conference on Computer Vision (ACCV), 2018
[paper]

Workshop & Competition

Diversified Dynamic Routing for Vision Tasks
Botos Csaba, Adel Bibi, Yanwei Li, Philip Torr, Ser-Nam Lim
European Conference on Computer Vision (ECCV) Workshop, 2022
[paper]
Microsoft COCO Panoptic Challenge
Yanwei Li*, Naiyu Gao*, Chaoxu Guo, Xinze Chen, Qian Zhang, Guan Huang, Xin Zhao, Kaiqi Huang, Dalong Du, Chang Huang
2nd place, oral presentation at the ECCV COCO Workshop, 2018
[slides]
State-aware Re-identification Feature for Multi-target Multi-camera Tracking
Peng Li*, Jiabin Zhang*, Zheng Zhu*, Yanwei Li, Lu Jiang, Guan Huang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshop, 2019
[paper]