Recent Highlights

  • 2026/02 Released Seed 2.0, a worldwide top-ranked unified model for agent/language/vision.
  • 2025/12 Released Seed 1.8, a worldwide top-ranked unified model for agent/language/vision.
  • 2025/07 Released paper, code, and data of DenseWorld-1M for VLM training.
  • 2025/05 Released Seed 1.5-VL, a worldwide top-ranked VLM for language/vision.
  • 2024/08 Released LLaVA-OneVision, a top VLM for multi-image/video with 4.7K+ GitHub stars.
  • 2024/03 Released Mini-Gemini, a top VLM with 3.3K+ GitHub stars.

Selected Projects

  • Seed2.0
    Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity
    Core Contributor in Seed 2.0 VL Team
    Preprint, 2026
  • Seed1.8
    Seed1.8 Model Card: Towards Generalized Real-World Agency
    Core Contributor in Seed 1.8 VL Team
    arXiv Preprint, 2026
  • Seed1.5-VL
    Seed1.5-VL Technical Report
    Core Contributor in Seed 1.5-VL Team
    arXiv Preprint, 2025
  • DenseWorld-1M
    DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World
    Xiangtai Li*, Tao Zhang*, Yanwei Li*, Haobo Yuan, Shihao Chen, Yikang Zhou, et al.
    arXiv Preprint, 2025

Selected Journal

  • Mini-Gemini
    Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
    Yanwei Li*, Yuechen Zhang*, Chengyao Wang*, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya Jia
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
  • LLaVA-OneVision
    LLaVA-OneVision: Easy Visual Task Transfer
    Bo Li, Yuanhan Zhang, Dong Guo, Renrui Zhang, Feng Li, Hao Zhang, Kaichen Zhang, Yanwei Li, Ziwei Liu, Chunyuan Li
    Transactions on Machine Learning Research (TMLR), 2025
  • PanopticFCN
    Fully Convolutional Networks for Panoptic Segmentation with Point-based Supervision
    Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Yukang Chen, Lu Qi, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
  • SA-AutoAug
    Scale-aware Automatic Augmentations for Object Detection with Dynamic Training
    Yukang Chen, Peizhen Zhang, Tao Kong, Yanwei Li, Xiangyu Zhang, Lu Qi, Jian Sun, Jiaya Jia
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Selected Conference

  • LLaMA-VID
    LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
    Yanwei Li*, Chengyao Wang*, Jiaya Jia
    European Conference on Computer Vision (ECCV), 2024 🏆 Top-10 Influential
  • LISA
    LISA: Reasoning Segmentation via Large Language Model
    Xin Lai, Zhuotao Tian, Yukang Chen, Yanwei Li, Yuhui Yuan, Shu Liu, Jiaya Jia
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 Oral 🏆 Top-10 Influential
  • GPT4Tools
    GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction
    Rui Yang*, Lin Song*, Yanwei Li, Sijie Zhao, Yixiao Ge, Xiu Li, Ying Shan
    Advances in Neural Information Processing Systems (NeurIPS), 2023
  • DQTrack
    End-to-end 3D Tracking with Decoupled Queries
    Yanwei Li, Zhiding Yu, Jonah Philion, Animashree Anandkumar, Sanja Fidler, Jiaya Jia, Jose Alvarez
    International Conference on Computer Vision (ICCV), 2023
  • UVTR
    Unifying Voxel-based Representation with Transformer for 3D Object Detection
    Yanwei Li, Yilun Chen, Xiaojuan Qi, Zeming Li, Jian Sun, Jiaya Jia
    Advances in Neural Information Processing Systems (NeurIPS), 2022
  • VFF
    Voxel Field Fusion for 3D Object Detection
    Yanwei Li, Xiaojuan Qi, Yukang Chen, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  • FocalsConv
    Focal Sparse Convolutional Networks for 3D Object Detection
    Yukang Chen, Yanwei Li, Xiangyu Zhang, Jian Sun, Jiaya Jia
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022 Oral
  • PanopticFCN
    Fully Convolutional Networks for Panoptic Segmentation
    Yanwei Li, Hengshuang Zhao, Xiaojuan Qi, Liwei Wang, Zeming Li, Jian Sun, Jiaya Jia
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 Oral
  • SA-AutoAug
    Scale-aware Automatic Augmentation for Object Detection
    Yukang Chen*, Yanwei Li*, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021
  • DynamicHead
    Fine-Grained Dynamic Head for Object Detection
    Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Hongbin Sun, Jian Sun, Nanning Zheng
    Advances in Neural Information Processing Systems (NeurIPS), 2020
  • LearnableTreeFilter
    Rethinking Learnable Tree Filter for Generic Feature Transform
    Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Xiangyu Zhang, Hongbin Sun, Jian Sun, Nanning Zheng
    Advances in Neural Information Processing Systems (NeurIPS), 2020
  • DynamicRouting
    Learning Dynamic Routing for Semantic Segmentation
    Yanwei Li, Lin Song, Yukang Chen, Zeming Li, Xiangyu Zhang, Xingang Wang, Jian Sun
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 Oral
  • TreeFilter
    Learnable Tree Filter for Structure-preserving Feature Transform
    Lin Song*, Yanwei Li*, Zeming Li, Gang Yu, Hongbin Sun, Jian Sun, Nanning Zheng
    Advances in Neural Information Processing Systems (NeurIPS), 2019
  • AUNet
    Attention-guided Unified Network for Panoptic Segmentation
    Yanwei Li, Xinze Chen, Zheng Zhu, Lingxi Xie, Guan Huang, Dalong Du, Xingang Wang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Workshop & Competition

  • DiversifiedDynRouting
    Diversified Dynamic Routing for Vision Tasks
    Botos Csaba, Adel Bibi, Yanwei Li, Philip Torr, Ser-Nam Lim
    European Conference on Computer Vision (ECCV) Workshop, 2022
  • COCO2018
    Microsoft COCO Panoptic Challenge
    Yanwei Li*, Naiyu Gao*, Chaoxu Guo, Xinze Chen, Qian Zhang, Guan Huang, Xin Zhao, Kaiqi Huang, Dalong Du, Chang Huang
    2nd Place, European Conference on Computer Vision (ECCV) Workshop, 2018