Qingkai Fang | 房庆凯
I am a fourth-year Ph.D. student at the Institute of Computing Technology, Chinese Academy of Sciences (ICT/CAS), where I am fortunate to be advised by Prof. Yang Feng in the ICT Natural Language Processing (ICTNLP) group.
Before that, I received my B.E. degree in Computer Science and Technology from Beijing University of Posts and Telecommunications (BUPT) in Jun. 2021.
Email: fangqingkai21b [at] ict.ac.cn / poeroz1204 [at] gmail.com
Google Scholar / ACL Anthology / DBLP / GitHub / CV
I expect to graduate with a Ph.D. in June 2026 and will be seeking job opportunities in industry. If you are interested, please feel free to reach out.
Research
My research interests lie mainly in natural language processing and multimodal learning. In particular, I am interested in:
- Large language models (LLMs) and multimodal LLMs, including speech-language models and vision-language models.
- (Simultaneous) speech-to-text and speech-to-speech translation.
- Non-autoregressive sequence generation and streaming sequence generation.
News
[2025/01] Two papers were accepted to ICLR 2025!
[2024/11] I received the National Scholarship (30,000 RMB).
[2024/09] Our speech-language model LLaMA-Omni is released! It is a powerful speech interaction model built upon Llama-3.1-8B-Instruct that achieves low-latency, high-quality speech interaction. Check out our paper, code, and model!
[2024/05] Four papers were accepted to ACL 2024 (3 main conference + 1 Findings)!
[2023/10] One paper was accepted to the EMNLP 2023 main conference!
[2023/09] One paper was accepted to NeurIPS 2023!
[2023/06] Our multilingual LLM BayLing (百聆) is released! BayLing is an instruction-following LLM with advanced language alignment and multi-turn interaction capability. Read our paper and try our online demo!
[2023/05] Three papers were accepted to the ACL 2023 main conference!
[2022/10] One paper was accepted to the EMNLP 2022 main conference!
[2022/07] We won first place in the Chinese-Thai track and second place in the Mongolian-Chinese track at CCMT 2022!
[2022/02] Two papers were accepted to the ACL 2022 main conference!
Preprints
BayLing 2: A Multilingual Large Language Model with Efficient Language Alignment
Shaolei Zhang, Kehao Zhang, Qingkai Fang, Shoutao Guo, Yan Zhou, Xiaodong Liu, Yang Feng
Paper / Code / Demo
BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models
Shaolei Zhang, Qingkai Fang, Zhuocheng Zhang, Zhengrui Ma, Yan Zhou, Langlin Huang, Mengyu Bu, Shangtong Gui, Yunji Chen, Xilin Chen, Yang Feng
Paper / Code / Demo
Publications
2025
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
Qingkai Fang, Shoutao Guo, Yan Zhou, Zhengrui Ma, Shaolei Zhang, Yang Feng
ICLR 2025
Paper / Code / Model
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token
Shaolei Zhang, Qingkai Fang, Zhe Yang, Yang Feng
ICLR 2025
Paper / Code / Model
2024
Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?
Qingkai Fang, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng
Proceedings of ACL 2024 (CCF-A)
Paper / Code / Demo
CTC-based Non-autoregressive Textless Speech-to-Speech Translation
Qingkai Fang, Zhengrui Ma, Yan Zhou, Min Zhang, Yang Feng
Findings of ACL 2024
Paper / Code
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning
Shaolei Zhang, Qingkai Fang, Shoutao Guo, Zhengrui Ma, Min Zhang, Yang Feng
Proceedings of ACL 2024 (CCF-A)
Paper / Code / Demo
A Non-autoregressive Generation Framework for Simultaneous Speech-to-x Translation
Zhengrui Ma, Qingkai Fang, Shaolei Zhang, Shoutao Guo, Yang Feng, Min Zhang
Proceedings of ACL 2024 (CCF-A)
Paper / Code / Demo
2023
DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation
Qingkai Fang, Yan Zhou, Yang Feng
NeurIPS 2023 (CCF-A)
Paper / Code / Demo
Understanding and Bridging the Modality Gap for Speech Translation
Qingkai Fang, Yang Feng
Proceedings of ACL 2023 (CCF-A)
Paper / Code
Back Translation for Speech-to-text Translation Without Transcripts
Qingkai Fang, Yang Feng
Proceedings of ACL 2023 (CCF-A)
Paper / Code
CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation
Yan Zhou, Qingkai Fang, Yang Feng
Proceedings of ACL 2023 (CCF-A)
Paper / Code
Bridging the Gap between Synthetic and Authentic Images for Multimodal Machine Translation
Wenyu Guo, Qingkai Fang, Dong Yu, Yang Feng
Proceedings of EMNLP 2023 (CCF-B)
Paper / Code
2022
STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation
Qingkai Fang, Rong Ye, Lei Li, Yang Feng, Mingxuan Wang
Proceedings of ACL 2022 (CCF-A)
Paper / Code
Neural Machine Translation with Phrase-Level Universal Visual Representations
Qingkai Fang, Yang Feng
Proceedings of ACL 2022 (CCF-A)
Paper / Code
Low-resource Neural Machine Translation with Cross-modal Alignment
Zhe Yang, Qingkai Fang, Yang Feng
Proceedings of EMNLP 2022 (CCF-B)
Paper / Code
Awards
National Scholarship, at ICT/CAS, Nov. 2024
ICT's Special Scholarship (highest award at ICT/CAS), at ICT/CAS, Jan. 2024
First Academic Scholarship, at ICT/CAS, Sep. 2023 and Sep. 2024
Merit Student, at ICT/CAS, May 2023 and May 2024
Outstanding Graduate of Beijing, at BUPT, Jun. 2021
CCF Elite Collegiate Award, Aug. 2020
National Scholarship (Top 1%), at BUPT, Dec. 2019
Silver Medal, ACM-ICPC Asia Regional Contest, Shenyang Site, Oct. 2018
Silver Medal, China Collegiate Programming Contest (CCPC), Guilin Site, Oct. 2018
Bronze Medal, National Olympiad in Informatics (NOI), Jul. 2016
Services
Reviewer: ACL 2023–2024, EMNLP 2021–2024, TALLIP