1 . Hao Shi, Weili Song, Xinting Zhang, Jiahe Shi, Cuicui Luo, Xiang Ao, Hamid Arian, Luis Angel Seco. AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors. 39th Annual AAAI Conference on Artificial Intelligence (AAAI), pp. 25860–25868, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
2 . Zhiguang Lu, Qianqian Xu, Shilong Bao, Zhiyong Yang, Qingming Huang. Bidirectional Logits Tree: Pursuing Granularity Reconcilement in Fine-Grained Classification. AAAI Conference on Artificial Intelligence (AAAI), pp. 19189–19197, Philadelphia, PA, USA, Feb. 25-Mar. 4, 2025.
3 . Hanyu Zhang, Xiting Wang, Chengao Li, Xiang Ao, Qing He. Controlling Large Language Models Through Concept Activation Vectors. 39th Annual AAAI Conference on Artificial Intelligence (AAAI), pp. 25851-25859, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
4 . Guanqi Ding, Chengyu Yang, Shuhui Wang, Xincheng Li, Jinzhe Zhang, Xin Jin, Qingming Huang. Dis²Booth: Learning Image Distribution with Disentangled Features for Text-to-Image Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 2744–2752, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
5 . Shuo Cai, Xinzhe Han, Shuhui Wang. Divide-and-Conquer: Tree-structured Strategy with Answer Distribution Estimator for Goal-Oriented Visual Dialogue. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 1917-1925, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
6 . Qi Yuan, Yang Liu, Yateng Tang, Xinhuan Chen, Xuehao Zheng, Qing He, Xiang Ao. Dynamic Graph Learning with Static Relations for Credit Risk Assessment. 39th Annual AAAI Conference on Artificial Intelligence (AAAI), pp. 13133-13141, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
7 . Yuchen Sun, Qianqian Xu, Zitai Wang, Zhiyong Yang, Junwei He. EDGE: Unknown-aware Multi-label Learning by Energy Distribution Gap Expansion. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 12613–12621, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
8 . Shoutao Guo, Shaolei Zhang, Zhengrui Ma, Yang Feng. Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation. Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI), pp. 23969–23977, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
9 . Yiran Qiao, Ningtao Wang, Yuncong Gao, Yang Yang, Xing Fu, Weiqiang Wang, Xiang Ao. Online Fraud Detection via Test-time Retrieval-based Representation Enrichment. 39th Annual AAAI Conference on Artificial Intelligence (AAAI), pp. 12470-12478, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
10 . Yunbin Tu, Liang Li, Li Su, Qingming Huang. Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning. 39th Annual AAAI Conference on Artificial Intelligence (AAAI), pp. 7464-7472, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
11 . Xingyu Lyu, Qianqian Xu, Zhiyong Yang, Shaojie Lyu, Qingming Huang. SSE-SAM: Balancing Head and Tail Classes Gradually through Stage-Wise SAM. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 19278–19286, Philadelphia, PA, USA, Feb. 25–Mar. 4, 2025.
12 . Chengao Li, Hanyu Zhang, Yunkun Xu, Hongyan Xue, Xiang Ao, Qing He. Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL), Vienna, Austria, Jul. 27-Aug. 1, 2025.
13 . Qingkai Fang, Yan Zhou, Shoutao Guo, Shaolei Zhang, Yang Feng. LLaMA-Omni 2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL), Vienna, Austria, Jul. 27-Aug. 1, 2025.
14 . Qiang Ding, Lvzhou Luo, Yixuan Cao, Ping Luo. Attention with Dependency Parsing Augmentation for Fine-Grained Attribution. Findings of the Association for Computational Linguistics: ACL (ACL Findings), Vienna, Austria, Jul. 27-Aug. 1, 2025.
15 . Andong Chen, Kehai Chen, Yang Xiang, Xuefeng Bai, Muyun Yang, Yang Feng, Tiejun Zhao, Min Zhang. LLM-based Translation Inference with Iterative Bilingual Understanding. Findings of the Association for Computational Linguistics: ACL (ACL Findings), Vienna, Austria, Jul. 27-Aug. 1, 2025.
16 . Bo Lv, Nayu Liu, Yang Shen, Xin Liu, Ping Luo, Yue Yu. Whether LLMs Know If They Know: Identifying Knowledge Boundaries via Debiased Historical In-Context Learning. Findings of the Association for Computational Linguistics: ACL (ACL Findings), Vienna, Austria, Jul. 27-Aug. 1, 2025.
17 . Zhuocheng Zhang, Yang Feng, Min Zhang. FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL System Demonstrations), Vienna, Austria, Jul. 27-Aug. 1, 2025.
18 . Gaoxiang Cong, Liang Li, Jiadong Pan, Zhedong Zhang, Amin Beheshti, Anton van den Hengel, Yuankai Qi, Qingming Huang. FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing. ACM International Conference on Multimedia (ACM MM), Dublin, Ireland, Oct. 27-31, 2025.
19 . Jiadong Pan, Liang Li, Hongcheng Gao, Zhengjun Zha, Qingming Huang, Jiebo Luo. SafeCFG: Controlling Harmful Features with Dynamic Safe Guidance for Safe Generation. ACM International Conference on Multimedia (ACM MM), Dublin, Ireland, Oct. 27-31, 2025.
20 . Qiyang Wan, Ruiping Wang, Chengzhi Gao, Xilin Chen. Catch Your Concepts: A Flexible ConceptLocator for Interpretable Visual Recognition. 36th British Machine Vision Conference (BMVC), Sheffield, UK, Nov. 24-27, 2025.
21 . Tianyue Wang, Shuang Yang, Shiguang Shan, Xilin Chen. GLip: A Global-Local Integrated Progressive Framework for Robust Visual Speech Recognition. 36th British Machine Vision Conference (BMVC), Sheffield, UK, Nov. 24-27, 2025.
22 . Yujie Zhao, Jiabei Zeng, Shiguang Shan. Pose-Robust Calibration Strategy for Point-of-Gaze Estimation on Mobile Phones. 36th British Machine Vision Conference (BMVC), Sheffield, UK, Nov. 24-27, 2025.
23 . Kejin Liu, Junhong Lian, Xiang Ao, Ningtao Wang, Xing Fu, Yu Cheng, Weiqiang Wang, Xinyu Liu. Improved Personalized Headline Generation via Denoising Fake Interests from Implicit Feedback. 34th ACM International Conference on Information and Knowledge Management (CIKM), pp. 1872-1881, Seoul, Korea, Nov. 10-14, 2025.
24 . Gaoxiang Cong, Jiadong Pan, Liang Li, Yuankai Qi, Yuxin Peng, Anton van den Hengel, Jian Yang, Qingming Huang. EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15863-15873, Nashville, TN, USA, Jun. 11–15, 2025.
25 . Zonghui Guo, Yingjie Liu, Jie Zhang, Haiyong Zheng, Shiguang Shan. Face Forgery Video Detection via Temporal Forgery Cue Unraveling. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7396-7405, Nashville, TN, USA, Jun. 10–17, 2025.
26 . Ziyi Bai, Hanxuan Li, Bin Fu, Chuyan Xiong, Ruiping Wang, Xilin Chen. R2C: Mapping Room to Chessboard to Unlock LLM As Low-Level Action Planner. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 19456–19466, Nashville, TN, USA, Jun. 10–17, 2025.
27 . Zhen Yang, Zhuo Tao, Qi Chen, Yuankai Qi, Liang Li, Anton van den Hengel, Qingming Huang. Separation of powers: On segregating knowledge from observation in LLM-enabled knowledge-based visual question answering. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 24753-24762, Nashville, TN, USA, Jun. 10–17, 2025.
28 . Yiheng Li, Ruibing Hou, Hong Chang, Shiguang Shan, Xilin Chen. UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 27805-27815, Nashville, TN, USA, Jun. 10–17, 2025.
29 . Yue Wu, Zhaobo Qi, Junshu Sun, Yaowei Wang, Qingming Huang, Shuhui Wang. Video Language Model Pretraining with Spatio-temporal Masking. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8557-8567, Nashville, TN, USA, Jun. 10–17, 2025.
30 . Yang Liu, Zikun Zhang, Xiang Ao, Lingxiang Tian, Qing He. OFTEN: Graph Invariant Learning via Soft Environment Inference. 30th International Conference on Database Systems for Advanced Applications (DASFAA), Singapore, Singapore, May 26-29, 2025.
31 . Yifan Liu, Yixuan Cao, Ping Luo. Inspire Me with Your Questions: Repurposing Historical Questions for New Documents. International Conference on Database and Expert Systems Applications (DEXA), Bangkok, Thailand, Aug. 25-27, 2025.
32 . Mengyu Bu, Shaolei Zhang, Zhongjun He, Hua Wu, Yang Feng. AlignX: Advancing Multilingual Large Language Models with Multilingual Representation Alignment. The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), Suzhou, China, Nov. 4-9, 2025.
33 . Yujie Wang, Yunwei Zhao, Jing Yang, Han Han, Shiguang Shan, Jie Zhang. Evaluating Cognitive-Behavioral Fixation via Multimodal User Viewing Patterns on Social Media. The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), Suzhou, China, Nov. 4-9, 2025.
34 . Kangyu Qiao, Shaolei Zhang, Yang Feng. IG-Pruning: Input-Guided Block Pruning for Large Language Models. The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), Suzhou, China, Nov. 4-9, 2025.
35 . Lvzhou Luo, Yixuan Cao, Ping Luo. AttnComp: Attention-Guided Adaptive Context Compression for Retrieval-Augmented Generation. Findings of the Association for Computational Linguistics: EMNLP (EMNLP Findings), Suzhou, China, Nov. 4-9, 2025.
36 . Dan Han, Mingjie He, Jie Zhang, Shiguang Shan. Dual-Branch Partial Annotation Learning for Facial Attributes Recognition. IEEE 19th International Conference on Automatic Face and Gesture Recognition (FG), Tampa/Clearwater, FL, USA, May 26-30, 2025.
37 . Yi Qiao, Yang Liu, Qing He, Xiang Ao. Domain-aware Node Representation Learning for Graph Out-of-Distribution Generalization. 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, Apr. 6-11, 2025.
38 . Xinkuan Qiu, Meina Kan, Yongbin Zhou, Shiguang Shan. Benchmarking Multimodal Large Language Models Against Image Corruptions. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
39 . Feixiang Wang, Shuang Yang, Shiguang Shan, Xilin Chen. CogCM: Cognition-Inspired Contextual Modeling for Audio-Visual Speech Enhancement. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
40 . Yufei Cai, Hu Han, Yuxiang Wei, Shiguang Shan, Xilin Chen. EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
41 . Zongyao Xue, Meina Kan, Shiguang Shan, Xilin Chen. Feature Decomposition-Recomposition in Large Vision-Language Model for Few-Shot Class-Incremental Learning. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
42 . Sixian Zhang, Xinyao Yu, Xinhang Song, Yiyao Wang, Shuqiang Jiang. Function-centric Bayesian Network for Zero-Shot Object Goal Navigation. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
43 . Mengdi Liu, Zhangyang Gao, Hong Chang, Ziqing Li, Shiguang Shan, Xilin Chen. G2PDiffusion: Cross-species Genotype-to-Phenotype Prediction via Evolutionary Diffusion. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
44 . Jiahe Zhao, Ruibing Hou, Zejie Tian, Hong Chang, Shiguang Shan. HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
45 . Xiaorong Qin, Xinhang Song, Sixian Zhang, Xinyao Yu, Xinmiao Zhang, Shuqiang Jiang. Learning on the Go: A Meta-learning Object Navigation Model. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
46 . Zhuo Li, Mingshuang Luo, Ruibing Hou, Xin Zhao, Hao Liu, Hong Chang, Zimo Liu, Chen Li. Morph: A Motion-free Physics Optimization Framework for Human Motion Generation. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
47 . Zhaoxin Yuan, Shuang Yang, Shiguang Shan, Xilin Chen. Not Only Vision: Evolve Visual Speech Recognition via Peripheral Information. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
48 . Mingquan Zhou, Chen He, Ruiping Wang, Xilin Chen. OV3D-CG: Open-vocabulary 3D Instance Segmentation with Contextual Guidance. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
49 . Yuyi Liu, Xinhang Song, Tianliang Qi, Shuqiang Jiang. Trial-Oriented Visual Rearrangement. IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, Oct. 19-23, 2025.
50 . Senhao Liu, Zhiyu Guo, Zhiyuan Ji, Yueguo Chen, Yateng Tang, Yunhai Wang, Xuehao Zheng, Xiang Ao. Beyond the Pre-Service Horizon: Infusing In-Service Behavior for Improved Financial Risk Forecasting. IEEE International Conference on Data Mining (ICDM), Washington, DC, USA, Dec. 12-15, 2025.
51 . Xinxin Li, Yang Liu, Weigao Wen, Siyong Xu, Qing He, Xiang Ao. Dilution of Unreliable Information: Learning in Graph with Noisy Structures and Absent Attributes. IEEE International Conference on Data Mining (ICDM), Washington, DC, USA, Dec. 12-15, 2025.
52 . Fanglue Zhang, Shufan Shen, Chao Bi, Li Su, Qingming Huang, Shuhui Wang. SVDLoRA: Data-Driven Low-Rank Adaptation via Spectral Decomposition. IEEE International Conference on Data Mining Workshops (ICDMW), Washington, DC, USA, Dec. 12-15, 2025.
53 . Yifeng Xu, Zhenliang He, Shiguang Shan, Xilin Chen. CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation. International Conference on Learning Representations (ICLR), pp. 5844-5866, Singapore, Singapore, Apr. 24-28, 2025.
54 . Yifei Xing, Xiangyuan Lan, Ruiping Wang, Dongmei Jiang, Wenjun Huang, Qingfang Zheng, Yaowei Wang. EMMA: Empowering Multi-modal Mamba with Structural and Hierarchical Alignment. 13th International Conference on Learning Representations (ICLR), pp. 33369-33397, Singapore, Singapore, Apr. 24-28, 2025.
55 . Shufan Shen, Zhaobo Qi, Junshu Sun, Qingming Huang, Qi Tian, Shuhui Wang. Enhancing Pre-trained Representation Classifiability can Boost its Interpretability. The Thirteenth International Conference on Learning Representations (ICLR), pp. 38903-38927, Singapore, Singapore, Apr. 24-28, 2025.
56 . Yue Wu, Zhaobo Qi, Yiling Wu, Junshu Sun, Yaowei Wang, Shuhui Wang. Learning fine-grained representations through textual token disentanglement in composed video retrieval. The Thirteenth International Conference on Learning Representations (ICLR), pp. 91981-92003, Singapore, Singapore, Apr. 24-28, 2025.
57 . Qingkai Fang, Shoutao Guo, Yan Zhou, Zhengrui Ma, Shaolei Zhang, Yang Feng. LLaMA-Omni: Seamless Speech Interaction with Large Language Models. The 13th International Conference on Learning Representations (ICLR), pp. 69565-69582, Singapore, Singapore, Apr. 24-28, 2025.
58 . Shaolei Zhang, Qingkai Fang, Zhe Yang, Yang Feng. LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token. The 13th International Conference on Learning Representations (ICLR), pp. 86455-86480, Singapore, Singapore, Apr. 24-28, 2025.
59 . Wen Wang, Ruibing Hou, Hong Chang, Shiguang Shan, Xilin Chen. MATS: An Audio Language Model under Text-only Supervision. International Conference on Machine Learning (ICML), Vancouver, BC, Canada, Jul. 13-19, 2025.
60 . Cong Hua, Qianqian Xu, Zhiyong Yang, Zitai Wang, Shilong Bao, Qingming Huang. OpenworldAUC: Towards Unified Evaluation and Optimization for Open-world Prompt Tuning. International Conference on Machine Learning (ICML), Vancouver, BC, Canada, Jul. 13-19, 2025.
61 . Zhengrui Ma, Yang Feng, Min Zhang. Overcoming Non-monotonicity in Transducer-based Streaming Generation. The Forty-Second International Conference on Machine Learning (ICML), Vancouver, BC, Canada, Jul. 13-19, 2025.
62 . Hongshuo Chen, Yixuan Cao, Shiwei Ye, Ping Luo. Tagging Generatively: Dynamic and Open-Ended Image Tagging with MLLM. International Joint Conference on Neural Networks (IJCNN), Rome, Italy, Jun. 30-Jul. 5, 2025.
63 . Senwei Xie, Hongyu Wang, Zhanqi Xiao, Ruiping Wang, Xilin Chen. Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 14923-14930, Hangzhou, China, Oct. 19-25, 2025.
64 . Hao Liang, Meina Kan, Shiguang Shan, Xilin Chen. Task-Oriented Token Pruning for Efficient Object Detection and Segmentation. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 7826-7833, Hangzhou, China, Oct. 19-25, 2025.
65 . Yixuan Cao, Juyao Liu, Haodong Wang, Jian Wang, Kun Wan, Gang Xiao, Ping Luo. ConciseExplain: Reducing Redundancy and Spuriousness in Persuasive Recommendation Explanation. Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 4284-4295, Toronto, ON, Canada, Aug. 3-7, 2025.
66 . Zhiyu Guo, Yang Liu, Xiang Ao, Qing He. GRASP: Differentially Private Graph Reconstruction Defense with Structured Perturbation. 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pp. 767-777, Toronto, ON, Canada, Aug. 3-7, 2025.
67 . Langlin Huang, Mengyu Bu, Yang Feng. MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation. 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pp. 1011–1028, Albuquerque, NM, USA, Apr. 29-May 4, 2025.
68 . Yongxin He, Shan Zhang, Yixuan Cao, Lei Ma, Ping Luo. DETree: DEtecting Human-AI Collaborative Texts via Tree-Structured Hierarchical Representation Learning. Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
69 . Jinzhe Liu, Junshu Sun, Shufan Shen, Chenxue Yang, Shuhui Wang. Edit Less, Achieve More: Dynamic Sparse Neuron Masking for Lifelong Knowledge Editing in LLMs. Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
70 . Zhengrui Ma, Yang Feng, Chenze Shao, Fandong Meng, Jie Zhou, Min Zhang. Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space. Thirty-Ninth Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
71 . Shoutao Guo, Shaolei Zhang, Qingkai Fang, Zhengrui Ma, Min Zhang, Yang Feng. FastLongSpeech: Enhancing Large Speech-Language Models for Efficient Long-Speech Processing. Thirty-Ninth Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
72 . Junxi Chen, Liang Li, Yunbin Tu, Li Su, Zhe Xue, Qingming Huang. Generalizing Single-Frame Supervision to Event-Level Understanding for Video Anomaly Detection. Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
73 . Zaifei Yang, Hong Chang, Ruibing Hou, Shiguang Shan, Xilin Chen. KnowMol: Advancing Molecular Large Language Models with Multi-Level Chemical Knowledge. Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
74 . Boyu Han, Qianqian Xu, Shilong Bao, Zhiyong Yang, Kangli Zi, Qingming Huang. LightFair: Towards an Efficient Alternative for Fair T2I Diffusion via Debiasing Pre-trained Text Encoders. Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
75 . Mengdi Liu, Xiaoxue Cheng, Zhangyang Gao, Hong Chang, Cheng Tan, Shiguang Shan, Xilin Chen. ProtInvTree: Deliberate Protein Inverse Folding with Reward-guided Tree Search. Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
76 . Junshu Sun, Wanxing Chang, Chenxue Yang, Qingming Huang, Shuhui Wang. Relieving the Over-aggregating Effect in Graph Transformers. Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
77 . Jiachen Liang, Ruibing Hou, Minyang Hu, Hong Chang, Shiguang Shan, Xilin Chen. Revisiting Logit Distributions for Reliable Out-of-Distribution Detection. Thirty-Ninth Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
78 . Bo Lv, Chen Tang, Nayu Liu, Xin Liu, Yue Yu, Ping Luo. SpecFuse: Training-Free LLM Ensembling via Iterative Drafting, Verification, and Online Feedback. Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
79 . Yinqi Li, Jiahe Zhao, Hong Chang, Ruibing Hou, Shiguang Shan, Xilin Chen. un2CLIP: Improving CLIP's Visual Detail Capturing Ability via Inverting unCLIP. Thirty-Ninth Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
80 . Shufan Shen, Junshu Sun, Qingming Huang, Shuhui Wang. VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set. Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS), San Diego, CA, USA, Dec. 2-7, 2025.
81 . Boyuan Zhang, Zhenliang He, Meina Kan, Shiguang Shan. Precise Integral in NeRFs: Overcoming the Approximation Errors of Numerical Quadrature. IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 317-326, Tucson, AZ, USA, Feb. 26-Mar. 6, 2025.
82 . Yiran Qiao, Xiang Ao, Yang Liu, Jiarong Xu, Xiaoqian Sun, Qing He. LOGIN: A Large Language Model Consulted Graph Neural Network Training Framework. 18th ACM International Conference on Web Search and Data Mining (WSDM), pp. 232-241, Hannover, Germany, Mar. 10-14, 2025.
83 . Junhong Lian, Xiang Ao, Xinyu Liu, Yang Liu, Qing He. Panoramic Interests: Stylistic-Content Aware Personalized Headline Generation. The Web Conference (WWW), pp. 1109-1112, Sydney, NSW, Australia, Apr. 28-May 2, 2025.
84 . Yuanhao Ding, Yang Liu, Yugang Ji, Weigao Wen, Qing He, Xiang Ao. SPEAR: A Structure-Preserving Manipulation Method for Graph Backdoor Attacks. The Web Conference (WWW), pp. 1237-1247, Sydney, NSW, Australia, Apr. 28-May 2, 2025.
1 . Dongjian Yu, Weiqing Min, Xin Jin, Qian Jiang, Ying Jin, Shuqiang Jiang. Diverse and High-Quality Food Image Generation from Only Food Names. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Vol. 21, No. 5, pp. 153:1–153:22, 2025.
2 . Weiqing Min, Xingjian Hong, Yuxin Liu, Mingyu Huang, Ying Jin, Pengfei Zhou, Leyi Xu, Yilin Wang, Shuqiang Jiang, Yong Rui. Multimodal Food Learning. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Vol. 21, No. 7, pp. 196:1–196:15, July 2025.
3 . Xinda Liu, Qinyu Zhang, Weiqing Min, Guohua Geng, Shuqiang Jiang. Solutions and challenges in AI-based pest and disease recognition. Computers and Electronics in Agriculture (CEA), 238: 110775, 2025.
4 . Chengxu Liu, Weiqing Min, Jingru Song, Yancun Yang, Guorui Sheng, Tao Yao, Lili Wang, Shuqiang Jiang. Channel grouping vision transformer for lightweight fruit and vegetable recognition. Expert Systems with Applications (ESWA), 292: 128636, 2025.
5 . Zheng Zhang, Xiang Ao, Claudio J. Tessone, Gang Liu, Mingyang Zhou, Rui Mao, Hao Liao. Multiplex graph fusion network with reinforcement structure learning for fraud detection in online e-commerce platforms. Expert Systems with Applications (ESWA), 262: 125598, March 2025.
6 . Ting Yu, Binhui Ge, Shuhui Wang, Yan Yang, Qingming Huang, Jun Yu. Consistency Conditioned Memory Augmented Dynamic Diagnosis Model for Medical Visual Question Answering. IEEE Journal of Biomedical and Health Informatics (JBHI), Vol. 29, No. 2, pp. 1357–1370, 2025.
7 . Yifan Zhang, Ruiping Wang, Xilin Chen. Dynamic Behavior Cloning with Temporal Feature Prediction: Enhancing Robotic Arm Manipulation in Moving Object Tasks. IEEE Robotics and Automation Letters (RA-L), Vol. 10, No. 6, pp. 5209–5216, June 2025.
8 . Mingquan Zhou, Xiaodong Wu, Chen He, Ruiping Wang, Xi-Lin Chen. FreeMask3D: Zero-Shot Point Cloud Instance Segmentation Without 3D Training. IEEE Robotics and Automation Letters (RA-L), Vol. 10, No. 12, pp. 12301–12308, December 2025.
9 . Yin Chen, Jia Li, Shiguang Shan, Meng Wang, Richang Hong. From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos. IEEE Transactions on Affective Computing (TAFFC), Vol. 16, No. 2, pp. 624-638, 2025.
10 . Jiabei Zeng, Yujian Yuan, Lu Qu, Fei Chang, Xuran Sun, Jinqiuyu Gong, Xuling Han, Min Liu, Hang Zhao, Qiaoyun Liu, Shiguang Shan, Xilin Chen. Multi-view Facial Expressions Analysis of Autistic Children in Social Play. IEEE Transactions on Affective Computing (TAFFC), Vol. 16, No. 3, pp. 2200-2214, July-Sept. 2025.
11 . Zhihui Feng, Hao Xiong, Weiqing Min, Sujuan Hou, Huichuan Duan, Zhonghua Liu, Shuqiang Jiang. Ingredient-Guided RGB-D Fusion Network for Nutritional Assessment. IEEE Transactions on AgriFood Electronics, Vol. 3, No. 1, pp. 156-166, March-April 2025.
12 . Shoutao Guo, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng. Agent-SiMT: Agent-assisted Simultaneous Translation with Large Language Models. IEEE Transactions on Audio, Speech and Language Processing (TASLP), Vol. 33, pp. 2074-2083, 2025.
13 . Jiaxin An, Liang Cao, Yingxun Wang, Ahmer Khan Jadoon, Shuhui Wang. Adaptive Fault-Tolerant Optimized Platoon Cloud Tracking Control for Heterogeneous Vehicles via Dual Learning Mechanism. IEEE Transactions on Automation Science and Engineering (TASE), Vol. 22, pp. 4382–4393, 2025.
14 . Meina Kan, Lixuan Zhang, Hao Liang, Boyuan Zhang, Minxue Fang, Dongyang Liu, Shiguang Shan, Xilin Chen. eLabrador: A wearable navigation system for visually impaired individuals. IEEE Transactions on Automation Science and Engineering (TASE), Vol. 22, pp. 12228-12244, 2025.
15 . Jiehua Zhang, Liang Li, Chenggang Yan, Wei Ke, Yihong Gong. Monocular Depth Estimation on Adverse Weathers with Curriculum Domain Distribution Alignment. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), Vol. 35, No. 1, pp. 178-194, Jan. 2025.
16 . Ting Yu, Kunhao Fu, Shuhui Wang, Qingming Huang, Jun Yu. Prompting Video-Language Foundation Models With Domain-Specific Fine-Grained Heuristics for Video Question Answering. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), Vol. 35, No. 2, pp. 1615–1630, February 2025.
17 . Yong Li, Menglin Liu, Zhen Cui, Yi Ding, Yuan Zong, Wenming Zheng, Shiguang Shan, Cuntai Guan. Decoupled Doubly Contrastive Learning for Cross-Domain Facial Action Unit Detection. IEEE Transactions on Image Processing (TIP), Vol. 34, pp. 2067–2080, 2025.
18 . Zheng Yuan, Jie Zhang, Shiguang Shan, Xilin Chen. FullLoRA: Efficiently Boosting the Robustness of Pretrained Vision Transformers. IEEE Transactions on Image Processing (TIP), Vol. 34, pp. 4580–4590, 2025.
19 . Jie Zhang, Zhifan Wan, Lanqing Hu, Stephen Lin, Shuzhe Wu, Shiguang Shan. Collaboratively Self-supervised Video Representation Learning for Action Recognition. IEEE Transactions on Information Forensics and Security (TIFS), Vol. 20, pp. 1895-1907, 2025.
20 . Xingming Long, Jie Zhang, Shiguang Shan. Confidence Aware Learning for Reliable Face Anti-Spoofing. IEEE Transactions on Information Forensics and Security (TIFS), Vol. 20, pp. 5083–5093, 2025.
21 . Cong Zhang, Shuhui Wang, Xiaodan Li, Yao Zhu, Honggang Qi, Qingming Huang. Enhancing the Robustness of Vision-Language Foundation Models by Alignment Perturbation. IEEE Transactions on Information Forensics and Security (TIFS), Vol. 20, pp. 7091–7105, 2025.
22 . Jiahe Zhao, Ruibing Hou, Hong Chang, Xinqian Gu, Bingpeng Ma, Shiguang Shan, Xilin Chen. Clothes-changing person re-identification with feasibility-aware intermediary matching. IEEE Transactions on Multimedia (TMM), Vol. 27, pp. 3307-3319, 2025.
23 . Yinqi Li, Hong Chang, Ruibing Hou, Shiguang Shan, Xilin Chen. DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks. IEEE Transactions on Multimedia (TMM), 2025.
24 . Weiqing Min, Shuqiang Jiang, Petia Radeva, Vladimir Pavlovic, Chong-Wah Ngo, Kiyoharu Aizawa, Wanqing Li. Guest Editorial: When Multimedia Meets Food: Multimedia Computing for Food Data Analysis and Applications. IEEE Transactions on Multimedia (TMM), Vol. 27, pp. 2708–2712, 2025.
25 . Chao Bi, Shuhui Wang, Na Li, Qingming Huang. Inferential and Commonsense Visual Question Generation. IEEE Transactions on Multimedia (TMM), Vol. 27, pp. 7796–7809, 2025.
26 . Liang Li, Tongyu Lu, Yaoqi Sun, Yuhan Gao, Chenggang Yan, Zhenghui Hu, Qingming Huang. Progressive Decision Boundary Shifting for Unsupervised Domain Adaptation. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), Vol. 36, No. 1, pp. 274-285, Jan. 2025.
27 . Xiang Xiang, Jing Ma, Dongrui Wu, Zhigang Zeng, Xi-Lin Chen. Aligning Logits Generatively for Principled Black-Box Knowledge Distillation in the Wild. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 47, No. 12, pp. 11929–11945, December 2025.
28 . Liang Li, Gaoxiang Cong, Yuankai Qi, Zheng-Jun Zha, Qi Wu, Michael Sheng, Qingming Huang, Ming-Hsuan Yang. Dubbing Movies via Hierarchical Phoneme Modeling and Acoustic Diffusion Denoising. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 47, No. 11, pp. 10361-10377, 2025.
29 . Xiang Xiang, Zhuo Xu, Zihan Zhang, Zhigang Zeng, Xilin Chen. Enhanced Dual-Pattern Matching With Vision-Language Representation for Out-of-Distribution Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 47, No. 11, pp. 9673–9687, November 2025.
30 . Xingming Long, Jie Zhang, Shiguang Shan. Generalized Face Liveness Detection via De-fake Face Generator. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 47, No. 3, pp. 1818-1831, March 2025.
31 . Sixian Zhang, Xinhang Song, Xinyao Yu, Yubing Bai, Xinlong Guo, Weijie Li, Shuqiang Jiang. HOZ++: Versatile Hierarchical Object-to-Zone Graph for Object Navigation. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 47, No. 7, pp. 5958–5975, July 2025.
32 . Yong Li, Yufei Sun, Zhen Cui, Pengcheng Shen, Shiguang Shan. Instance-Consistent Fair Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 47, No. 7, pp. 5319–5335, July 2025.
33 . Tianxin Xie, Hu Han, Shiguang Shan, Xilin Chen. Natural Adversarial Mask for Face Identity Protection in Physical World. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 47, No. 3, pp. 2089-2106, March 2025.
34 . Yinqi Li, Hong Chang, Shiguang Shan, Xilin Chen. PIT: A Plug-and-Play Image Translator for Making Off-the-Shelf Models Adapt to Corruptions. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 47, No. 12, pp. 11644–11661, December 2025.
35 . Dingxi Zhang, Yu-Jie Yuan, Zhuoxun Chen, Fang-Lue Zhang, Zhenliang He, Shiguang Shan, Lin Gao. StylizedGS: Controllable Stylization for 3D Gaussian Splatting. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 47, No. 12, pp. 11961–11973, December 2025.
36 . Ziqiang Chen, Dandan Wang, Liangliang Lou, Shiqing Zhang, Xiaoming Zhao, Shuqiang Jiang, Jun Yu, Jun Xiao. Text-guided multimodal depression detection via cross-modal feature reconstruction and decomposition. Information Fusion (IF), 117: 102861, 2025.
37 . Xuhan Zhu, Yifei Xing, Ruiping Wang, Yaowei Wang, Xiangyuan Lan. Generic Scene Graph Generation Model with Hierarchical Prompt Learning. International Journal of Computer Vision (IJCV), Vol. 133, No. 10, pp. 6813-6831, June 2025.
38 . Zitai Wang, Qianqian Xu, Zhiyong Yang, Peisong Wen, Yuan He, Xiaochun Cao, Qingming Huang. Top-K Pairwise Ranking: Bridging the Gap Among Ranking-Based Measures for Multi-Label Classification. International Journal of Computer Vision (IJCV), Vol. 133, No. 1, pp. 211-253, January 2025.
39 . Yujian Yuan, Jiabei Zeng, Shiguang Shan. Exp-VQA: fine-grained facial expression analysis via visual question answering. Pattern Recognition (PR), 168: 111783, 2025.
40 . Haomiao Sun, Mingjie He, Shiguang Shan, Hu Han. Leveraging face-prior knowledge for general face representation learning. Pattern Recognition (PR), 168: 111784, 2025.
41 . Jilong Zhu, Junbao Zhuo, Shuhui Wang. PIC: Domain generalization by path information constraint. Pattern Recognition (PR), 168: 111769, 2025.
42 . Liang Shi, Jie Zhang, Zhilong Ji, Jinfeng Bai, Shiguang Shan. Real face foundation representation learning for generalized deepfake detection. Pattern Recognition (PR), 161: 111299, 2025.
43 . Jiwei Xiao, Ruiping Wang, Chen He, Xilin Chen. Cross-Domain Few-Shot 3D Point Cloud Semantic Segmentation. Pattern Recognition Letters (PRL), Vol. 197, pp. 51–57, November 2025.
44 . Minyang Hu, Hong Chang, Shiguang Shan, Xilin Chen. Inference Calibration of Vision-Language Foundation Models for Zero-Shot and Few-Shot Learning. Pattern Recognition Letters (PRL), Vol. 192, pp. 15–21, 2025.
45 . Hui Nie, Ruiping Wang, Xilin Chen. UniFa: A Unified Feature Hallucination Framework for Any-shot Object Detection. Pattern Recognition Letters (PRL), Vol. 189, pp. 207–213, March 2025.
46 . Pengfei Zhou, Weiqing Min, Chaoran Fu, Ying Jin, Mingyu Huang, Xiangyang Li, Shuhuan Mei, Shuqiang Jiang. FoodSky: A food-oriented large language model that can pass the chef and dietetic examinations. Patterns (Cell Press), Vol. 6, No. 5, May 2025.
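47 . Shaolei Zhang, Yang Feng. A Simultaneous Speech Translation Method Based on a Connectionist Temporal Classification Decoder (in Chinese). Chinese Journal of Computers (计算机学报), Vol. 48, No. 5, pp. 1100-1115, 2025.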
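48 . Shaolei Zhang, Yang Feng. A Survey of Simultaneous Translation Research (in Chinese). Journal of Chinese Information Processing (中文信息学报), Vol. 39, No. 9, 2025.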