Projects
Language Biomarkers of Alzheimer’s Disease, from Sociolinguistic, Patholinguistic, and Artificial Intelligence Perspectives. National Social Science Fund Project (23AYY012)
Advisors: Dr. Jiahong Yuan (PI; Professor, USTC), Dr. Yizhe Wang (Postdoc, USTC), Dr. Yuanchao Li (PhD, The University of Edinburgh), and Dr. Yiya Chen (Professor, Leiden University; co-advised this project as a Visiting Professor at USTC, Fall 2024)
This foundational research is dedicated to establishing the core methodology and creating a high-quality speech corpus for studying language biomarkers of Alzheimer’s Disease in Mandarin speakers. The goal is to build a resource analogous to the English DementiaBank, serving as a bedrock for future clinical and computational research.
- Experimental Design: Co-developed a comprehensive suite of eight speech tasks grounded in patholinguistics to precisely elicit and capture the linguistic variations of Mandarin speakers with cognitive decline.
- Infrastructure Development: Built and deployed a robust, web-based data collection platform, establishing a standardized tool for high-quality remote data acquisition.
- Sociolinguistic Analysis: Investigated how sociolinguistic factors influence the perception of speech from individuals with Alzheimer’s Disease.
- Challenge Participation: Collaborated as part of a research team in the ICASSP 2025 Dementia Detection Challenge.
Explainable AI for Cross-Lingual AD Biomarker Analysis
Liu, Y.-L., Li, Y., Feng, R., He, L., Chen, J.-X., Wang, Y.-M., Chen, Y.-A., Peng, Y.-H., Yuan, J.-H., Ling, Z.-H. (2025) Leveraging Cascaded Binary Classification and Multimodal Fusion for Dementia Detection through Spontaneous Speech. Proc. Interspeech 2025, 544-548. https://doi.org/10.21437/Interspeech.2025-1564
Liu He, Yuanchao Li, Rui Feng, XinRan Han, Yin-Long Liu, Yuwei Yang, Zude Zhu, and Jiahong Yuan. Exploring Gender Bias in Alzheimer’s Disease Detection: Insights from Mandarin and Greek Speech Perception. The 2025 National Conference on Man-Machine Speech Communication (NCMMSC), Oral.
- Liu He, Rui Feng, XinRan Han, Yin-Long Liu, and Jiahong Yuan. 2025. Beyond Mimicry: Auditing Human Bias to Build Fairer AI for Alzheimer’s Assessment. In Companion Proceedings of the 27th International Conference on Multimodal Interaction (ICMI Companion ’25), October 13–17, 2025, Canberra, ACT, Australia. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3747327.3764903
Preprints & Under Review
- He, L., Li, Y., Liu, Y.-L., Feng, R., Wang, Y., Chen, J., Wang, Y., & Yuan, J. (2025). Disentangling Acoustic Cues in Alzheimer’s Pathology and Perception: The Roles of Language and Gender. Submitted to ICASSP 2026 (under review).
AI-Powered Cognitive Impairment Screening via Speech and Language Analysis. Applied Research & System Development
In collaboration with iFlytek, BUPT Digital Intelligence (Beidian Shuzhi), and Beijing Tiantan Hospital (Ranked #1 in Neurology in China)
An ongoing research project to develop a novel, non-invasive screening tool for cognitive health, aiming for both academic publication and commercial application. The project leverages a unique, large-scale dataset from ~4,000 individuals, featuring multimodal data including speech, text, performance on linguistic tasks, and lifestyle habits. The primary objective is to build and validate AI models that classify types of cognitive decline, e.g., Cognitive Decline with Alzheimer’s (CDA), Non-Alzheimer’s (CDN), and Vascular (CDV).
Multimodal Communication of HPV Vaccine on Social Media
Advisors: Dr. Mengxiao Zhu (Professor, USTC), Dr. Cuihua Shen (Professor, UC Davis), Dr. Zhengdong Mao (Professor, School of Information Science and Technology, USTC)
This was a large-scale research project that performed a cross-modality analysis of HPV vaccine discourse on social media. The study involved collecting and analyzing over 1.5 TB of data, including text, audio, and video from major platforms like Weibo, Douyin, and Bilibili, spanning more than a decade, from 2010 to 2023. The goal was to understand how the same vaccine was discussed differently across media formats.
Zhu, M.*, He, L.*, Zhao, H., Su, R., Zhang, L., & Hu, B. (2025). Same Vaccine, Different Voices: A Cross-Modality Analysis of HPV Vaccine Discourse on Social Media. Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 2317-2333. https://doi.org/10.1609/icwsm.v19i1.35936
This paper, which forms the core of my Master’s thesis supervised by Prof. Mengxiao Zhu, received a conditional accept in the January 2025 submission cycle and was accepted as an Oral Presentation (Lightning Talk). The review process was unusually rigorous, involving five reviewers and one Senior Program Committee (SPC) member, to whom I submitted a 49-page response addressing all concerns. The SPC meta-review praised the “super strong” review process and how the paper was made “much stronger,” concluding it was “a clear accept.” (Jan 2025 accept rate: 18/187 ≈ 9.6%.)
* denotes equal contribution
Misinformation Spreading on Social Media
Advisors: Dr. Yongdong Zhang (PI; Professor, School of Information Science and Technology, USTC) and Dr. Mengxiao Zhu (co-PI)
- Analyzed 110,000 Weibo posts using an LDA topic model
- Analyzed large-scale social media user behavioral data
- Composed research reports of over 30,000 words
Industry & Technical Experience
Data Science Intern, ERNIE Bot, Baidu | Autumn 2023
Joined Baidu’s ERNIE Bot team during the early wave of China’s LLM development, working on model evaluation, fine-tuning, and dialogue optimization for ERNIE 3.5 and 4.0.
Model Evaluation & Improvement
- Developed comprehensive evaluation frameworks based on real user feedback
- Created benchmark datasets and assessment criteria for measuring model quality
- Ran both automated-metric and human evaluations
- Fine-tuned models using supervised learning to improve performance
AI Agent Development
- Built domain-specific AI agents using Few-Shot Supervised Fine-Tuning (SFT)
- Covered diverse scenarios: character roleplay, gaming assistants, workplace productivity tools, and educational tutors
- Contributed to productionizing AI agents for real-world deployment
Dialogue System Engineering
- Designed query prediction and recommendation systems for multi-turn conversations
- Implemented token compression techniques for efficient processing
- Integrated user behavior analysis with co-occurrence mechanisms
- Improved user engagement through context-aware query suggestions
This experience deepened my understanding of production AI systems and directly informs my current research applying LLMs to social science and clinical applications.
ChatPaper (RAG-based Academic Tool) Mar. 2023 – Jul. 2023
Founding Team Member | 100,000+ users
- One of the earliest RAG tools, focused on academic efficiency, with over 100,000 users at the time.
USTC360 (Non-Profit Career Platform) Mar. 2022 – Present
Co-founder | 2,000+ members
Co-founded a non-profit platform connecting over 2,000 USTC students and alumni, facilitating career development by bridging the gap between academia and industry.
Developed a Python web scraper to automatically aggregate job opportunities, providing a core technical solution for the community’s information-sharing needs.
