Jiahao Yu's Page


I am a final-year computer science Ph.D. candidate at Northwestern University, working with Prof. Xinyu. My research interests lie in Large Language Models and cybersecurity. I hold a B.S. degree from Shanghai Jiao Tong University (2021).

I will be joining the Department of Computer Engineering at New York University Abu Dhabi (NYUAD) as a Tenure-Track Assistant Professor (TTAP). I am actively looking for self-motivated Ph.D. students and postdocs to work with me — if you are passionate about Large Language Models and cybersecurity, feel free to reach out!

If you have any research questions, feel free to contact me! Enjoy research and life :)

news

Jun 4, 2026 I will be joining the Department of Computer Engineering at New York University Abu Dhabi (NYUAD) as a Tenure-Track Assistant Professor (TTAP). I am actively looking for self-motivated Ph.D. students and postdocs — feel free to reach out!
Oct 6, 2025 Our work PATCHAGENT was selected as a CSAW 2025 Finalist. We will be presenting the work in New York City!
Oct 5, 2025 The official SWEBench-Verified and SWEBench-Lite open-weight leaderboard has been updated. Our EntroPO ranks 1st on SWEBench-Lite and 5th on SWEBench-Verified (surpassed only by models 10x larger than ours).
Oct 5, 2025 Our work GPO: Learning from Critical Steps to Improve LLM Reasoning was covered by MIT Technology Review China.
Mar 18, 2025 Our work Soft-Label Integration for Robust Toxicity Classification was covered by MIT Technology Review China.

selected publications

  1. arXiv
    Building Coding Agents via Entropy-Enhanced Multi-Turn Preference Optimization
    Jiahao Yu*, Zelei Cheng*, Xian Wu, and 1 more author
    arXiv preprint arXiv:2509.12434 2026
  2. TIFS
    PROMPTFUZZ: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs
    Jiahao Yu*, Yangguang Shao*, Hanwen Miao, and 1 more author
    IEEE Transactions on Information Forensics and Security 2026
  3. NIPS
    GPO: Learning from Critical Steps to Improve LLM Reasoning
    Jiahao Yu*, Zelei Cheng, Xian Wu, and 1 more author
    In 2025
  4. NIPS
    BlockScan: Detecting Anomalies in Blockchain Transactions
    Jiahao Yu*, Xian Wu*, Hao Liu, and 2 more authors
    In 2025
  5. USENIX
    Mind the Inconspicuous: Revealing the Hidden Weakness in Aligned LLMs’ Ethical Boundaries
    Long Talk
    Jiahao Yu*, Haozheng Luo*, Jerry Yao-Chieh, and 3 more authors
    In Proceedings of the 2025 USENIX Security 2025
  6. USENIX
    PATCHAGENT: A Practical Program Repair Agent Mimicking Human Expertise
    Long Talk
    Patched over 10 real-world bugs
    CSAW 2025 Finalist
    Zheng Yu, Ziyi Guo, Yuhang Wu, and 5 more authors
    In Proceedings of the 2025 USENIX Security 2025
  7. ICML
    The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them)
    Zihao Wang, Yibo Jiang, Jiahao Yu, and 1 more author
    In Proceedings of the 42nd International Conference on Machine Learning 2025
  8. USENIX
    LLM-Fuzzer: Scaling Assessment of Large Language Model Jailbreaks
    Jiahao Yu, Xingwei Lin, Zheng Yu, and 1 more author
    In Proceedings of the 2024 USENIX Security 2024
  9. NIPS
    Soft-Label Integration for Robust Toxicity Classification
    Zelei Cheng, Xian Wu, Jiahao Yu, and 3 more authors
    In Proceedings of the 38th Conference on Neural Information Processing Systems 2024
  10. ICML
    RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation
    Spotlight Top-3.5%
    Zelei Cheng, Xian Wu, Jiahao Yu, and 3 more authors
    In Proceedings of the 41st International Conference on Machine Learning 2024
  11. ICLR@SET-LLM
    Assessing Prompt Injection Risks in 200+ Custom GPTs
    Featured in WIRED
    Jiahao Yu, Yuhang Wu, Dong Shu, and 3 more authors
    In ICLR 2024 Workshop on Secure and Trustworthy Large Language Models 2024
  12. arXiv
    GPTFuzzer: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
    Jiahao Yu, Xingwei Lin, Zheng Yu, and 1 more author
    In 2023
  13. NIPS
    StateMask: Explaining Deep Reinforcement Learning through State Mask
    Zelei Cheng*, Xian Wu*, Jiahao Yu*, and 3 more authors
    In Proceedings of the 37th Conference on Neural Information Processing Systems 2023
  14. USENIX
    AIRS: Explanation for Deep Reinforcement Learning based Security Applications
    Jiahao Yu, Wenbo Guo, Qi Qin, and 3 more authors
    In Proceedings of the 2023 USENIX Security 2023