Jiahao Yu's Page


I am a final-year computer science Ph.D. candidate at Northwestern University, working with Prof. Xinyu. My research interests lie in Large Language Models and cybersecurity. I hold a B.S. degree from Shanghai Jiao Tong University (2021). I am on the faculty job market this year. If you have any research questions, feel free to contact me! Enjoy research and life :)

news

Oct 6, 2025 Our work PATCHAGENT was selected as a CSAW 2025 Finalist. We will be presenting the work in New York City!
Oct 5, 2025 The official SWEBench-Verified and SWEBench-Lite open-weight leaderboard has been updated. Our EntroPO models rank 1st on SWEBench-Lite and 5th on SWEBench-Verified (surpassed only by models 10x larger than ours).
Oct 5, 2025 Our work GPO: Learning from Critical Steps to Improve LLM Reasoning was covered by MIT Technology Review China.
Mar 18, 2025 Our work Soft-Label Integration for Robust Toxicity Classification was covered by MIT Technology Review China.
Apr 18, 2024 Our GPTFuzzer work won the Geekcon 2023 Annual Themed Debate Breakthrough Award and was covered by SECGEEK.

selected publications

  1. arXiv
    Building Coding Agents via Entropy-Enhanced Multi-Turn Preference Optimization
    Jiahao Yu*, Zelei Cheng*, Xian Wu, and 1 more author
    arXiv preprint arXiv:2509.12434 2026
  2. NIPS
    GPO: Learning from Critical Steps to Improve LLM Reasoning
    Jiahao Yu*, Zelei Cheng, Xian Wu, and 1 more author
    In 2025
  3. NIPS
    BlockScan: Detecting Anomalies in Blockchain Transactions
    Jiahao Yu*, Xian Wu*, Hao Liu, and 2 more authors
    In 2025
  4. USENIX
    Mind the Inconspicuous: Revealing the Hidden Weakness in Aligned LLMs’ Ethical Boundaries
    Long Talk
    Jiahao Yu*, Haozheng Luo*, Jerry Yao-Chieh, and 3 more authors
    In Proceedings of USENIX Security 2025
  5. USENIX
    PATCHAGENT: A Practical Program Repair Agent Mimicking Human Expertise
    Long Talk
    Patched over 10 real-world bugs
    CSAW 2025 Finalist
    Zheng Yu, Ziyi Guo, Yuhang Wu, and 5 more authors
    In Proceedings of USENIX Security 2025
  6. ICML
    The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them)
    Zihao Wang, Yibo Jiang, Jiahao Yu, and 1 more author
    In Proceedings of the 42nd International Conference on Machine Learning 2025
  7. USENIX
    LLM-Fuzzer: Scaling Assessment of Large Language Model Jailbreaks
    Jiahao Yu, Xingwei Lin, Zheng Yu, and 1 more author
    In Proceedings of USENIX Security 2024
  8. NIPS
    Soft-Label Integration for Robust Toxicity Classification
    Zelei Cheng, Xian Wu, Jiahao Yu, and 3 more authors
    In Proceedings of the 38th Conference on Neural Information Processing Systems 2024
  9. ICML
    RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation
    Spotlight Top-3.5%
    Zelei Cheng, Xian Wu, Jiahao Yu, and 3 more authors
    In Proceedings of the 41st International Conference on Machine Learning 2024
  10. ICLR@SET-LLM
    Assessing Prompt Injection Risks in 200+ Custom GPTs
    Featured in WIRED
    Jiahao Yu, Yuhang Wu, Dong Shu, and 3 more authors
    In ICLR 2024 Workshop on Secure and Trustworthy Large Language Models 2024
  11. arXiv
    GPTFuzzer: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
    Jiahao Yu, Xingwei Lin, Zheng Yu, and 1 more author
    In 2023
  12. NIPS
    StateMask: Explaining Deep Reinforcement Learning through State Mask
    Zelei Cheng*, Xian Wu*, Jiahao Yu*, and 3 more authors
    In Proceedings of the 37th Conference on Neural Information Processing Systems 2023
  13. USENIX
    AIRS: Explanation for Deep Reinforcement Learning based Security Applications
    Jiahao Yu, Wenbo Guo, Qi Qin, and 3 more authors
    In Proceedings of USENIX Security 2023