publications
publications by category in reverse chronological order. generated by jekyll-scholar.
2026
- [arXiv] Building Coding Agents via Entropy-Enhanced Multi-Turn Preference Optimization. arXiv preprint arXiv:2509.12434, 2026
- [arXiv] Locus: Agentic Predicate Synthesis for Directed Fuzzing. arXiv preprint arXiv:2508.21302, 2026
2025
- [NIPS] GPO: Learning from Critical Steps to Improve LLM Reasoning. Featured in MIT Technology Review China. 2025
- NIPS
- [USENIX] Mind the Inconspicuous: Revealing the Hidden Weakness in Aligned LLMs’ Ethical Boundaries. Long Talk. In Proceedings of the 2025 USENIX Security, 2025
- [USENIX] PATCHAGENT: A Practical Program Repair Agent Mimicking Human Expertise. Long Talk. Patched over 10 real-world bugs. CSAW 2025 Finalist. In Proceedings of the 2025 USENIX Security, 2025
- [ICML] The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them). In Proceedings of the 42nd International Conference on Machine Learning, 2025
- ACL@LLMSEC
- [ICML@MemFM] Knowledge-Distilled Memory Editing for Plug-and-Play LLM Alignment. In The Impact of Memorization on Trustworthy Foundation Models: ICML 2025 Workshop, 2025
- [arXiv] PoisonCraft: Practical Poisoning of Retrieval-Augmented Generation for Large Language Models. arXiv preprint arXiv:2505.06579, 2025
- [arXiv] A survey on explainable deep reinforcement learning. arXiv preprint arXiv:2502.06869, 2025
- [arXiv] GenoArmory: A Unified Evaluation Framework for Adversarial Attacks on Genomic Foundation Models. arXiv preprint arXiv:2505.10983, 2025
- [arXiv] BandFuzz: An ML-powered Collaborative Fuzzing Framework. arXiv preprint arXiv:2507.10845, 2025
2024
- [USENIX] LLM-Fuzzer: Scaling Assessment of Large Language Model Jailbreaks. In Proceedings of the 2024 USENIX Security, 2024
- [NIPS] Soft-Label Integration for Robust Toxicity Classification. Featured in MIT Technology Review China. In Proceedings of the 38th Conference on Neural Information Processing Systems, 2024
- [ICML] RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation. Spotlight Top-3.5%. In Proceedings of the 41st International Conference on Machine Learning, 2024
- [ICSE@SBFT] BandFuzz: A Practical Framework for Collaborative Fuzzing with Reinforcement Learning. In The 17th Intl Workshop on Search-Based and Fuzz Testing, 2024
- [arXiv] PromptFuzz: Harnessing Fuzzing Techniques for Robust Testing of Prompt Injection in LLMs. 2024
- arXiv
2023
- [arXiv] GPTFuzzer: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts. 2023
2022
2021
2020
- [J Phys Conf Ser] Research on Application of Artificial Intelligence Technology in Electrical Automation Control. In Journal of Physics: Conference Series, 2020
2019
- arXiv