Yisong Yue

Machine Learning Professor @ Caltech

About Me

Professor of Computing and Mathematical Sciences at Caltech.

Research Interests: Machine Learning and Artificial Intelligence.

Industry Advising: Asari AI, Cainex, Latitude AI, Lila Sciences, and Tera AI.

ICLR Leadership: Member of the ICLR Board. General Chair of ICLR 2025. Senior Program Chair of ICLR 2024.

Research Themes

Modeling & Inference. We develop models that learn useful structure from complex data, from representation learning to new architectures to inverse problems.

Reasoning & Self-Improvement. We study how models solve hard problems by searching, checking their work, and improving from feedback, including code generation, LLM search, and programmable agents.

Scientific Discovery. We use agents, foundation models, and closed-loop experiment design to advance discovery in biomedical imaging, neural data, protein engineering, and more.

News & Updates

Mentorship Award

I am honored to receive the mentoring award from the Grad Student Advisory Board of Caltech EAS.

Knowledge-Centric Self-Improvement

project paper code

We introduce knowledge-centric self-improvement, a paradigm in which generic, stateless agents improve a shared, curated knowledge base rather than changing themselves. Across abstract reasoning, coding, and terminal tasks, this approach improves solve rates at lower cost and transfers across tasks and model families.

Vera: A Layered Diffusion Model for Content-Preserving Video Editing

project paper data

We introduce Vera, a layered diffusion framework for video editing that separates what to generate from what to preserve. Vera jointly produces an edit layer, an alpha matte, and a composite video, enabling more faithful content preservation while supporting creative edits.

SpeeDiff: Scalable Pixel-Anchored End-to-End Latent Diffusion Model

paper

We introduce SpeeDiff, a scalable pixel-anchored end-to-end latent diffusion method that jointly trains the VAE and diffusion model from scratch. SpeeDiff uses a Tweedie Pixel Reconstruction loss to provide pixel-level feedback during diffusion training, preventing latent collapse and enabling efficient transformer-based scaling. SpeeDiff-XL achieves strong ImageNet generation results while training over 140x faster than vanilla SiT and 61x faster than REPA. [CVPR 2026]

FormulaCode: Evaluating Agentic Optimization on Large Codebases

project paper code data

We introduce FormulaCode, a benchmark for evaluating how well coding agents can optimize large, real-world scientific software repositories. FormulaCode tests agents on realistic performance bottlenecks with expert-written fixes and community-maintained workloads, revealing where current agents still struggle with repository-scale optimization. [ICML 2026]

End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer

paper

We introduce an end-to-end approach for autoregressive image generation that learns the visual tokenizer and generator together. By letting generation quality directly shape the tokenizer, the method produces stronger image representations and achieves competitive ImageNet generation results. [ICML 2026 Spotlight]

Krause Synchronization Transformers

project paper code

We introduce Krause Attention, a principled attention mechanism inspired by bounded-confidence consensus dynamics. Krause Attention replaces similarity-based global aggregation with distance-based, localized, and selectively sparse interactions, promoting structured local synchronization instead of global mixing. We relate this behavior to recent theory modeling Transformer dynamics as interacting particle systems, and show how bounded-confidence interactions naturally moderate attention concentration and alleviate attention sinks. [ICML 2026]

NitroGen: A Foundation Model for Generalist Gaming Agents

project paper code

We introduce NitroGen, a vision-action foundation model for generalist gaming agents trained on 40,000 hours of gameplay videos across more than 1,000 games, enabling strong cross-game generalization. [CVPR 2026 Best Paper Honorable Mention]