I am a Research Scientist at Anthropic and a PhD student at MIT CSAIL (on leave), advised by Jacob Andreas and Julie Shah. I’ve spent summers at the Boston Dynamics AI Institute, MIT-IBM Watson AI Lab, and Facebook AI Research (FAIR); before grad school, I spent two years as an AI Resident at Microsoft Research. I did my undergrad at Yale, where I got my start in research with Brian Scassellati and read a lot of dead philosophers.

I’m interested in building agents that learn representations from rich human knowledge, whether directly (e.g. from users) or through priors (e.g. from LMs). Currently, I’m thinking a lot about how to utilize pretrained models in conjunction with human feedback to interactively learn aligned preferences/rewards.

A history buff at heart, I care deeply about working with non-academic communities to create safe, ethical, and equitable AI. I currently serve as a Special Government Employee for the Defense Innovation Unit (DIU). In a previous life, I worked at the White House Office of Science and Technology Policy (OSTP), National Institute of Standards and Technology (NIST), and Schmidt Futures. I also serve on the advisory board of the Yale Jackson School of Global Affairs, where I co-teach a course on AI for policymakers.

I love being outdoors, even in the brutal Boston winters. A current goal is to run a sub-3:00 marathon (this is how I’m doing). Reach out to chat about research, policy, or running! Preferred subject line: Your cat is dope.

email | cv | google scholar | twitter | linkedin

Education

Ph.D. Computer Science, 2023 -
Massachusetts Institute of Technology

M.S. Computer Science, 2023
Massachusetts Institute of Technology

B.S. Cognitive Science, 2018
Yale University

B.A. Global Affairs, 2018
Yale University

People Financially Invested in My Future

Open Philanthropy
NSF Graduate Research Fellowship
Truman Scholarship
My parents

Recent News


[Nov 2024] Attending CoRL! I’ll be presenting Adaptive Language-Guided Abstraction from Contrastive Explanations in the main conference.

[Aug 2024] I am taking leave from MIT to lead national security evaluations on the Frontier Red Team at Anthropic.

[Aug 2024] Attending RLC! I’ll be presenting Pragmatic Feature Preferences in the RL Beyond Rewards Workshop.

[Jul 2024] Attending ICML! I’ll be presenting Pragmatic Feature Preferences in the main conference. I’ll also be attending the Alignment Workshop beforehand.

[Jul 2024] Attending RSS! I’m honored to be part of the 2024 RSS Pioneers cohort, as well as help organize the Social Intelligence in Humans and Robots Workshop and the Task Specification Workshop.

[May 2024] I started at Anthropic! I’ll be working to help make big models safer.

Publications

Learning How Hard to Think: Input-Adaptive Allocation of LM Computation

Mehul Damani, Idan Shenfeld, Andi Peng, Andreea Bobu, Jacob Andreas

ICLR 2025

Paper

Adaptive Language-Guided Abstraction from Contrastive Explanations

Andi Peng, Belinda Z. Li, Ilia Sucholutsky, Nishanth Kumar, Julie A. Shah, Jacob Andreas, Andreea Bobu

CoRL 2024

Paper

Constrained Human-AI: An Inclusive Embodied AI Assistance Challenge

Weihua Du, Qiushi Lyu, Jiaming Shen, Zhenting Qi, Hongxin Zhang, Sunli Chen, Andi Peng, Tianmin Shu, Kwonjoon Lee, Behzad Dariush, Chuang Gan

NeurIPS 2024 (Datasets and Benchmarks)

Paper

Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Feedback

Andi Peng, Yuying Sun, Tianmin Shu, David Abel

ICML 2024

Paper JHU News

Learning with Language-Guided State Abstractions

Andi Peng, Ilia Sucholutsky, Belinda Z. Li, Theodore R. Sumers, Thomas L. Griffiths, Jacob Andreas, Julie A. Shah

ICLR 2024
RSS Workshop on Social Intelligence in Humans and Robots, 2023 (oral)

Paper Project Site MIT News

Aligning Robot Representations with Humans

Andreea Bobu*, Andi Peng*, Pulkit Agrawal, Julie A. Shah, Anca D. Dragan

HRI 2024
ICRA Workshop on Collaborative Robots and Work of the Future, 2022
RSS Workshop on Social Intelligence in Humans and Robots, 2022
NeurIPS Workshop on ML Safety, 2022

Paper

Preference-Conditioned Language-Guided Abstractions

Andi Peng, Andreea Bobu, Belinda Z. Li, Theodore R. Sumers, Ilia Sucholutsky, Nishanth Kumar, Thomas L. Griffiths, Julie A. Shah

HRI 2024
ICLR Workshop on LLM Agents, 2024

Paper Project Site MIT News

Getting Aligned on Representational Alignment

Ilia Sucholutsky*, Lukas Muttenthaler*, Adrian Weller, Andi Peng, …, Thomas L. Griffiths

Preprint

Paper

Human-Guided Complexity-Controlled Abstractions

Andi Peng*, Mycal Tucker*, Eoin M. Kenny, Noga Zaslavsky, Pulkit Agrawal, Julie A. Shah

NeurIPS 2023

Paper Code

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Stephen Casper*, Xander Davies*, …, Andi Peng, …, Dylan Hadfield-Menell

TMLR, 2023 (Finalist, Outstanding Certification)

Paper

Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation

Andi Peng, Aviv Netanyahu, Mark K. Ho, Tianmin Shu, Andreea Bobu, Julie A. Shah, Pulkit Agrawal

ICML 2023
NeurIPS Workshop on Human in the Loop Learning, 2022

Paper Poster Video Project Site MIT News (front page story)

Strengthening Subcommunities: Towards Sustainable Growth in AI Research

Andi Peng, Jessica Zosa Forde, Yonadav Shavit, Jonathan Frankle

ICLR Workshop on ML Evaluation Standards, 2022

Paper

Make Greenhouse-Gas Accounting Reliable — Build Interoperable Systems

Amy Luers, Leehi Yona, …, Andi Peng, …, Lucas Joppa

Nature, 2022

Paper

Investigations of Performance and Bias in Human-AI Teamwork in Hiring

Andi Peng, Besmira Nushi, Emre Kiciman, Kori Inkpen, Ece Kamar

AAAI 2022 (oral, top 4%)
CHI Workshop on Trust and Reliance on AI-Human Teams, 2020 (oral)

Paper Dataset Poster

On the Nature of Bias Percolation: Assessing Multiaxial Collaboration in Human-AI Systems

Andi Peng, Besmira Nushi, Kori Inkpen, Emre Kiciman, Ece Kamar

CHI Workshop on Human-Centered Approaches to Fair and Responsible AI, 2020 (oral)

Paper Slides

Human-Machine Collaboration for Fast Land Cover Mapping

Caleb Robinson, Anthony Ortiz, Kolya Malkin, Blake Elias, Andi Peng, Dan Morris, Bistra Dilkina, Nebojsa Jojic

AAAI 2020 (oral, top 3%)
NeurIPS Workshop on Tackling Climate Change with Machine Learning, 2019
ICLR Workshop on Tackling Climate Change with Machine Learning, 2020

Paper Poster Slides Project Site

The Perils of Objectivity: Towards a Normative Framework for Fair Judicial Decision-Making

Andi Peng, Malina Simard-Halm

AIES 2020 (spotlight)

Paper Slides

What You See Is What You Get? The Impact of Representation Criteria on Human Bias in Hiring

Andi Peng, Besmira Nushi, Emre Kiciman, Kori Inkpen, Siddharth Suri, Ece Kamar

HCOMP 2019

Paper Slides

An Integrated Machine Learning Approach To Studying Terrorism

Andi Peng

Undergraduate Thesis, Yale University

Paper

Conceptual Feasibility Study of the Hyperloop Vehicle for Next-Generation Transport

Kenneth Decker, Jeffrey Chin, Andi Peng, Colin Summers, Golda Nguyen, Andrew Oberlander, Gazi Sakib, Nariman Sharifrazi, Christopher Heath, Justin S. Gray, Robert D. Falck

AIAA SciTech 2017, NASA Technical Report

Paper Slides

Early Detection of Boko Haram Attacks in Nigeria

Andi Peng, Joe English, Megan Wilson, Katherine Kirk, Reem Hussein, Yara Hattab, YuHan Lim, Akshay Bery, Christine Houle, William Casey King

Technical Report, United States Institute of Peace (USIP)

Paper Slides State-Level Maps State-Level Data