Aashiq Muhamed
Ph.D. Student in Computer Science, Carnegie Mellon University

Hi, I’m Aashiq, a Ph.D. student in the CMU Machine Learning Department, where I’m advised by Professors Mona Diab and Virginia Smith.
My research focuses on technical preparedness for the intelligence explosion and AGI’s grand challenges: building the foundations needed to navigate rapid AI-driven transformation safely. I’m particularly interested in:
- Foundations of Mechanistic Interpretability: Developing sparse autoencoders (SAEs) and related interpretability methods to understand foundation model internals and to build a science of finetuning, unlearning, alignment, and reasoning (see the sketch after this list).
- Efficient AI to Prevent Power Concentration: Leveraging mechanistic insights to dramatically reduce the compute required for training and inference, preventing extreme concentration of power during explosive AI growth.
- Collaborative AI Systems for Epistemic Security: Building safe and verifiable collaborative RAG systems, LLM-as-a-judge frameworks, and evaluation paradigms that enhance collective decision-making during rapid technological change.
- Technical Infrastructure for AGI Governance: Creating verification methods and safety monitoring systems that enable international cooperation and if-then commitments during the intelligence explosion.
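To make the first bullet concrete, here is a minimal sketch of the kind of SAE used in mechanistic interpretability: a linear encoder with a ReLU producing an overcomplete, sparse feature basis, trained to reconstruct a model’s activations under an L1 sparsity penalty. All names, dimensions, and coefficients below are illustrative assumptions, not taken from any particular paper of mine.

```python
# Minimal sparse autoencoder (SAE) sketch for interpreting model activations.
# Dimensions and the L1 coefficient are illustrative assumptions.
import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        # Overcomplete dictionary: d_hidden is typically several times d_model.
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        features = torch.relu(self.encoder(x))   # sparse feature activations
        reconstruction = self.decoder(features)  # map back to activation space
        return reconstruction, features


def sae_loss(x, reconstruction, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that encourages sparse features.
    mse = (reconstruction - x).pow(2).mean()
    sparsity = features.abs().mean()
    return mse + l1_coeff * sparsity
```

In practice, an SAE like this is trained on activations from a frozen foundation model, and its hidden units are then examined as candidate human-interpretable features.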
background
I bring an interdisciplinary perspective to my research, with degrees spanning engineering, language technologies, and machine learning. I earned my B.Tech in Mechanical Engineering from the Indian Institute of Technology, Roorkee, where I was awarded the President’s Gold Medal. I subsequently completed an MS in Mechanical Engineering at Stanford University and an MS in Language Technologies at the CMU Language Technologies Institute.
Before beginning my Ph.D., I spent five years in industry as an Applied Scientist at Amazon, working across diverse AI applications including AWS DeepComposer (2019-2021), Amazon Search M5 (2021-2022), and AWS AI (2022-2023). This industry experience has shaped my research approach, emphasizing both theoretical rigor and practical impact.
collaboration
I’m always excited to collaborate with fellow researchers at CMU and beyond. If you’re working on related problems or interested in exploring research opportunities together, I’d love to hear from you.
For prospective collaborators at CMU, please feel free to fill out this brief collaboration form to help me understand your interests and background, though don’t hesitate to reach out directly regardless.
I’m also happy to chat with anyone considering a Ph.D. after industry experience, as I know firsthand how rewarding (and challenging) that transition can be.
news
Aug 1, 2025: I’m excited to be a mentor for SPAR 2025 this fall! I’ll be mentoring students on using SAEs for interpretable and tamper-resistant alignment. If you’re interested in advancing AI safety through hands-on interpretability research, please apply!
Jul 1, 2025: I am a visiting researcher at the University of California, Berkeley this summer, hosted by Prof. Dawn Song and Xuandong Zhao.
Jun 1, 2025: Working as a FIG Fellow with Chi Nguyen, Caspar Oesterheld, and Emery Cooper on “Training AIs to Aid Decision Theory and Acausal Research” through the Future Impact Group’s Philosophy for Safe AI program.
May 2, 2025: I’m excited to present two papers at NAACL 2025: “CoRAG: Collaborative Retrieval-Augmented Generation” and “Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models”, with the latter also receiving an oral presentation at the TrustNLP Workshop.
Oct 16, 2024: Our work “Inducing Elasticity in Foundation Models: Post-Training Techniques for Adaptable Inference” was accepted at the 4th Workshop on Efficient Natural Language and Speech Processing @ NeurIPS 2024. We study weight decomposition approaches to induce elasticity in pretrained LLMs.
Oct 15, 2024: I will be presenting my MATS (ML Alignment & Theory Scholars) project, “Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in LLMs,” at the Second NeurIPS Workshop on Attributing Model Behavior at Scale. Our results show that SSAEs Pareto-dominate pretrained SAEs within specific subdomains, with promising implications for broader applications in AI safety.
Sep 1, 2024: Delighted to have been named a Siebel Scholar 2025.
Aug 1, 2024: Spending the summer at Berkeley as a MATS scholar working with Lucius Bushnaq and Jake Mendel from Apollo Research. Excited about pushing the frontiers of mechanistic interpretability.
Jul 20, 2024: We will present our work “GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients” at EMNLP 2024 and the Efficient Systems for Foundation Models Workshop @ ICML 2024.
Mar 5, 2024: We will present “Fed Up with Complexity: Simplifying Many-Task Federated Learning with NTKFedAvg” and “Cache Me If You Can: The Case For Retrieval Augmentation in Federated Learning” at the Privacy Regulation and Protection in Machine Learning Workshop @ ICLR 2024.
Mar 1, 2024: Our work “Less is Fed More: Sparsity Reduces Feature Distortion in Federated Learning” was accepted at the Modular and Multilingual NLP Workshop at EACL 2024.
Feb 27, 2024: Our work “Adversarial Continuous Text to Image Generation” has been accepted to CVPR 2024!
Dec 20, 2023: We released “An In-depth Look at Gemini’s Language Abilities”, an impartial, in-depth, and reproducible study comparing Gemini, GPT, and Mixtral.
Dec 20, 2023: Our solution was the winning entry at the 1st Privacy Preserving Federated Learning Document VQA competition at NeurIPS 2023.
Sep 1, 2023: Started my MS/Ph.D. at CMU!
selected publications
- arXiv preprint arXiv:2505.20254, 2025.
- In Conference on Language Modeling (COLM), 2025; also presented at the ICML Actionable Interpretability and ICML R2FM workshops.
- GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients. In the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024; also presented at the ICML ES-FoMo workshop.
- In the NeurIPS Workshop on Efficient Natural Language and Speech Processing (ENLSP), 2021 (oral).