Aashiq Muhamed
MS/Ph.D. Student in Computer Science, Carnegie Mellon University
Hi, I’m Aashiq, a second-year MS/Ph.D. student at the CMU Language Technologies Institute, advised by Professors Mona Diab and Virginia Smith.
My research is broadly in the area of responsible AI. Topics I’m interested in include:
- AI Alignment, Safety, and Control
- Compute, Data, and Communication Efficiency
- Data-Centric AI and Science of AI
- Designing Capability and Interpretability Benchmarks
I hold a B.Tech in Mechanical Engineering from the Indian Institute of Technology, Roorkee, where I was the President’s Gold Medalist. I also completed my MS at Stanford University.
Before starting my Ph.D., I gained five years of industry experience as an Applied Scientist at AWS DeepComposer (2019-2021), Amazon Search M5 (2021-2022), and AWS AI (2022-2023).
Feel free to reach out if you’re interested in my research or contemplating a Ph.D. after industry experience.
news
Oct 16, 2024 | Our work “Inducing Elasticity in Foundation Models: Post-Training Techniques for Adaptable Inference” was accepted at the 4th Workshop on Efficient Natural Language and Speech Processing @ NeurIPS 2024. We study weight decomposition approaches to induce elasticity in pretrained LLMs. |
Oct 15, 2024 | I will be presenting my MATS project, “Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in LLMs,” at the Second NeurIPS Workshop on Attributing Model Behavior at Scale. Our results show that SSAEs Pareto dominate pretrained SAEs within specific subdomains, with promising implications for broader applications in AI safety. |
Sep 1, 2024 | Delighted to have been named a Siebel Scholar 2025. |
Aug 1, 2024 | Spending the summer at Berkeley as an ML Alignment and Theory Scholar (MATS), working with Lucius Bushnaq and Jake Mendel from Apollo Research. Excited about pushing the frontiers of mechanistic interpretability. |
Jul 20, 2024 | We will present our work “GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients” at EMNLP 2024 and the Efficient Systems for Foundation Models Workshop @ ICML 2024. |
Mar 5, 2024 | We will present our papers “Fed Up with Complexity: Simplifying Many-Task Federated Learning with NTKFedAvg” and “Cache Me If You Can: The Case For Retrieval Augmentation in Federated Learning” at the Privacy Regulation and Protection in Machine Learning Workshop @ ICLR 2024. |
Mar 1, 2024 | Our work “Less is Fed More: Sparsity Reduces Feature Distortion in Federated Learning” was accepted at the Modular and Multilingual NLP Workshop @ EACL 2024. |
Feb 27, 2024 | Our work “Adversarial Continuous Text to Image Generation” has been accepted to CVPR 2024! |
Dec 20, 2023 | We released “An In-depth Look at Gemini’s Language Abilities,” an impartial, in-depth, and reproducible study comparing Gemini, GPT, and Mixtral. |
Dec 20, 2023 | Our solution was the winning entry at the 1st Privacy Preserving Federated Learning Document VQA competition at NeurIPS 2023. |
Sep 1, 2023 | Started my MS/Ph.D. at CMU! |
selected publications
- PAKDD: Web-scale semantic product search with large language models. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2023
- ACL (Oral): ReAugKD: Retrieval-Augmented Knowledge Distillation for Pre-trained Language Models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023
- AAAI (Oral): Symbolic Music Generation with Transformer-GANs. In Proceedings of the AAAI Conference on Artificial Intelligence, 2021
- NeurIPS ENLSP (Oral Spotlight): CTR-BERT: Cost-effective knowledge distillation for billion-parameter teacher models. In NeurIPS Workshop on Efficient Natural Language and Speech Processing (ENLSP), 2021