Aashiq Muhamed

MS/Ph.D. Student in Computer Science, Carnegie Mellon University


Hi, I’m Aashiq, a second-year MS/Ph.D. student at the CMU Language Technologies Institute, advised by Professors Mona Diab and Virginia Smith.

My research interests are broadly in the area of responsible AI. Topics I’m interested in include:

  • AI Alignment, Safety, and Control
  • Compute, Data, and Communication Efficiency
  • Data-Centric AI and Science of AI
  • Designing Capability and Interpretability Benchmarks

I hold a B.Tech in Mechanical Engineering from the Indian Institute of Technology, Roorkee, where I was the President’s Gold Medalist, and an MS from Stanford University.

Before starting my Ph.D., I spent five years in industry as an Applied Scientist at AWS DeepComposer (2019–2021), Amazon Search M5 (2021–2022), and AWS AI (2022–2023).

Feel free to reach out if you’re interested in my research or contemplating a Ph.D. after industry experience.

news

Oct 16, 2024 Our work Inducing Elasticity in Foundation Models: Post-Training Techniques for Adaptable Inference was accepted at the 4th Workshop on Efficient Natural Language and Speech Processing @ NeurIPS 2024. We study weight decomposition approaches that induce elasticity in pretrained LLMs.
Oct 15, 2024 I will be presenting my MATS project, “Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in LLMs,” at the Second NeurIPS Workshop on Attributing Model Behavior at Scale. Our results show that SSAEs Pareto-dominate pretrained SAEs within specific subdomains, with promising implications for broader applications in AI safety.
Sep 1, 2024 Delighted to have been named a 2025 Siebel Scholar.
Aug 1, 2024 Spending the summer in Berkeley as an ML Alignment & Theory Scholars (MATS) scholar, working with Lucius Bushnaq and Jake Mendel from Apollo Research. Excited to push the frontiers of mechanistic interpretability.
Jul 20, 2024 We will present our work GRASS: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients at EMNLP 2024 and at the Efficient Systems for Foundation Models Workshop @ ICML 2024.
Mar 5, 2024 We will present our work “Fed Up with Complexity: Simplifying Many-Task Federated Learning with NTKFedAvg”, and “Cache Me If You Can: The Case For Retrieval Augmentation in Federated Learning” at the Privacy Regulation and Protection in Machine Learning Workshop @ ICLR 2024.
Mar 1, 2024 Our work “Less is Fed More: Sparsity Reduces Feature Distortion in Federated Learning” was accepted at the Modular and Multilingual NLP Workshop @ EACL 2024.
Feb 27, 2024 Our work “Adversarial Continuous Text to Image Generation” has been accepted to CVPR 2024!
Dec 20, 2023 We released An In-depth Look at Gemini’s Language Abilities, an impartial and reproducible study comparing Gemini, GPT, and Mixtral.
Dec 20, 2023 Our solution was the winning entry at the 1st Privacy Preserving Federated Learning Document VQA competition at NeurIPS 2023.
Sep 1, 2023 Started my MS/Ph.D. at CMU!

selected publications

  1. PAKDD
    Web-scale semantic product search with large language models
    Aashiq Muhamed, Sriram Srinivasan, Choon-Hui Teo, and 4 more authors
    In Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2023
  2. ACL Oral
    ReAugKD: Retrieval-Augmented Knowledge Distillation for Pre-trained Language Models
    Jianyi Zhang, Aashiq Muhamed, Aditya Anantharaman, and 7 more authors
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  3. AAAI Oral
    Symbolic Music Generation with Transformer-GANs
    Aashiq Muhamed, Liang Li, Xingjian Shi, and 6 more authors
    In Proceedings of the AAAI Conference on Artificial Intelligence, 2021
  4. NeurIPS ENLSP
    CTR-BERT: Cost-effective knowledge distillation for billion-parameter teacher models
    Aashiq Muhamed, Iman Keivanloo, Sujan Perera, and 6 more authors
    In NeurIPS Workshop on Efficient Natural Language and Speech Processing (ENLSP) (Oral Spotlight), 2021