I’ll be mentoring a project at MARS V on Behaviors That Survive Alignment. The project studies how hidden objectives persist through training and whether alignment erases or merely masks them. If you’re interested in mechanistic interpretability, emergent misalignment, or inductive backdoors, please consider applying!