Riya Tyagi

Welcome! I'm Riya, a second-year MIT student taking a gap semester to work on AI safety at MATS. I think a lot about how AI models work and how to align them. I plan to spend my life learning all the things I want to — I believe I can learn anything.

Currently, I like folding origami, singing, meeting new friends, and improving my emacs config. I feel most at home in a codebase.

Some projects

Global CoT Analysis

Tools to aggregate and analyze many chain-of-thought trajectories into structured graphs. Helps understand reasoning dynamics like cycles, strategy switching, and how intermediate states relate to final answers.

Training Reliable Activation Probes with Few Examples

Work from my time at LISA with Stefan Heimersheim. I compared probe architectures in a positive class scarcity setting, where you have 2-5 misaligned examples and thousands of aligned ones. TLDR: leveraging the abundant negative examples helps.

Harmonic Loss Trains Interpretable AI Models

Collaboration with the Tegmark Lab on a new loss function that replaces cross-entropy with Euclidean distance. We found it makes small models more interpretable and reduces grokking.

Disentangling Race Features from Retinal Images

Research at Harvard Med uncovering what features let AI infer self-reported race from medical images. Did 50+ ablation studies with CNNs and CycleGANs to figure out what's actually going on.

Deep Learning for Neonatal Cardiopulmonary Disease

Trained models to predict lung and heart disease from retinal images of premature infants. Published in JAMA Ophthalmology.

Early Parkinson's Diagnosis from Handwriting

Built ML models to detect Parkinson's by analyzing handwriting for micrographia. Got featured by NVIDIA's blog, which was pretty cool!