I'm a PhD student at the
Princeton Center for Information Technology Policy, advised by
Arvind Narayanan. Previously, I earned a B.Sc. from the Technical University
of Munich (TUM) and an M.Sc. from the Hertie School.
I am interested in developing rigorous evaluation frameworks for AI agents, with a focus on making them more applicable
to real-world tasks. I also work on bridging the gap between AI research and policymaking, with the aim of improving
decision-making in public-sector applications.
CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark. arXiv preprint 2409.11363 (2024)
AI Agents That Matter. arXiv preprint 2407.01502 (2024)
(* indicates equal contribution)