I am a fifth-year JD-PhD student in Computer Science at Stanford University (advised by Chris Ré). I'm part of the Hazy Research Lab, the Stanford Center for Research on Foundation Models, and RegLab. I graduated with an MS in Machine Learning from Carnegie Mellon University ('19) and a BS (with Honors) in Computer Science from Stanford University ('18). I am grateful to be supported by the Stanford Interdisciplinary Graduate Fellowship (SIGF) and the HAI Graduate Fellowship.
My research lies at the intersection of artificial intelligence/machine learning (AI/ML) and law. My work explores four questions:
- How can ML advance our understanding of the law and legal institutions? ML provides a novel way to generate empirical insights into legal questions using large-scale datasets. In my work, I've developed methods to classify state statutes creating private rights of action, and used these methods to test theories of private enforcement at the state level. In ongoing work, I am extending these methods to identify other types of state statutes, such as officer immunities and state laws relating to foreign policy.
- How should we govern AI? I examine how AI systems, particularly in sensitive domains like healthcare, can be effectively regulated. My research includes frameworks for assessing liability for medical AI and analyses of the technical and institutional trade-offs associated with traditional regulatory interventions, such as disclosure and licensing.
- What types of legal reasoning can large language models (LLMs) perform? The answer to this question has implications for how lawyers and legal institutions (e.g., agencies, courts) think about AI adoption. I've contributed to the development of benchmarks that evaluate LLM performance on diverse legal reasoning and data tasks, including LegalBench, CaseHOLD, and LoCo.
- How can LLM performance be improved without labeled data or human supervision? I investigate techniques that leverage auxiliary information (e.g., embeddings) to enhance model performance in unsupervised or weakly supervised settings. This is particularly relevant for legal applications, where labeled data is often scarce. Examples of this work include Smoothie, Embroid, and Bootleg.