I am a fifth-year JD-PhD student in Computer Science at Stanford University (advised by Chris Ré). I'm part of the Hazy Research Lab, the Stanford Center for Research on Foundation Models, and RegLab. I graduated with an MS in Machine Learning from Carnegie Mellon University ('19) and a BS (with Honors) in Computer Science from Stanford University ('18). I am grateful to be supported by the Stanford Interdisciplinary Graduate Fellowship (SIGF) and the HAI Graduate Fellowship.
My research lies at the intersection of artificial intelligence/machine learning (AI/ML) and law. Most of my work can be organized into four buckets:
- Using machine learning to study the law and legal institutions. ML provides a useful set of techniques for large-scale analysis of legal corpora: cases, statutes, regulations, and more. I apply these techniques to substantive legal questions (often with a focus on civil procedure). For instance, my prior work has tested theories of private enforcement at the state level by building a database of state private rights of action (U. Pa. L. Rev. 2024). In ongoing work, I am extending these techniques to build an open and accessible database of state laws, annotated with a broad spectrum of relevant features.
- Studying questions around AI governance. I'm interested in how technical nuances of AI affect governance. My recent work has examined trends in how courts have approached liability for medical AI (NEJM 2024), and the technical and institutional trade-offs of commonly proposed regulatory interventions (GW L. Rev. 2024). In an ongoing project, I am examining how informational and structural properties of AI applications influence regulatory design across different application and use contexts.
- Measuring the legal capabilities and usefulness of large language models (LLMs). I work on building benchmarks to assess how well LLMs can perform different legal tasks. Measurement on legal tasks serves as a useful proxy for assessing reasoning capabilities generally, and also clarifies opportunities for real-world use. Examples of this work include LegalBench (NeurIPS 2023), CaseHOLD (ICAIL 2021), RAG benchmarks (CS&Law 2025), and LoCo (ICML 2024).
- Improving LLM performance when labeled data is scarce. Producing labeled data for tasks is often expensive or impractical. I've studied how information contained in preexisting auxiliary sources can be extracted and leveraged to improve model performance on specific tasks. Examples of such sources include pretrained embedding models (e.g., BERT) and knowledge bases (e.g., Freebase, Wikipedia). Examples of this work include Smoothie (NeurIPS 2024), Embroid (NeurIPS 2023), and Bootleg (CIDR 2021).