LexGLUE: A Benchmark Dataset for Legal Language Understanding in English Paper • 2110.00976 • Published Oct 3, 2021
Precise Legal Sentence Boundary Detection for Retrieval at Scale: NUPunkt and CharBoundary Paper • 2504.04131 • Published Apr 5, 2025
KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications Paper • 2503.17247 • Published Mar 21, 2025
The KL3M Data Project: Copyright-Clean Training Resources for Large Language Models Paper • 2504.07854 • Published Apr 10, 2025
GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities Paper • 2301.04408 • Published Jan 11, 2023
Crowdsourcing accurately and robustly predicts Supreme Court decisions Paper • 1712.03846 • Published Dec 11, 2017
A General Approach for Predicting the Behavior of the Supreme Court of the United States Paper • 1612.03473 • Published Dec 11, 2016