The California Reporting Project

A police dog in San Jose bit a toddler while an officer arrested a sobbing teenager and her mother. San Diego police officers shot and killed a 21-year-old refugee from Myanmar who had a mental illness.

These people are among more than 4,000 seriously injured or killed by law enforcement officers in California between 2016 and 2024. Although local governments collect detailed reports about these uses of force, including the results of internal investigations and disciplinary actions, they don’t release this information to the public in a systematic way.

But in 2018, California legislators passed a new law enabling people to request records about serious uses of force and misconduct by police. My goal is to unlock the information in these now-public records and transform it into data so people can easily look up information about a specific use of force, officer or agency.

I knew that agencies would not release the records without a coordinated request effort. I also knew that we needed to identify small jurisdictions to track officers working in places like welfare departments or rail stations. So I joined other reporters to form a collaboration requesting records from more than 700 law enforcement agencies and their overseers.

Since then, I’ve guided the collaboration’s technical decisions. We’ve tracked and stored more than 20 TB of files to date. Starting from an initial 144 data points requested by reporters, I designed a review process and built a Django website to support it: two researchers each answer a limited set of questions about each case, and the system flags discrepancies for editors. Editors also link named officers to state staffing data so we can track where officers worked. This data has been used to report stories across California and as training data.
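The double-entry check described above can be illustrated with a minimal sketch. This is not the project's actual Django code. The function name and the question fields are hypothetical, and it assumes a "discrepancy" simply means that the two researchers gave different answers to the same question.

```python
def flag_discrepancies(first: dict, second: dict) -> list[str]:
    """Return the questions on which two researchers' answers differ.

    A question answered by only one researcher is also flagged.
    """
    questions = sorted(set(first) | set(second))
    return [q for q in questions if first.get(q) != second.get(q)]


# Hypothetical answers from two researchers reviewing the same case.
researcher_a = {"force_type": "firearm", "subject_injured": "yes", "officers_named": 2}
researcher_b = {"force_type": "firearm", "subject_injured": "no", "officers_named": 2}

# Only the questions listed here would need an editor's attention.
flagged = flag_discrepancies(researcher_a, researcher_b)
print(flagged)
```

In a design like this, editors review only the flagged questions rather than every answer, which is one plausible reason to keep the set of questions limited.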

Now, I’m working with computer engineers at UC Berkeley, Stanford and other organizations through a national collaboration, the Community Law Enforcement Accountability Network, to automate record organization and data extraction using LLMs. Together we’re setting standards and processes for how groups can track and extract data from police records across the country.