Stanford’s Big Local News platform empowers newsrooms to produce impactful data journalism
Ben Welsh is convinced that doing data journalism projects collaboratively across newsrooms is more sustainable.
He should know. Welsh, an accomplished data journalist, is former editor of the data and graphics department at the Los Angeles Times. “Rather than having to try to do something … on a shoestring, we can all benefit by working together,” said Welsh, a visiting senior data journalist at Big Local News whose work is supported by the JSK Fellowships and the Brown Institute for Media Innovation.
Housed at Stanford University’s Journalism and Democracy Initiative, Big Local News collects local public datasets and builds larger, standardized databases out of those collections, which are shared with newsrooms across the country for analysis and for telling data-driven public affairs stories. Founded in 2014 by Stanford Hearst Professional in Residence Cheryl Phillips — a longtime data journalist who previously worked at The Seattle Times — Big Local News has helped share data on topics ranging from police accountability to education to the COVID-19 pandemic. “We’re trying to lower the cost of doing that really important public service journalism,” said Phillips, Big Local News’ director.
National Public Radio (NPR) serves as a hub for audio content that member stations then distribute. In a similar way, Welsh envisions Big Local News as a hub for big data projects in local journalism. “We want newsrooms to join as members to benefit from the work we are specialized to gather and refine — those really difficult to obtain data sources that are valuable to local outlets — and then also to work with them to help them convert the data for stories,” he said.
“Most American newsrooms embrace, in theory, the importance of data journalism, but not enough have the depth of expertise to go beyond the basics. That's where Big Local News has played a vital role since its founding,” said R.B. Brenner, managing director of the Stanford Journalism and Democracy Initiative. “By creating a collaborative, open-source ecosystem, Cheryl and her team have trained and partnered with hundreds of reporters and editors. In the process, they have informed the public about issues as diverse as disparities in our criminal justice system, a national shortage of hospital beds during the scariest periods of the COVID-19 pandemic, undercounts in the 2020 Census, and much more.”
In its early days, Big Local News dove into police accountability, gathering millions of traffic stop records from law enforcement agencies across the United States to better understand where and to what extent racial bias might be occurring. Like many of its projects, the platform paired the journalistic expertise of Stanford’s Department of Communication with the computational power and technical innovation found across campus, including the Stanford Computational Policy Lab. That interdisciplinary effort — known as the Stanford Open Policing Project — has spurred dozens of stories nationwide about the public’s most common interaction with law enforcement. Using rigorous statistical analysis, the project found that Black and Hispanic drivers nationwide were more likely than white drivers to be cited, searched and arrested after being stopped.
Like the Open Policing Project, Big Local News projects often involve intensive public records requests to gather data from all 50 states or numerous jurisdictions around the country — efforts that can often be too resource-intensive for most newsrooms to tackle alone. The platform also utilizes scraping at scale and is working to gather court records and public agendas from across the country that news organizations can use for their reporting.
The Agenda Watch project has partnered with data and web consultancy DataMade to bring together agendas and minutes from public meetings around the U.S. “We’re hoping we can give reporters the ability to do research for stories and also keep better tabs on their beat by allowing them to subscribe to alerts,” said Serdar Tumgoren, Stanford Lorry I. Lokey Visiting Professor in Professional Journalism, who was previously a data journalist at The Associated Press (AP). “We want to give them the ability to keep a finger on the pulse of their community by getting a heads up if an item of interest appears on an upcoming agenda.”
“We’re hoping to provide another resource that will help cover local governments, be able to do local journalism, but also might yield patterns more broadly,” Phillips added.
Producing an investigation with data at scale is something Lisa Pickoff-White, data reporter at San Francisco public radio station KQED, has actively been working on. Pickoff-White, who is currently housed at Big Local News through the California Reporting Project, is gathering use-of-force data from California’s nearly 700 law enforcement agencies. Through these documents, she wants to understand when and where police are using force against people and whether those incidents were investigated internally or by oversight bodies.
The police force project has already required more than 2,100 public records requests to obtain documents that often take the form of written text narratives. So Big Local News is marrying the trove of text documents with machine learning and artificial intelligence to help decipher what’s happening across all these cases. “The goal is to have humans do what humans do best and computers do what computers do best, so that everybody is working together and we can bring the public information back to the public,” Pickoff-White said.
The project has already led to published reporting in California, including an investigation into police dog bites in Richmond and a look at police force that broke people’s bones in Bakersfield (in partnership with the Investigative Reporting Program at UC Berkeley’s Graduate School of Journalism). The policing work is now scaling up as part of another effort, the Community Law Enforcement Accountability Network — a partnership of data scientists, criminal justice advocates, public defenders and journalists building the underlying infrastructure to collect and extract use-of-force and misconduct information from the many reports.
In addition to newsrooms like the Los Angeles Times and Mercury News in California, Big Local News has worked with media partners across the country. To better understand the coronavirus pandemic’s impact on school enrollment for kindergartners, Big Local News analyzed public school enrollment data across more than half of the states in the nation to find where the health crisis had the largest educational impacts — a story ultimately published with The New York Times. The Biden administration referenced the enrollment project in its 2022 Economic Report of the President. Big Local News has also worked with Census Reporter, AP and Poynter to train journalists on how to analyze 2020 Census data for their work, and has published an embeddable census map to support local demographic reporting.
“Big Local News shows how local reporting can scale, bearing the fixed costs of collecting information on local agencies and outcomes, cleaning the data, and often carrying out initial analysis,” said Jay Hamilton, Hearst Professor of Communication and Stanford Journalism Program Director. “Stories that can change lives and laws go untold because of cost barriers. Big Local News significantly lowers the costs of accountability reporting through better use of data and algorithms, and shows how supporting development of tools and data can translate into stories and impact.”
While most of the data is free and publicly available, Big Local News is also building an ongoing revenue stream to support its efforts via its new Data-Plus industrial affiliates program. Supported with seed funding from The Brown Institute for Media Innovation, Data-Plus allows newsrooms to benefit directly from Big Local News’ specialized data work by tapping into additional data services, advance access to data and dedicated support. Reuters and the Los Angeles Times have already joined as inaugural members, and Data-Plus is open to additional members.
“I’ve done five to 10 or more projects collaborating across newsrooms, or just going it alone at the LA Times, and I’m just convinced that together is our only chance,” Welsh said. “We’re really looking to push the envelope and raise the bar for how much data that journalists go out there and get.”
In addition to Phillips, Tumgoren, Welsh and Pickoff-White, other seasoned data journalists work on Big Local News, including senior data journalist Eric Sagara (previously of Reveal and the Center for Investigative Reporting), Pulitzer Prize winner and senior data journalist Justin Mayo (previously of The Seattle Times) and Dilcia Mercedes (a Stanford Journalism Program alum).
More than 2,500 people have used Big Local News’ open datasets in recent years. The platform’s open data archive includes databases on subjects ranging from hospital bed capacities during the pandemic to the costs of wildfires. “The data sets are an open project that are geared for folks just learning how to use data analysis for stories, which I think is really important,” Phillips said. “They can do stories they otherwise wouldn’t be able to do necessarily.”
“Big Local News is enabling or empowering journalists to do their day-to-day work and fostering collaborative projects that typically no one newsroom could do by themselves,” Tumgoren added.