A few months ago, Professor Amy O’Hara, a research professor at the Massive Data Institute at Georgetown University's McCourt School of Public Policy, was feeling a little stuck.
For the previous two years, O’Hara—an expert in data governance—and her collaborator Tanina Rostain—a professor at Georgetown University Law Center and an expert on access to justice—had been developing an ambitious proposal to help make the civil justice system in the United States more equitable, but they had come to a standstill.
To those uninitiated in the world of civil justice, their idea—to create a central, standardized, aggregated, and anonymized repository of court data for researchers and judicial institutions to access—might not sound visionary. That is, until you learn that unlike the criminal justice system, which requires courts to report data for the compilation of crime statistics, U.S. civil cases have no such requirement.
That was certainly news to O’Hara, who over the course of her 17-year career has worked on big data initiatives in employment, health, and housing, and had assumed that the management of civil justice data must run along similar lines. When Rostain, who studies the civil justice system, suggested collaborating to make more data available to researchers, O’Hara didn’t foresee any major issues—until she began looking into it.
“When I came into the field, I realized there’s hardly any research, because there's hardly any data,” O’Hara said. “And the data that is available is held in the courts where the cases are handled. It means we have a very limited understanding of who is passing through the system, the outcomes, and the impacts of being involved in civil cases, as well as how efficiently and fairly the courts are dealing with them.”
For O’Hara and Rostain, the potential for a “Civil Justice Data Commons” is huge. It could unlock the kinds of insights that could improve access to civil justice in local communities, protect the privacy of data subjects, and help the courts become more efficient and accountable. It is also a huge challenge that requires a sophisticated technological solution and complex coalition building across the civil justice space.
“We interviewed dozens of court personnel, judges, and nonprofits as part of our initial fact-finding,” said O’Hara. “We had all this information, but we couldn’t quite figure out how to move forward.”
They knew they needed advanced data storage and analytics technology to help them realize their vision. They engaged with Amazon Web Services (AWS) to discuss cloud computing services—a collaboration that began with AWS guiding them through Amazon’s “working backwards” approach to innovation. Amazon uses this approach to come up with its own products and services, focusing on customer needs first, rather than technical solutions.
“Going through the Amazon working backwards process was the real turning point,” said O’Hara. “It helped us carve off everything we were doing that was unnecessary and focus on who our customer was, how they would interact with the data commons, and how we would match their needs with Georgetown’s capabilities.”
“We had multiple sessions with folks from AWS, and each time, either myself or one of my colleagues had at least one epiphany, when we thought, ‘This is what we're trying to do.’”
Rostain agreed, calling the process “revelatory.”
“We were able to map all the steps to address the barriers faced by courts to sharing data and to provide researchers with fast, frictionless, and facilitated access to data,” she said.
O’Hara and Rostain hope that getting researchers the data they need, as well as encouraging more standardization of data, will lead to the identification of trends that could inform policies with potentially life-changing effects for people and local communities.
“Let’s take evictions as an example,” said O’Hara. “A recent study by some of our colleagues at Georgetown found that the fee to file for an eviction in Washington, D.C., is so low that many landlords end up taking someone to court simply to collect rent.”
While for the landlord this is just a question of how best to get the money they’re owed, for the individual, it could be the difference between whether they will ever be able to rent again.
“Once you’re on an eviction file, there’s a mark on your record that follows you around,” said O’Hara. “Maybe the next time you complete a rental application, that landlord will run your name through a list provided by some unknown vendor, find that you’ve been previously evicted, and put you at the bottom of the pile. Where is the chance to redeem yourself? There’s no policy tool for tidying that up.”
“If there was proper data on this, it would be possible, for example, to investigate the price point at which a landlord is prepared to take someone to court, as opposed to finding other less extreme methods of getting tenants to pay rent.”
O’Hara explained that in some jurisdictions, eviction proceedings might start four days after someone has missed a rent payment, giving people little time to engage with a local nonprofit that’s administering emergency rent assistance or get all the paperwork together to prove their income is low enough to qualify for a fee waiver.
“From an access to justice point of view, more transparency in the system would make it easier to ensure individuals are aware of the resources that are available to them, such as financial or legal aid, which could lead to better outcomes,” she added.
“These are the kind of big picture issues we would one day like to address, but you can’t address anything if you don’t have anything to measure.”
The Amazon working backwards process, which starts by asking who your customer is and what problem you’re trying to solve for them, helped O’Hara and Rostain identify two major pain points for researchers when it comes to this kind of measurement—negotiating access to data and dealing with undocumented, messy information.
“Say you’re studying consumer debt,” said O’Hara. “You’re probably looking at an 8- to 12-month process for obtaining data. You must request specific information from individual courts, bearing in mind it was collected for court operations, not for secondary analysis, so it won’t necessarily be in a usable format.”
“Some documents might be very clearly categorized, while others aren’t. Until you go into the document, you’ll have no idea if it’s a business-to-business case, like a builder suing a drywall provider, or a consumer case, such as a car company suing someone for not keeping up their car payments.”
“Not all documents are machine readable, either. Perhaps the address is handwritten, or maybe there’s a staple obscuring it. You will need to manually sift through thousands of PDFs to find the kind of extracts you’re looking for.”
Beyond accelerating academic research, O’Hara and Rostain hope to incentivize the courts to participate in data sharing by offering them value in return, from helping them gain greater insights into their operations to alleviating the burden of providing bespoke data extracts for each research request they receive.
“Balancing the needs and concerns of the data providers is another central component to this project,” said O’Hara. “We must be clear that this is not for marketing or surveillance purposes, and that there will be proper guardrails in place for when researchers request data to help improve court systems.”
“When something gets to court, it represents a kind of crisis point,” she said. “One long-term goal of this project is to get a better understanding of how we help people survive that crisis and thrive beyond it, not to mention how to avert it in the first place.”
“We know this is a hard problem to remedy,” she added. “But we remain optimistic. We have an opportunity to pull together all the parts—from technical, to governance, to subject matter—and create something that has the potential to benefit everyone involved in the civil justice system.”
With support from The Pew Charitable Trusts, the team at Georgetown’s McCourt School of Public Policy and the Institute for Technology Law and Policy at Georgetown Law built a demonstration project using the kind of eviction data O’Hara described to test the Civil Justice Data Commons concept. They’re now moving on to a second phase, using consumer debt data, to develop it even further.
The Georgetown Civil Justice Data Commons is the kind of project AWS will work on through its first AWS Innovation Studio, where Amazon experts will collaborate with public sector organizations virtually and at Amazon’s Arlington HQ2. The Studio will focus on finding ways to address some of the world’s most pressing societal issues—from climate change, to housing insecurity, to health and education inequality—taking organizations through Amazon’s working backwards process and applying its wide range of technologies, including artificial intelligence, data analytics, and machine learning.