USC News

Duo breaks down big data to advance research

by Eric Lindberg
Emily Putnam-Hornstein and Jacquelyn McCroskey are developing the Children’s Data Network. (USC Photo/Eric Lindberg)

Help is on the way for people in need of social services.

Backed by an initial $1 million grant from First 5 LA, researchers at the USC School of Social Work are developing a repository to integrate data across state and local agencies, in addition to fostering ongoing collaboration among researchers, policymakers, agency administrators and community leaders, to improve services for children and their families.

“The beauty of this project is that by thoughtfully compiling this information, we can very cost-effectively start to tease out what is working, what’s not, who is being served, who is not and what changes are taking place in this population,” said Emily Putnam-Hornstein, an assistant professor working in tandem with Jacquelyn McCroskey, John Milner Professor of Child Welfare, to develop the Children’s Data Network (CDN).

First 5 LA, which launched the CDN in 2010 as a strategy to increase access to timely and accurate data about children from infancy to 5 years old, recently called for strong leaders to push the project forward.

As an accomplished researcher with decades of involvement with child development agencies in Los Angeles, McCroskey has built solid relationships with key county leaders and community groups that will be necessary to bridge information gaps among various institutions.

Putnam-Hornstein’s affiliation with the California Child Welfare Indicators Project at the University of California, Berkeley, and her past research, which has involved the linkage of child welfare data with birth and death records, made her an ideal partner to oversee the CDN.

“We have different strengths, but we’re finding enormous similarity in terms of our understanding of what the issues are and how we go about engaging people and using the data they already have,” McCroskey said.

By encouraging partnerships and expanded use of existing administrative data, she is hopeful that public officials and community leaders will be able to develop a better understanding of what leads to issues such as poor developmental outcomes or child abuse and neglect, allowing them to focus on preventive efforts rather than simply reacting to maltreatment or negative behaviors.

“I hope we can demonstrate to people that this whole idea of data mining in the human services has only begun to be explored and that there is a great deal of value in terms of guiding policy and service delivery,” she said.

Exploring the effectiveness of certain services or programs by selecting a sample of children and following them over time is difficult and expensive, Putnam-Hornstein said, particularly because it may take several years before results are available and researchers may struggle to collect information across various child and family service sectors.

By linking existing data, researchers not only avoid time-consuming and costly data collection, but they can also study an entire population of children rather than trying to develop a representative sample.

For example, during the past few years, the county’s Education Coordinating Council demonstrated how matching data from foster care, probation and 81 school districts could help improve outcomes such as academic achievement, expulsions and attendance.

Results from that project allowed policymakers to pinpoint certain geographic areas where resources could be invested to support particular students in need, a place-based approach that has been emphasized by First 5 LA.

“It was certainly light years ahead of where we had been,” McCroskey said. “But we still had to do it separately each time and for each school district.”

Putnam-Hornstein added: “We have these great exemplars of stand-alone data linkage projects, but there hasn’t been a platform for the ongoing integration of data. What a shame that we do these one-off data linkages, and yet we can’t use that integrated data to answer other questions.”

As envisioned, the CDN will serve as that platform, ultimately generating integrated statewide and local information and research on services related to child health, safety and well-being.

Affiliated researchers will be able to partner with public agencies to use linked data to explore specific issues such as child obesity, the receipt of early intervention services or the effects of budget cutbacks on child care availability in local communities.

All research proposals will be reviewed by a scientific advisory board and the agencies involved, and they will be subject to standard government and university approval processes to ensure that the identities of children and their families are protected and that a high standard of academic rigor is maintained.

“We want to ensure we are not only focused on data security but also the scientific integrity of how the data are used,” Putnam-Hornstein said.

Ensuring political neutrality is also important, she said. Rather than adopting policy positions on any issues, the CDN will focus on generating and disseminating research to inform other stakeholders, public officials, agency leaders and child advocates.

Although similar data linkage projects have been pursued elsewhere, including Western Australia and New Zealand, McCroskey and Putnam-Hornstein hope the CDN will serve as a model, not only for other regions throughout the United States but also for other areas of research beyond child health, safety and well-being, as more policymakers recognize the value of integrated data.

“We’re just beginning to figure out ways we can use existing data,” Putnam-Hornstein said. “As we get better at that, we’ll have a much more rigorous base from which to evaluate where there are service gaps and where we have really great programs that are working.”

During the coming months, they will develop the infrastructure and security protocols necessary to house the data repository at USC, which is highlighting analysis of “big data,” or large-scale data sets, as a groundbreaking strategy to advance scientific research. The duo also plans to pursue several internal research projects to showcase the potential benefits of linked data.

“This is the kind of thing that immediately appeals to geeky people, so we just want to make sure we are going beyond the people who get it right away,” McCroskey said. “We are both very enthusiastic about this. We love the idea, and we think it’s going to be a revelation for a lot of the key institutions that work with children.”
