Our environmental data is hidden away in unmarked boxes; we need capacity to open them up.

Imagine you’ve been asked to help put together a puzzle. It’s a very large puzzle with a lot of pieces—and once you put it together, it will help you resolve an important problem. You’re told that the vast majority of the puzzle pieces you’ll need are kept behind a particular door. Naturally, you head inside and turn on the light. Before you sits a vast warehouse of cardboard boxes. Some are labeled while others are not, some are within your grasp and many others are stacked on towering shelves.

Swap the puzzle pieces in this analogy out for environmental data, and you can begin to grasp the situation faced by data leaders across the federal government who work on evidence-based environmental policymaking, especially Chief Data Officers (CDOs).

The origins of that status quo aren’t complicated: the 2018 Evidence Act created the CDO position across federal agencies, and designed the role to lead efforts to find, organize, and deliver the many “puzzle pieces” that will help us answer vital national policy questions, such as:

  • Is our water safe to drink, and how can we make it safer?

  • Are our infrastructure investments reaching communities that need them most?

  • How do we manage invasive species in a changing climate?

Back at the warehouse, the good news for CDOs is that plenty of people work with puzzle pieces all day, every day, and know parts of the warehouse very well because they helped pack and store the boxes to begin with. The bad news for CDOs is that these experts don’t report to them, and there hasn’t been a clear method for labeling and organizing the boxes for many years. Worse still, some of the puzzle pieces are for older puzzles that no one is working on anymore.

With this challenging context in mind, EPIC set out to learn more about the capacity issues environmental data leaders like CDOs face today—and to define strategies that might help them build or bolster the capacity they need when it comes to environmental data.

What We Learned

To learn more, we interviewed data leaders at environmental agencies (e.g., EPA, NOAA, the Forest Service, DOI). Here are the key things we learned from those conversations:

  • Getting buy-in on data work from every corner of the agency is challenging. Environmental agencies create “fractals of silos”—both because regional and field offices tend to have a lot of autonomy, and because scientific work itself often pushes in the direction of specialization. If you need to gain awareness of what’s spread across your warehouse—and eventually label it—you first need to get the attention of many different teams, all with different cultures and ways of operating. That process is important but time-consuming.

  • A lot of time is spent navigating cascading and overlapping policies. The Evidence Act, the Geospatial Data Act, the Nelson Memo, and other data-related policies have created layers of requirements that are not always straightforward for agency staff to navigate and implement in day-to-day work. Keeping with the warehouse analogy, this challenge is like focusing on relabeling existing cardboard boxes only to be told late in the effort that you actually need to use transparent boxes of a certain size for most puzzle pieces.

  • Dedicated communications staffing is one of the most pressing needs. Great communicators were mentioned as one of the top capacity needs for advancing data work. Data leaders are asked to communicate with many audiences, and it’s almost impossible to overcommunicate about data policies and processes or about the value of data governance for agency policy goals. Unfortunately, few CDOs have dedicated communications capacity at their disposal. That needs to change.

  • People with other job titles frequently end up stepping into data management roles. For example, many staff originally hired as biologists end up migrating into data roles, with or without formal training in data science or management. Moreover, titles and job descriptions often don’t reflect data-related roles. That matters not only internally (for accurate work and role tracking, agency talent investments, etc.) but also for signaling accurately to the “external” talent market. In our analogy, people who know their way around the warehouse probably know something about driving a forklift or delivery truck, but potential risks and inefficiencies will crop up if they haven’t been trained to take on those tasks. The same is true for data management.

  • Partnerships aren’t yet a major part of the equation. There is so much to figure out across our warehouse that there’s often little time to think about how other organizations may be able to help with the sorting, labeling, and delivery of key data. But emerging examples, such as eBird and the Trails Stewardship Initiative, show that opportunities exist for federal agencies to tap into private and non-profit capacity in tackling data governance challenges. More partnerships will help the government take advantage of these models of innovation and emerging best practices.

What Strategies Can Help?

Across our interviews, we identified several strategies that are already being employed across federal agencies to tackle these challenges. These include:

We likewise discovered examples of training that could address specific needs. These include:

Looking Ahead: Where We See Opportunities

We see three major solution areas in this space, and looking ahead, we intend to explore where EPIC can do our part to help environmental data stewards do their work more effectively: 

  1. Understanding and communicating the value of data improvements by more explicitly linking specific policies to specific data needs or issues. Capacity usually follows environmental policy goals, not data policy goals, and so making these links is crucial.

  2. Exploring tools that harness generative AI, which both require good data and can help alleviate some data governance capacity issues over the long term, for example by enabling much faster, more accurate, and more comprehensive metadata entry. We’re kicking the tires to see how these tools might be helpful, for example in our recent test of AI-assisted data extraction from Army Corps of Engineers public notices.

  3. Leveraging partnerships for better data collection, governance, and use by distilling and sharing lessons from collaborative environmental data projects—and by supporting or facilitating the creation of data intermediaries where they’re needed.

Have ideas about where we can improve data governance capacity? Did we miss something? Don’t hesitate to reach out!
