McCourt School of Public Policy

The Green Space Data Challenge

To Registrants: The data challenge has officially begun! Please see important communications for registered participants, including registration instructions and next steps, on this page. To ensure that you have the full month to plan and conduct your analysis, we highly recommend completing this process as soon as possible.

How does access to green space impact public health? How can green space data provide actionable information about our communities and where people’s needs are or aren’t being met?

The Massive Data Institute at Georgetown’s McCourt School of Public Policy welcomes your participation in the virtual Green Space Data Challenge. Participants will have the entire month of February 2023 to demonstrate the value of green space data by creating analyses, visualizations, and new community indicators.

Access to green space is an essential need with important quality-of-life implications. It has been found to encourage outdoor recreation, human connection, and positive mental and physical health, yet access is not equitable across communities nationwide.

The Challenge

In this data challenge, you will analyze the impact of inequitable green space access on communities using various data sources on our Redivis platform in individual or team notebooks. Your goal will be to transform these green space datasets into actionable community indicators that illustrate the effects of green space across one of four dimensions: public health, public safety, effects on a specific population (e.g., by age, race, or location), or physical environment (e.g. pollution levels).

We provide six datasets directly with information on green space for you to work with for the challenge. We also link to a variety of datasets on green space, as well as subject-area indicators (e.g., for public health and public safety), from the Environmental Impact Data Collaborative; you are free to use any of these in the challenge. In addition, you may ask to use other publicly available data in your submission by sending a request by Wednesday, February 8th at 5:00 PM EST. Any data added this way will be made available to all participants through the Redivis site. See the FAQ below for more information.

You will be evaluated by judges based on the relevance, completeness, and quality of your submission. Judges include: 

Stephen T. Dickinson, PhD, Urban Health Collaborative, Drexel University
Alexander Din, United States Department of Housing and Urban Development (HUD)
Michelle Kondo, PhD, USDA-Forest Service
Jarlath O’Neil-Dunne, Spatial Analysis Lab, University of Vermont

Example Topics: 

  • Are communities in close proximity to green space more likely to have better air quality?
  • Is there a relationship between green space and social vulnerability?
  • Do communities with better green space access experience lower rates of violent crime or gun deaths?
  • How strong is the relationship between community green space and obesity?
  • What are the effects of good park systems on mortality?

Lit Review

For more ideas, see our lit review with examples of research involving green space in each of the four challenge subject areas.


Guidelines

When: Feb 1-28, 2023

Where: The data challenge is entirely virtual, with information and rules of the challenge available on this website. Updates will be posted on this website as well as emailed to registered participants.

Who you are: We welcome any participants over age 18, especially undergraduate students, graduate students, and early professional data scientists. Analysts based in academic institutions, government statistical offices, think tanks and policy labs, and community organizations are encouraged to participate.

What topics you can explore: Participants can analyze green space data in tandem with various other datasets that include health, environment, and/or public safety data.

Submissions and evaluation: Participants will conduct their analyses and submit a short project narrative that describes the research question, analytic approach, and key findings. We encourage participants to find creative ways to incorporate visualizations and other aspects of data storytelling to create a compelling narrative. In addition to their completed analysis with the indicators they used, participants will be asked to submit documentation describing each step of their process. The documentation should be detailed enough as to make the project fully replicable.

The narrative, indicator and methods, and organization and documentation of each participant’s project will be evaluated for relevance, completeness, and quality. More information on evaluation can be found in the FAQs at the bottom of this page.

Accommodation requests related to a disability should be made by 1/27/23 to pbidatachallenge@georgetown.edu.

Prize Information

There will be separate prize categories for submissions examining the effects of green space on the following subject areas:

| Prize Categories | First | Second | Third |
|---|---|---|---|
| Community Health | $5000 | $2000 | $1000 |
| Community Safety | $5000 | $2000 | $1000 |
| Specific Populations | $5000 | $2000 | $1000 |
| Physical Environment | $5000 | $2000 | $1000 |

Best Graduate Student Submission: one $1000 prize
Best Undergraduate Submission: one $1000 prize

Participating teams will be listed on MDI’s Place-Based Indicators project webpage. Winners will be invited to present their project both at a webinar hosted by the Association of Public Data Users (APDU) and at APDU’s annual conference in July 2023.

The Data

We will provide access to the following datasets. Challenge participants are welcome to bring their own data sources to be added or to use sources from our companion site, the Environmental Impact Data Collaborative (see our FAQs for more details).

A variety of datasets on community health, public safety, physical environment, and specific populations are available for use in your analysis through the EIDC and from other sources.

| Dataset | Description | Geographic Coverage | Data type(s) | Estimated difficulty |
|---|---|---|---|---|
| EnviroAtlas (US EPA) | This data shows land cover for 30 US urban areas at 1-meter spatial resolution. Land cover data present a “birds-eye” view that can help identify important features, patterns, and relationships in the landscape. | 30 U.S. Communities. See full list here. | ESRI FileGDB, CSV | 🌳🌳🌳 |
| Provision and Access to Open Spaces in Cities | Shows average share of built-up area in nine urban areas that is open public space, as well as percent of population living within 400 meters walking distance of open public space. | 9 US cities and urban areas. | CSV (various) | 🌳 |
| ParkServe Data | Comprehensive database of local parks in census-defined urban areas. | Census-defined urban areas. | ESRI Shapefile, CSV | 🌳🌳🌳 |
| Green space data by census block | Illustrates the square meters of total land per person within each census block group that is covered by green space. | National | ESRI FileGDB, CSV | 🌳🌳 |
| Climate and Economic Justice Screening Tool | Highlights disadvantaged census tracts across all 50 states, the District of Columbia, and the U.S. territories. | National (census tract) | Varies | 🌳🌳 |
| Tree Equity Score | Metric intended to help communities assess how well they are delivering equitable tree canopy cover to all residents. | National (by state) | ESRI Shapefile | 🌳 |
| PAD-US-AR | Curated dataset based on the Protected Areas Database-United States (PAD-US) from the USGS. | National | Varies | 🌳🌳 |
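As an illustration of the kind of indicator described for the census block group dataset above (square meters of green space per person), here is a minimal Python sketch. The record layout, the field names (`geoid`, `green_space_m2`, `population`), and the sample values are all hypothetical; they are not taken from the challenge datasets, whose actual columns should be checked on Redivis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockGroup:
    """One census block group record (hypothetical layout, not the real schema)."""
    geoid: str             # assumed identifier for the block group
    green_space_m2: float  # assumed: land area covered by green space, in square meters
    population: int        # assumed: resident population


def green_space_per_person(bg: BlockGroup) -> Optional[float]:
    """Return square meters of green space per resident.

    Returns None for unpopulated block groups, where a per-person
    figure is undefined.
    """
    if bg.population <= 0:
        return None
    return bg.green_space_m2 / bg.population


# Made-up sample records, for illustration only.
sample = [
    BlockGroup("A", 50_000.0, 1_000),
    BlockGroup("B", 12_000.0, 3_000),
    BlockGroup("C", 8_000.0, 0),
]

# Map each block group to its indicator value.
indicators = {bg.geoid: green_space_per_person(bg) for bg in sample}
```

Returning `None` for unpopulated block groups, rather than zero, keeps them from being read as places with no green space when the indicator is later mapped or averaged.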

FAQs

A: You can choose to either work individually or work within a team of up to four people, including yourself.

A: Yes, each individual or team is limited to one submission. One person cannot submit multiple entries or participate through multiple teams.

A: Yes, in fact, it is encouraged! You will not be able to download the data from Redivis, but you are welcome to share your findings with others.

A: You may submit a request to add additional public data to the Challenge workspace. Note: The data must be made available to all participants. To submit a request to add public data to the Challenge, please email pbidatachallenge@georgetown.edu by Wednesday, February 8th at 5:00 PM EST.

A: Yes! You can use any dataset that’s publicly available on Redivis (even if it’s not from the Data Challenge or EIDC) that’s relevant for your analysis. If you plan to do so, we recommend running the dataset you’re considering by us to confirm that it’s a good fit for the challenge.

A: Each judge will use the same rubric to score submissions – scoring is based on a range of factors (e.g., the overall narrative, methods used, visualization, and creativity). Participants will conduct their analyses and submit a short project narrative that describes the research question, analytic approach, and key findings. We encourage participants to find creative ways to incorporate visualizations and other aspects of data storytelling to create a compelling narrative.

The effort and depth shown in the project narrative in answering the research question will be given the most weight, followed by the quality and utility of the resulting indicator or finding. The final assessment criterion is participants’ organization and level of documentation.

A: We expect a broad range of experience – with some users with minimal data experience and others with some background. This competition is aimed at users who are still in college (undergraduate or graduate) or are early career professionals, but anyone is welcome to participate. Substantial experience working with these types of data is not required.

A: Submissions can only be considered for one subject area. If your submission touches on multiple subject areas, please indicate clearly which area you would like your project to be considered for.

A: We will be reaching out to winners with results in late March.

A: Only the submissions that place in the competition will be publicized. The Massive Data Institute will arrange a webinar with APDU where competition winners will have an opportunity to present their work.
