NIJ Recidivism Forecasting Challenge Report for Team PASDA

NCJ Number
305042
Author(s)
Michael Porter; George Mohler
Date Published
2021
Length
14 pages
Annotation

This is the report submitted by Team PASDA in the 2021 NIJ Recidivism Forecasting Challenge, a competition hosted by the U.S. Justice Department’s National Institute of Justice (NIJ), with the goal of “increasing public safety and improving the fair administration of justice across the United States.”

Abstract

The challenge focused on data from the State of Georgia on individuals released from prison to parole supervision between January 1, 2013, and December 31, 2015. Challenge participants were tasked with building a predictive model of 1-, 2-, and 3-year recidivism after release from prison, using variables such as age, gender, race, education, and prior arrests and convictions, among others. In reporting which variables were statistically significant, the team drew some of them from the top-performing CatBoost model. Several of the handcrafted features were top predictors, including total arrests normalized by release age and the difference between the percentage of days employed and jobs per year. The team considered several model families throughout the competition: penalized linear models (lasso, ridge, elastic net, and relaxed lasso), generalized additive models (GAM), boosted trees (GBM, XGBoost, and CatBoost), and bagged trees (random forest). Select interaction effects were considered in the linear models. The report’s conclusion considers whether the team’s work yields practical or applied findings that could help the field. The team advises that event-level data available after parole appeared to provide stronger features than static demographic data, which suggests that generating a good feature set will be important for building accurate forecasts. This is consistent with recent research in which humans can outperform models with limited features, whereas algorithms outperform humans when the feature set is expanded.
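As a rough illustration of the two handcrafted features named above, the following minimal Python sketch computes them for a single record. It is not the team's code: the class name, field names, units, and the guard on release age are all assumptions made for this example and do not reflect the challenge dataset's actual column names or encodings.

```python
from dataclasses import dataclass


@dataclass
class ParoleRecord:
    # Hypothetical field names and units, chosen only for this sketch.
    total_arrests: int
    release_age: float        # age at release, in years
    pct_days_employed: float  # fraction of supervised days employed
    jobs_per_year: float      # number of jobs held per year


def handcrafted_features(rec: ParoleRecord) -> dict:
    """Compute the two features described in the abstract for one record."""
    if rec.release_age <= 0:
        raise ValueError("release_age must be positive")
    return {
        # Total arrests normalized by release age.
        "arrests_per_release_age": rec.total_arrests / rec.release_age,
        # Difference between percentage of days employed and jobs per year.
        "employment_minus_jobs": rec.pct_days_employed - rec.jobs_per_year,
    }


example = ParoleRecord(
    total_arrests=10,
    release_age=40.0,
    pct_days_employed=0.75,
    jobs_per_year=0.5,
)
features = handcrafted_features(example)
```

Features of this kind could then be supplied, alongside the raw variables, to any of the model families listed above.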