
An official website of the United States government, Department of Justice.

NCJRS Virtual Library

The Virtual Library houses over 235,000 criminal justice resources, including all known OJP works.

Predicting Recidivism Fairly: A Machine Learning Application Using Contextual and Individual Data

NCJ Number
305036
Author(s)
Eric L. Sevigny; Thaddeus L. Johnson; Jared A. Greathouse
Date Published
August 2021
Length
31 pages
Annotation

This report documents the participation of a self-described “Small Team” in the “Forecasting Recidivism Challenge” run by the U.S. Justice Department’s National Institute of Justice (NIJ), whose goal is to accelerate technical and substantive knowledge on predicting recidivism risk.

 

Abstract

As part of this competition, NIJ released data on Georgia parolees in three stages and challenged researchers to predict recidivism during the 1-year, 2-year, and 3-year periods following release from custody. The competition was judged on overall accuracy (Brier score) and on a combined measure of accuracy and fairness. Risk prediction has been criticized as unfair to racial minorities when higher predicted risk is attributed to race, because statistics indicate that Black persons are more likely to commit offenses than White persons. The competition thus emphasizes that race in itself must not be considered a criminogenic factor. Rather, criminogenic factors that are more likely to be prevalent among Black persons must be isolated in measuring crime risk for White as well as Black persons. This report describes the efforts of a self-identified “Small Team,” whose members acknowledge that they are not machine learning scholars but rather a group of applied statisticians and researchers with experience in predictive analysis and risk assessment. As a result, the team focused only on the variables that contributed most to criminal behavior. A concluding comment, however, is that “these risk assessment tools attempt to predict future behavior, and that behavior is likely to be influenced by numerous contextual factors and future life events that are unknown and uncertain. Thus, these models are not likely to ever be highly accurate.”
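
For readers unfamiliar with the scoring terms in the abstract, the following is a minimal Python sketch of the two kinds of measure. The Brier score is the standard mean squared difference between a predicted probability and the observed 0/1 outcome (lower is better). The abstract does not give the formula for the combined accuracy-and-fairness measure, so the `fair_and_accurate` function below is an assumption for illustration only: it multiplies (1 - Brier score) by (1 - the absolute gap in false positive rates between two groups), using an assumed 0.5 classification threshold. All function names, group labels, and sample values are hypothetical and are not taken from the report.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(probs) == 0 or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def false_positive_rate(
    probs: Sequence[float], outcomes: Sequence[int], threshold: float = 0.5
) -> float:
    """Share of people who did not recidivate (outcome 0) scored at or above threshold."""
    negatives = [p for p, y in zip(probs, outcomes) if y == 0]
    if not negatives:
        return 0.0  # no true negatives in this group, so no false positives
    return sum(p >= threshold for p in negatives) / len(negatives)


def fair_and_accurate(
    probs: Sequence[float],
    outcomes: Sequence[int],
    groups: Sequence[str],
    threshold: float = 0.5,
) -> float:
    """ASSUMED combined index: (1 - Brier) * (1 - |FPR gap between two groups|)."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("this sketch expects exactly two groups")
    rates = []
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        rates.append(
            false_positive_rate(
                [probs[i] for i in idx], [outcomes[i] for i in idx], threshold
            )
        )
    penalty = abs(rates[0] - rates[1])
    return (1 - brier_score(probs, outcomes)) * (1 - penalty)


# Hypothetical toy data: four people, two groups labeled "a" and "b".
probs = [0.9, 0.2, 0.6, 0.1]
outcomes = [1, 0, 0, 0]
groups = ["a", "a", "b", "b"]

print(brier_score(probs, outcomes))
print(fair_and_accurate(probs, outcomes, groups))
```

Under this assumed form, a model can have a low Brier score and still score poorly on the combined index if its false positives fall mostly on one group, which is the trade-off the abstract describes.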