SRLLC, one of the winners of the National Institute of Justice’s (NIJ’s) Recidivism Forecasting Challenge, presents an overview of the methodology it used to develop the program that generated its winning submission, and discusses the quantitative importance of features within the dataset and sources of error.
At the outset of the project, SRLLC decided to develop a commercially applicable translation layer that converts the contents of any Pandas dataframe into a form suitable for supervised training of a TensorFlow artificial neural network. This library produced the first-year scores for which SRLLC received a cash prize. Most of the time spent on the project went to developing the library, with very little work specific to the dataset provided by NIJ. The report’s chapters address materials, methodology, and sources of error. The concluding statement leaves interpretation of the report’s numbers to the reader, noting only that a higher number in the table corresponds to a higher importance of that feature column. The report recommends considering only the training set numbers, on the grounds that it is irrelevant whether SRLLC’s model generalizes without a given feature being available to it. 2 tables
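As an illustration of what a dataframe-to-network translation layer of the kind described above might do, the following is a minimal sketch and not SRLLC's library. The helper name `frame_to_arrays`, the column names, and the choices of median imputation, standardization, and one-hot encoding are all assumptions made for this example. It returns plain float32 arrays, which is one form a TensorFlow model could accept for supervised training.

```python
import numpy as np
import pandas as pd


def frame_to_arrays(df: pd.DataFrame, target: str):
    """Hypothetical helper: convert a mixed-type dataframe to float32 arrays.

    Assumed preprocessing (not taken from the report):
    numeric columns are median-imputed and standardized,
    non-numeric columns are one-hot encoded with an extra "missing" column.
    """
    features = df.drop(columns=[target])

    # Numeric columns: fill gaps with the column median, then standardize.
    numeric = features.select_dtypes(include="number").astype("float64")
    numeric = numeric.fillna(numeric.median())
    std = numeric.std(ddof=0).replace(0.0, 1.0)  # avoid division by zero
    numeric = (numeric - numeric.mean()) / std

    # Non-numeric columns: one-hot encode, keeping missing values as a category.
    categorical = features.select_dtypes(exclude="number").astype("category")
    one_hot = pd.get_dummies(categorical, dummy_na=True, dtype="float64")

    encoded = pd.concat([numeric, one_hot], axis=1)
    X = encoded.to_numpy(dtype="float32")
    y = df[target].to_numpy(dtype="float32")
    return X, y, list(encoded.columns)


# Toy data with made-up column names; this is not the NIJ dataset.
toy = pd.DataFrame(
    {
        "age_group": ["18-22", "23-27", None, "18-22"],
        "prior_arrests": [0, 3, 1, np.nan],
        "reoffended": [0, 1, 0, 1],
    }
)
X, y, names = frame_to_arrays(toy, target="reoffended")
print(names)
print(X.shape, y.shape)
```

Returning the encoded column names alongside the arrays keeps each input position traceable to a source column, which is what a per-feature importance table would need.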