Data Critique: Eastern State Penitentiary Admission Book

The data set we analyzed comes from the Eastern State Penitentiary admission books. The books record each incoming prisoner by name, along with other features such as age, ethnicity, religion, job, birthplace, prisoner number, admission date, sentencing location, offense, length of sentence, number of convictions, and general notes. The data describes the inmates imprisoned at Eastern State Penitentiary from 1832 to 1868. While the data was collected in one specific region, the inmates' original homelands cover a large geographic area, demonstrating the changing demographics of the industrial age. For example, the inmates include not only people from the local area but also people from countries such as Holland, Germany, and Ireland. The data lacks much of the context needed to understand the sentencing procedure. One inmate was sentenced to a year for forgery while another was sentenced to 10 years for the same offense; likewise, some who committed a seemingly minor offense were given more time than those who committed murder.

The data was drawn from seven manuscripts collected by the prison staff, with comments made by the Moral Instructor, Reverend Thomas Larcombe. We believe not all manuscripts were included in the spreadsheet: the website states that the records go until 1892, but the latest date we could find in the sheet was 1868. The data is divided into rows, one per inmate, and into columns based on the different categories of information provided about each inmate, such as ethnicity and offense.

If this dataset were our only source, we would be unsure who created it and what the original layout of the information was, as the manuscript looked more like a journal than a spreadsheet. The dataset also has no standardized way of recording gender, with only a few rows including gender in the ethnicity column, especially in the later years. Ethnicity is also not included for every inmate.

The sheet was not ideal for sorting, as there was no standardized way of writing locations. For example, Bucks County, Pennsylvania was referred to as both “Bucks Co” and “Bucks, PA”. The same problem affects the length of each inmate’s sentence, as the same sentence can be written in multiple ways: one entry might read “1 year, 1 day,” while the same length appears elsewhere as “1 year & 1 day”. The sheet is also difficult to use because ethnicity, religion, occupation, and occasionally gender are all listed within the same column, meaning you cannot sort solely by one attribute such as ethnicity or occupation.
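As an illustration of how these inconsistencies could be reduced before sorting, the following is a minimal sketch in Python. The function names and the cleanup rules are our own assumptions rather than anything provided with the dataset, and the rules only cover the kinds of variation quoted above; other spellings in the sheet would likely need additional handling.

```python
import re

def parse_sentence(text):
    """Return (years, months, days) found in a free-text sentence length.

    Assumes lengths are written with digits followed by a unit,
    e.g. "1 year, 1 day" or "1 year & 1 day".
    """
    totals = {"year": 0, "month": 0, "day": 0}
    for number, unit in re.findall(r"(\d+)\s*(year|month|day)s?", text.lower()):
        totals[unit] += int(number)
    return totals["year"], totals["month"], totals["day"]

def normalize_location(text):
    """Reduce a location to its bare county name.

    Assumption: variants differ only by a trailing "Co", "County",
    "PA", or "Pennsylvania", as in "Bucks Co" and "Bucks, PA".
    """
    cleaned = re.sub(
        r",?\s*\b(co\.?|county|pa|pennsylvania)\b\.?",
        "",
        text,
        flags=re.IGNORECASE,
    )
    return cleaned.strip(" ,.").title()
```

With these helpers, “1 year, 1 day” and “1 year & 1 day” both reduce to the same tuple, and “Bucks Co” and “Bucks, PA” both reduce to the same county name, so either column could then be sorted or grouped consistently.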

There also seems to be user error in the creation of the spreadsheet, as multiple inmates are listed twice. It is also unclear whether certain spelling errors were part of the original document or were introduced during transcription. In addition, the “ethnicity, religion, profession” column should have been divided into individual columns for easier use. The date 12/28/2015 also appears in the spreadsheet for the inmate Robert Wadlow. It is unclear whether this is a transcription mistake or part of the original document, but the latter is implausible, since the Penitentiary closed in 1970.
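Checks for both of these problems could be automated. The sketch below is one possible approach, assuming the spreadsheet has been loaded as a list of dictionaries; the field names ("name", "prisoner_number", "admission_date") and the month/day/year date format are our assumptions, and the default year range simply reflects the 1832 to 1868 span we observed in the sheet.

```python
from collections import Counter
from datetime import datetime

def find_duplicates(rows, key_fields=("name", "prisoner_number")):
    """Return the key tuples that appear in more than one row."""
    keys = [
        tuple(row[field].strip().lower() for field in key_fields)
        for row in rows
    ]
    counts = Counter(keys)
    return [key for key, count in counts.items() if count > 1]

def dates_outside_range(rows, first_year=1832, last_year=1868,
                        field="admission_date"):
    """Return rows whose date is unparseable or outside the year range."""
    flagged = []
    for row in rows:
        try:
            parsed = datetime.strptime(row[field], "%m/%d/%Y")
        except ValueError:
            flagged.append(row)  # unreadable dates also need a manual look
            continue
        if not first_year <= parsed.year <= last_year:
            flagged.append(row)
    return flagged
```

Rows flagged this way would still need to be compared against the manuscript scans by hand, since the code can only show that an entry looks suspicious, not whether the error lies in the original book or in the transcription.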

Data Critique by Merissa Marthage and Clara Meyer

