Owen McCarty & Sarah Scott

The data set we chose is the United States Census for the city of Albany between 1850 and 1940. The federal census is taken once every ten years, so this data set covers 10 censuses of the city. Because Albany is a large city and the censuses span 90 years, the amount of data is enormous. The Excel spreadsheet has 708,786 rows, each representing a person. Out of curiosity, I tried scrolling to the bottom manually, and it took well over five minutes.

The US census began in 1790, and the counts were taken by US Marshals until the Census Bureau was formed. Because there was no overarching federal entity, much of the census taking was done at the state and local level. Enumerators would go house to house and gather as much information as they could. The 1850 census was the first to list every member of the household, not just the head of the household. Given the time period, there was also little technology available to speed up the collection and counting process.

