My data set is composed of entries pulled from the Digital Public Library of America relating to the search term “Adirondacks.” Each row contains the link to an image, what organization contributed the image, the subject of the image, the image’s original creator, the medium, and a description of the image’s content. Not every entry contains all of this information. The images come from many sources, but the majority of the content comes from the Adirondack Museum and the New York Public Library.
The images are a mix of paintings, photographs, and scanned documents. There are some duplicate entries in the data set, and there are some double-sided documents that are listed as separate entries. There are also multiple columns that contain descriptions of the items, and the way in which some of the information is sorted can make it hard to determine what it means without looking at the original archive page. (For instance, names appear in a few separate columns, making it hard to determine which ones might be artists/photographers or contributors.) The data set currently has 2,659 rows. Because the data set was created based on the Digital Public Library of America’s archive, and not on, say, a transcribed historic document, there are few misspellings, and the way in which information, such as the contributor, is written is relatively uniform.
I anticipate having to do a good deal of data cleaning. There are currently around 38 columns, many of which contain the same kinds of information or information that could easily be combined. I intend to merge many of these columns. I believe this will require looking back often at the items’ original archive pages, especially for the location information. Some entries contain multiple locations, all within the Adirondacks, so it is hard to tell which is the location of the image’s creation. As for the double-sided documents, I’d like to find some way to make sure they stay grouped together.
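One way the column merging and duplicate removal described above might be approached is sketched below in Python. This is only a rough illustration: the column names (`image_url`, `description_a`, `description_b`) and the sample rows are hypothetical placeholders rather than the actual DPLA export headers, and the sketch does not address grouping double-sided documents.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def merge_columns(row: Row, sources: List[str], target: str, sep: str = "; ") -> Row:
    """Combine the non-empty, non-repeated values of several columns into one."""
    merged = dict(row)  # copy so the original row is left untouched
    values: List[str] = []
    for name in sources:
        value = merged.pop(name, "").strip()
        if value and value not in values:
            values.append(value)
    merged[target] = sep.join(values)
    return merged


def drop_duplicates(rows: Iterable[Row], key: str) -> List[Row]:
    """Keep the first row seen for each value of `key`; rows with no key are kept."""
    seen = set()
    kept: List[Row] = []
    for row in rows:
        value = row.get(key, "").strip()
        if value and value in seen:
            continue
        seen.add(value)
        kept.append(row)
    return kept


# Hypothetical sample rows; the real export has around 38 columns.
rows = [
    {"image_url": "http://example.org/1", "description_a": "Lake view", "description_b": "Lake view"},
    {"image_url": "http://example.org/1", "description_a": "Lake view", "description_b": ""},
    {"image_url": "http://example.org/2", "description_a": "Logging camp", "description_b": "Men with axes"},
]

cleaned = [
    merge_columns(row, ["description_a", "description_b"], "description")
    for row in drop_duplicates(rows, "image_url")
]
```

Treating the image link as the duplicate key is an assumption; if the same image appears under different links, a different key (or a manual check against the archive page) would be needed.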
For this project, I want to explore the shifts of the public’s perspective on the Adirondack region throughout the area’s history. Based on the little information I’ve already found, the way that people have viewed the Adirondacks has been altered by shifts in American understanding of wilderness, industry, and environmental protection. I aim to look more closely at when these shifts occurred and to determine what other shifts might have been happening at the same time. (For example: shifts in American art movements causing increasing or decreasing interest in Adirondack landscapes, growing interest in the American West causing a decline in interest in the Adirondacks, etc.)
- I also want to make a symbol map that allows users to see where each image is from. I want this map to be filterable by criteria such as year, medium, and possibly even subject (landscape, industry, etc.). Moving forward with this project, I think it could be interesting to have a way for viewers to submit their own historic pictures of the Adirondacks, such as old family photos. I would like the map to also be filterable by the source of the images (DPLA or submissions).
- I also think it could be interesting to use a line graph to look at the difference in the number of times various regions within the Adirondacks are mentioned in the entries.
- I want to make a word cloud that uses the subjects and image descriptions to determine what terms are most prevalent.
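The counts behind the word cloud and the region comparison listed above could be produced along these lines. This is a minimal sketch in Python using only the standard library; the stop-word list, the region names, and the sample descriptions are illustrative assumptions, not values taken from the data set.

```python
import re
from collections import Counter
from typing import Iterable, List, Set

# A deliberately tiny, assumed stop-word list; a real one would be longer.
STOPWORDS: Set[str] = {"the", "a", "an", "of", "and", "in", "on", "at", "to", "with", "near"}


def term_counts(texts: Iterable[str], stopwords: Set[str] = STOPWORDS) -> Counter:
    """Count how often each word appears across all texts, skipping stop words."""
    counts: Counter = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts


def region_mentions(texts: Iterable[str], regions: List[str]) -> Counter:
    """Count how many entries mention each region (case-insensitive substring match)."""
    counts: Counter = Counter()
    for text in texts:
        lowered = text.lower()
        for region in regions:
            if region.lower() in lowered:
                counts[region] += 1
    return counts


# Hypothetical descriptions standing in for the subject and description columns.
descriptions = [
    "View of Lake Placid in winter",
    "Logging camp near Saranac Lake",
    "Lake Placid village",
]

words = term_counts(descriptions)
regions = region_mentions(descriptions, ["Lake Placid", "Saranac Lake"])
```

Substring matching is the simplest choice here and will miss spelling variants or abbreviations of place names, so the region counts should be read as a first approximation.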
McHale, Ellen. Amusements, Summer Camps & Dude Ranches: a Guide to the Historical Records of the Warren County Tourist Industry of the Southern Adirondacks in New York State. Glens Falls: The Center for Folklife, History & Cultural Programs at Crandall Public Library, 1997.
Schneider, Paul. The Adirondacks: a History of America’s First Wilderness. New York: Henry Holt, 1997.
Tatham, David. Winslow Homer in the Adirondacks. Syracuse: Syracuse University Press, 1996.
Terrie, Philip G. Contested Terrain: a New History of Nature and People in the Adirondacks. Syracuse: Syracuse University Press, 2008.
Terrie, Philip G. Forever Wild: a Cultural History of Wilderness in the Adirondacks. Syracuse: Syracuse University Press, 1994.
April 2nd – Data Cleaning Complete
April 9th – Wireframe Done, beginning entry on map
April 16th – Word Cloud done, research done, continued work on map
April 30th – Essay portions 1st Draft Done
May 16th – Done