Data Critiques

Data Critique: University of Pennsylvania Museum of Archaeology and Anthropology – Online Collections

This data set contains detailed information on objects in the University of Pennsylvania Museum of Archaeology and Anthropology online collections. The original data set contained over 12,500 objects from the early modern period. For this critique assignment, approximately 770 objects from the data set were analyzed using WTFcsv.

According to the museum’s website, “the online database contains over 379,499 object records representing over 1,141,045 objects (or components of objects) with 259,376 images illustrating 84,360 object records.” Essentially, the online database has over 379,000 object records that represent the objects at the museum, and about 84,000 of those records have images. The entire data set available through the museum includes all 379,499 object records.

Information in this data set was possibly pulled from two sources: the online database and the museum’s files on each object. The fields in the open access online database match the majority of the fields in the data set, including object name, title, and number, culture, provenience, site, culture area, measurements, and materials (wood, stone, etc.). Information that is included in the data set but not in the online database was presumably collected from the museum’s files (either another database or physical files). This information includes accession credit, dates made, and the emulRN. The emulRN column is not reflected anywhere in the online database; this number may come from another database used by the museum that is not open to the public.
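The comparison described above can be sketched as a set difference in Python. This is only an illustration: the column names below are hypothetical spellings of the fields mentioned in this post, and the real headers in the CSV and the labels on the museum’s website may differ.

```python
# Hypothetical column names based on the fields named in this post;
# the actual CSV headers may be spelled differently.
dataset_columns = {
    "object_name", "title", "object_number", "culture", "provenience",
    "site", "culture_area", "measurements", "materials",
    "accession_credit", "date_made", "emulRN",
}

# Fields that are visible on an object's page in the online database
# (again, assumed names for illustration).
online_fields = {
    "object_name", "title", "object_number", "culture", "provenience",
    "site", "culture_area", "measurements", "materials",
}

# Columns present in the data set but not shown online; these are the
# ones that likely came from the museum's internal files.
only_in_dataset = sorted(dataset_columns - online_fields)
print(only_in_dataset)
```

Any column that shows up in this difference is a candidate for having come from a source other than the public database.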

With the subset of data used in this post, all of the objects are housed in the American curatorial section, which tells us the objects were created or collected in the Americas. This can be supported by the native_name and culture columns. However, both columns are missing data for almost all of the objects: only 19 objects have native names and 62 objects are associated with a specific culture. While the rest of the objects lack these identifications, that does not mean they did not originate in the Americas. Looking at the culture_area column, all of the objects are associated with a region in North America, with over 300 from the Greater Southwest.
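For readers who want to check counts like these without WTFcsv, here is a minimal Python sketch of the same kind of tally. The rows are made-up placeholders, not real museum records, and the column names are assumed to match the ones mentioned above; with the real file you would read from the CSV on disk instead of the inline sample.

```python
import csv
import io
from collections import Counter

# Placeholder rows for illustration only; these are NOT real museum records.
SAMPLE_CSV = """object_number,native_name,culture,culture_area
X-1,,Culture A,Greater Southwest
X-2,,,Greater Southwest
X-3,example name,,Area B
"""


def count_filled(rows, column):
    """Count rows where the given column has a non-blank value."""
    return sum(1 for row in rows if (row.get(column) or "").strip())


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# How many objects actually have a native name or a culture recorded?
filled = {col: count_filled(rows, col) for col in ("native_name", "culture")}

# How many objects fall into each culture area?
area_counts = Counter(
    row["culture_area"] for row in rows if row["culture_area"].strip()
)

print(filled)
print(area_counts.most_common())
```

Counting filled cells per column makes it easy to see at a glance which fields are too sparse to support a claim on their own.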

Using the data set, I looked at object number 86-16-3 and tried to figure out what the object is. According to the data set, the object is a textile made around 1850 and purchased by the museum in 1986. It is described as a Men’s Ceremonial Mantle from the Andean culture and is 182 cm long and 110 cm wide. Try to picture that object in your mind. What color is it? Is it rough or soft? Who made it? How was it originally used? The data does not tell you this contextual information. However, neither does the open access online database.

The data set does not include images, where the object was published, or its exhibition history. However, this information is included in the online database. If this information was also included in the data set, how would that benefit a researcher using the data? The data set is for objects; how can an object be best described without an image? As far as I can tell, the descriptions need to be stronger and more detailed for researchers using the data, especially without images.

One Comment

  • Maeve Kane

    That’s a great point about the limits of knowing a specific object using this data, bring that up in class! That’s one of the biggest differences in thinking with data vs thinking with individual qualitative records–you get big contours with data, but not as much fine detail.