Project Archive - Ally, Ava, Dani, Zoey


Our final project focused on the provided Adult Dataset, which is taken from 1994 census data and is meant for predicting, from multiple variables, whether an adult makes above or below 50,000 dollars annually. Our research question, as we proposed, is “Given that the adult data is drawn from 1994 census data, what inequalities might be reflected in the terms provided by the data set?” Using Poirier's Reading Data Sets methods (specifically Deconstructive and Connotative), we focused on what factors the data set included (or, more realistically, excluded) that would have changed the outcome of the data if they had been included. Most of our research focused on the data set’s “listing of attributes,” though we also checked the actual data set for information on the ratio of women to men.

In doing so we realized that the data set had a disproportionate representation of men (about twice the number of women). Using Poirier's Connotative analysis, we reasoned that if women had been more equitably represented, the data would more accurately reflect the wage disparity that exists between men and women. Using this method, we also analyzed how limited the dataset's options for race and ethnicity are. It provides only 6 options for race, and one of them is now considered a slur. One of the options for race is “other,” which is extremely broad and does not represent people equally.
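
A check like the ratio of women to men could be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code our group ran: it assumes the data is in the headerless, comma-separated adult.data layout with 15 fields per row, with sex in the tenth field (index 9) and race in the ninth (index 8). The names count_by_column and toy_row are hypothetical helpers, and the toy rows are placeholders, not real census records.

```python
import csv
from collections import Counter

# Assumption: field positions in the headerless adult.data layout.
RACE_COLUMN = 8
SEX_COLUMN = 9


def count_by_column(lines, column):
    """Count how often each value appears in one column of comma-separated lines."""
    # skipinitialspace handles the ", " separator used in adult.data
    reader = csv.reader(lines, skipinitialspace=True)
    counts = Counter()
    for row in reader:
        if len(row) <= column:
            continue  # skip blank or truncated lines
        counts[row[column].strip()] += 1
    return counts


def toy_row(sex):
    """Build a placeholder 15-field row (hypothetical, not real data)."""
    row = ["?"] * 15
    row[SEX_COLUMN] = sex
    return ", ".join(row)


# Toy input for illustration only; with the real file, pass an open file
# object instead, e.g. count_by_column(open("adult.data"), SEX_COLUMN).
sample_lines = [toy_row("Male"), toy_row("Male"), toy_row("Female")]

counts = count_by_column(sample_lines, SEX_COLUMN)
ratio = counts["Male"] / counts["Female"]
print(counts["Male"], counts["Female"], ratio)
```

Passing RACE_COLUMN instead of SEX_COLUMN would list the race options that actually appear in the data and how often each one occurs.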

For a Deconstructive analysis, our group considered other factors that would affect someone’s likelihood of making over 50k a year, such as generational wealth (which could affect the likelihood of a college education, and in turn future earnings) and ability/disability. Many of the jobs in the data set involve physical labor, and there are limitations on who can safely do such labor. The data set therefore leaves out an important demographic of the workforce.

Along the way, we also did some unrelated work, including learning how to use PowerPoint!

Overall, while analyzing this dataset, our group learned just how limited and unequal it truly is. Given its limited and unequal representation of race, sex, the workforce, and people with greater opportunity, we realized just how much a modern analysis is needed.

Term: Spring 2025
Category: Privacy & Surveillance
Short Summary: Summary of group project in Rugile's 2:00pm section.