HMDA Mortgage Disclosure Data

We chose to analyze data published under the Home Mortgage Disclosure Act (HMDA), a law enacted in 1975 that requires banks to publish information on the loans they have given out. The act was created to help identify systemic discrimination within the US home mortgage system. Because the dataset is massive, we filtered it to the lending activity of Bank of America in the state of Oregon during 2024. The filtered dataset consists of 99 columns and 2252 rows, with each column representing a specific variable and each row representing an individual mortgage. As a group, we decided that race and ethnicity would likely be the biggest factors in discrimination, so we focused on the equitability of these data fields.
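The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the exact procedure we ran: the column names `activity_year`, `lei`, and `state_code`, the lender identifiers, and the sample rows are all assumptions made up for the example.

```python
import csv
import io

# Tiny made-up sample standing in for the full HMDA file.
# Column names and lender identifiers are assumptions for illustration.
SAMPLE_CSV = """activity_year,lei,state_code,loan_amount
2024,LEI_BANK_A,OR,250000
2024,LEI_BANK_B,OR,310000
2024,LEI_BANK_A,WA,180000
2023,LEI_BANK_A,OR,205000
2024,LEI_BANK_A,OR,420000
"""


def filter_rows(rows, lei, state, year):
    """Keep only rows matching one lender, one state, and one year."""
    return [
        row
        for row in rows
        if row["lei"] == lei
        and row["state_code"] == state
        and row["activity_year"] == str(year)  # csv values are strings
    ]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
subset = filter_rows(rows, lei="LEI_BANK_A", state="OR", year=2024)
print(len(subset), "of", len(rows), "rows kept")
```

With a real export, the in-memory sample would be replaced by the downloaded file and the placeholder identifier by the lender's actual identifier.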

We first performed a deconstructive analysis of the dataset, looking at specific descriptors such as applicant_race and applicant_ethnicity. In the applicant_race field, we found only 18 possible values, 13 of which describe Asian and Pacific Islander groups, while values such as White and Black/African American have no further options or descriptors. For the applicant_ethnicity field, the only options available are Hispanic or Latino, with a few broad descriptors, or Not Hispanic or Latino. This skew in representation creates blind spots that keep us from seeing unethical practices, even though exposing them was the purpose for creating this dataset. If groups that face discrimination are placed in the same category as groups that do not, the discrimination against that specific group is diluted in the data.
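The tally of allowed applicant_race values can be reproduced with a short sketch like the one below. The list of values is our own transcription, and the grouping labels on the right are ours, so both are assumptions that should be checked against the official HMDA documentation before being relied on.

```python
from collections import Counter

# Assumed transcription of the allowed applicant_race values (keys),
# each mapped to a broad group label chosen for this example (values).
ALLOWED_RACE_VALUES = {
    "American Indian or Alaska Native": "American Indian or Alaska Native",
    "Asian": "Asian / Pacific Islander",
    "Asian Indian": "Asian / Pacific Islander",
    "Chinese": "Asian / Pacific Islander",
    "Filipino": "Asian / Pacific Islander",
    "Japanese": "Asian / Pacific Islander",
    "Korean": "Asian / Pacific Islander",
    "Vietnamese": "Asian / Pacific Islander",
    "Other Asian": "Asian / Pacific Islander",
    "Native Hawaiian or Other Pacific Islander": "Asian / Pacific Islander",
    "Native Hawaiian": "Asian / Pacific Islander",
    "Guamanian or Chamorro": "Asian / Pacific Islander",
    "Samoan": "Asian / Pacific Islander",
    "Other Pacific Islander": "Asian / Pacific Islander",
    "Black or African American": "Black or African American",
    "White": "White",
    "Information not provided": "Not provided",
    "Not applicable": "Not applicable",
}

# Count how many allowed values fall under each broad group.
values_per_group = Counter(ALLOWED_RACE_VALUES.values())

print("total allowed values:", len(ALLOWED_RACE_VALUES))
for group, count in values_per_group.most_common():
    # A count of 1 means the group has no finer-grained descriptors.
    print(f"{group}: {count} value(s), {count - 1} sub-descriptor(s)")
```

Under this transcription, the Asian and Pacific Islander entries account for 13 of the 18 values, while White and Black or African American each appear once with no sub-descriptors.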

Using Koopman’s method of examining the micro level of the format, we identified which values are allowed in each data field. We then used Poirier’s method of deconstructive reading to find which values are left out of the dataset. Many races and ethnicities are not represented, which reinforced the findings from our deconstructive analysis. Through this analysis, we found that the broad categories used for race and ethnicity are not accurate descriptors and misrepresent many groups by folding them into broader categories. As a result, these groups are not accurately reflected in the data, leaving us unable to identify discrimination against specific groups such as Black and Middle Eastern individuals.
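To make the dilution effect concrete, here is a small worked example. The subgroup names and all counts are invented for illustration and do not come from the HMDA data; the point is only to show how a high denial rate in a small subgroup can disappear once it is reported under a broader category.

```python
def denial_rate(denied, applications):
    """Share of applications that were denied."""
    return denied / applications


# Invented counts for two hypothetical subgroups that the data
# format would report under a single broad category.
subgroups = {
    "subgroup_a": {"applications": 900, "denied": 90},
    "subgroup_b": {"applications": 100, "denied": 40},
}

# Rate for each subgroup on its own.
per_group = {
    name: denial_rate(g["denied"], g["applications"])
    for name, g in subgroups.items()
}

# Rate seen when only the broad category is recorded.
total_apps = sum(g["applications"] for g in subgroups.values())
total_denied = sum(g["denied"] for g in subgroups.values())
combined = denial_rate(total_denied, total_apps)

for name, rate in per_group.items():
    print(f"{name}: {rate:.0%}")
print(f"combined category: {combined:.0%}")
```

With these invented counts, the subgroup rates are 10% and 40%, but the combined category shows only 13%, so the smaller subgroup's much higher denial rate is nearly invisible.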

In conclusion, we found that Black and Middle Eastern individuals are less likely to get mortgage options. We also found that the loans given to these groups are often much worse than those given to their white counterparts. Without proper definitions for these categories, our ability to see discrimination within the dataset is obscured.

Term and Year
Winter 2026
Category
Bias & Equality
Short Summary

The HMDA mortgage dataset consists of 143,349 rows and 99 columns of data from Oregon in 2024, including variables such as race, ethnicity, sex, interest rate, and property value, which determine the loans given to applicants.