Examining and Predicting Student Dropout Rates

Problem Statement:

The "Predict Students' Dropout and Academic Success" data set was created to help reduce the number of dropouts and failures in higher education through machine learning techniques.

Methods:

We will prioritize Poirier’s method for reading data sets, because it calls for the critical, detailed approach this analysis needs. To examine the claim that the “Predict Students’ Dropout and Academic Success” data set was created to reduce the number of dropouts and failures, we will apply Poirier’s method while also taking a head-on approach, as Koopman does. We will think critically by asking ourselves, “What does this data tell us?” Asking that question draws attention to deeper meanings in the data itself.

Biggest Predictors of Success:

  • Students whose tuition payments are made on time
  • Students whose parents graduated or otherwise attained an education
  • Students who attend classes regularly
  • Students who were given scholarships

 

Data Analysis: 

Connotative:
The data set was used to identify students who are at risk of dropping out early in their academic path, so that strategies to support them can be put in place.

Denotative:
• Dropout rates: Male – 45%, Female – 25%
• Displaced students have a higher graduation rate, while non-displaced students have a higher dropout rate
• Scholarship holders have a much higher chance of graduating (76%) than non-scholarship holders (41%)
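
The percentages above are group-wise outcome rates: within each group, the share of students who ended in each outcome. A minimal sketch of that calculation is shown here, assuming each student is one row with a grouping column and an outcome column. The column names "Scholarship holder" and "Target" and the helper `outcome_rates` are our assumptions for illustration, and the toy rows are made up, so they do not reproduce the figures reported above.

```python
from collections import Counter, defaultdict


def outcome_rates(rows, group_key, outcome_key="Target"):
    """Return {group: {outcome: share}}; shares within each group sum to 1."""
    counts = defaultdict(Counter)
    for row in rows:
        # Tally how many students in each group ended in each outcome.
        counts[row[group_key]][row[outcome_key]] += 1
    return {
        group: {outcome: n / sum(tally.values()) for outcome, n in tally.items()}
        for group, tally in counts.items()
    }


# Toy rows for illustration only; these are NOT taken from the real data set.
toy_rows = [
    {"Scholarship holder": 1, "Target": "Graduate"},
    {"Scholarship holder": 1, "Target": "Graduate"},
    {"Scholarship holder": 1, "Target": "Graduate"},
    {"Scholarship holder": 1, "Target": "Dropout"},
    {"Scholarship holder": 0, "Target": "Graduate"},
    {"Scholarship holder": 0, "Target": "Dropout"},
    {"Scholarship holder": 0, "Target": "Dropout"},
    {"Scholarship holder": 0, "Target": "Enrolled"},
]

rates = outcome_rates(toy_rows, "Scholarship holder")
for group, shares in sorted(rates.items()):
    print(group, {outcome: round(share, 2) for outcome, share in shares.items()})
```

The same function could be pointed at a different grouping column, such as gender or displaced status, to compare dropout rates across those groups.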

Deconstructive:
• The data set records whether tuition fees are up to date and the parents' occupations, but it does not include income, monthly or yearly expenses, or family size.

 

Conclusion Statements/Findings:

• Students with less support or lower income are more likely to drop out
• A significant education gap exists between genders
• Displaced students tend to have more academic success
• Financial factors play an important role in achieving academic success

Term and Year: Winter 2026
Category: Bias & Equality
Short Summary:

Background info and Introduction: 

 

Analyzing student dropout data helps us understand the deeper message behind the data set. It is important to see every side of the story, and identifying the exact percentage of dropouts for each variable was key to our analysis.

We looked at data from:

• Students enrolled between 2008 and 2019
• The databases used in the case study: Academic Management System (AMS), General Directorate of Higher Education (DGES), National Competition for Access to Higher Education (CNAES), and Contemporary Portugal Database (PORDATA)
• 17 different fields of study, including Advertising and Marketing Management, Journalism, and Nursing