Use case on the colleges dataset

Dataset: colleges

Machine Learning Task: Regression

This is the Colleges database. It gathers information on about 7800 different US colleges, including geographical information, statistics about the student population, and post-graduation career earnings.

Available at OpenML:

Category: People

Rows: 7,063

Columns: 47

Target: percent_pell_grant


Numeric: UNITID, latitude, longitude, admission_rate, sat_verbal_midrange, sat_math_midrange, sat_writing_midrange, act_combined_midrange, act_english_midrange, act_math_midrange, act_writing_midrange, sat_total_average, undergrad_size, percent_white, percent_black, percent_hispanic, percent_asian, percent_part_time, average_cost_academic_year, average_cost_program_year, ...

Nominal: city, state, zip, predominant_degree, highest_degree, ownership, region, gender, carnegie_basic_classification, carnegie_undergraduate, carnegie_size, religious_affiliation

String: school_name, school_webpage


Root Mean Square Error (RMSE)

[Figure: RMSE on the colleges dataset]

Mean Absolute Error (MAE)

[Figure: MAE on the colleges dataset]

Coefficient of Determination (R2)

[Figure: R2 on the colleges dataset]
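
For readers unfamiliar with the three metrics reported here, the following is a minimal sketch of how RMSE, MAE and R2 are commonly computed for a regression target such as percent_pell_grant. The function names and the sample values are illustrative assumptions, not taken from the benchmark itself, and the page does not state which implementation produced the reported scores.

```python
import math

def rmse(y_true, y_pred):
    # Root Mean Square Error: square root of the mean squared residual.
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    # Mean Absolute Error: mean of the absolute residuals.
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up target values and predictions (NOT from the colleges dataset).
y_true = [0.25, 0.50, 0.75, 1.00]
y_pred = [0.30, 0.45, 0.80, 0.90]

print("RMSE:", rmse(y_true, y_pred))
print("MAE: ", mae(y_true, y_pred))
print("R2:  ", r2(y_true, y_pred))
```

Lower is better for RMSE and MAE, which are expressed in the units of the target. RMSE penalizes large errors more heavily than MAE. R2 is unitless, and a value of 1 indicates a perfect fit.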
