Adult Census Income Dataset

Adult Income Prediction Dataset

A classic census income dataset from the UCI Machine Learning Repository. It contains 48,842 samples with 14 demographic features plus an income label indicating whether annual income exceeds $50K, and it is a standard dataset for social science data analysis.

48,842 samples | 15 columns (14 features plus the income label) | CC BY 4.0 license | US Census Bureau (1994)
📊
48,842
Total number of samples
🔬
15
Columns (14 features plus the income label)
💰
2
Income categories
📜
CC BY 4.0
Open license

Dataset Highlights

A large-scale social science dataset suitable for classification modeling from beginner to advanced levels

🌍

Real population data

The data comes from the 1994 US Census, containing real demographic information such as age, education, occupation, race, and gender.

💰

Binary income classification

Predict whether an individual's annual income exceeds $50K. The two income classes are unevenly represented, which makes this a classic dataset for learning about imbalanced classification.

📊

Mixed feature types

Includes continuous features (age, hours worked) and categorical features (education, occupation, marital status), making it well suited to practicing data preprocessing end to end.

📈

Large sample size

Nearly 50,000 samples support complex model training and cross-validation experiments.

⚖️

Fairness research

The data includes sensitive features such as gender and race, making it an important dataset for studying algorithmic fairness and bias detection.

🏛️

UCI authoritative source

Originating from the UCI Machine Learning Repository, it is one of the most cited datasets in the field of social science machine learning.
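
The imbalanced classification point above can be checked before any modeling by looking at the label distribution and the accuracy of always predicting the majority class. The following is a minimal sketch on a handful of made-up labels, not the dataset's actual distribution; `class_balance` is a hypothetical helper name.

```python
from collections import Counter


def class_balance(labels):
    """Return each label's share of the total and the majority-class baseline accuracy."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    # Always predicting the most common label achieves this accuracy.
    baseline = max(shares.values())
    return shares, baseline


# Toy labels for illustration only (not real dataset statistics).
labels = ["<=50K", "<=50K", "<=50K", ">50K"]
shares, baseline = class_balance(labels)
print(shares)
print(baseline)
```

Any model trained on the income label should be compared against this baseline, since plain accuracy can look high on imbalanced data even when the minority class is rarely predicted.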

Applicable Scenarios

From classroom teaching to fairness research, the dataset supports a wide range of uses

💰

Income prediction

Build a binary classification model to predict individual annual income levels and compare the performance of different algorithms

⚖️

Fairness analysis

Compare model predictions across groups such as gender and race to detect and study algorithmic bias

🔍

Feature engineering

Handle mixed feature types and practice encoding, scaling, and feature selection techniques

📉

Data visualization

Explore the relationship between demographic features and income, suitable for social science EDA

Social Science | Binary Classification | Fairness | Demographics | Large-scale Data
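
For the fairness analysis scenario above, one common starting point is to compare the rate of positive predictions across groups (often called the demographic parity gap). The sketch below assumes model predictions are already available as a list of labels; the group values and predictions are made up for illustration, and both function names are hypothetical.

```python
from collections import defaultdict


def positive_rate_by_group(groups, predictions, positive=">50K"):
    """Share of samples in each group that the model predicts as positive."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        if pred == positive:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(rates):
    """Difference between the highest and lowest group positive rates."""
    return max(rates.values()) - min(rates.values())


# Made-up groups and predictions, for illustration only.
groups = ["Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female"]
preds = [">50K", ">50K", "<=50K", "<=50K", ">50K", "<=50K", "<=50K", "<=50K"]

rates = positive_rate_by_group(groups, preds)
gap = demographic_parity_gap(rates)
print(rates)
print(gap)
```

A gap of zero means every group receives positive predictions at the same rate. This is only one of several fairness criteria, and which one is appropriate depends on the study.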

Data Preview

Below are the first few rows of the adult income dataset

CSV
age,workclass,fnlwgt,education,education_num,marital_status,occupation,relationship,race,sex,capital_gain,capital_loss,hours_per_week,native_country,income
39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K
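
The preview mixes numeric columns (such as hours_per_week) with categorical ones (such as workclass), so most models need some encoding and scaling first. As a rough sketch using only the three preview rows and the Python standard library, a categorical column can be one-hot encoded and a numeric column standardized as follows; `one_hot` and `standardize` are hypothetical helper names, and a real project would more likely use a preprocessing library.

```python
import csv
import io
from statistics import mean, pstdev

# The three preview rows shown above, inlined so the sketch is self-contained.
PREVIEW = """age,workclass,fnlwgt,education,education_num,marital_status,occupation,relationship,race,sex,capital_gain,capital_loss,hours_per_week,native_country,income
39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K
"""

rows = list(csv.DictReader(io.StringIO(PREVIEW)))


def one_hot(rows, column):
    """Turn one categorical column into 0/1 indicator columns, one per category."""
    categories = sorted({row[column] for row in rows})
    return [
        {f"{column}={cat}": int(row[column] == cat) for cat in categories}
        for row in rows
    ]


def standardize(rows, column):
    """Rescale a numeric column to zero mean and unit (population) standard deviation."""
    values = [float(row[column]) for row in rows]
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


workclass = one_hot(rows, "workclass")
hours = standardize(rows, "hours_per_week")
print(workclass[0])
print(hours)
```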

3 Steps to Get Started Quickly

From browsing to analysis, you can start your data science project in just a few minutes

01

Browse the dataset

View the dataset details on the Ace Data Cloud platform, including field descriptions, sample size, and license agreement metadata.

02

Download the data

Download the CSV file (5.3 MB), which combines the training and test sets into a single file.

03

Load and analyze

Use pandas.read_csv() to load the data and start exploratory analysis and classification modeling.
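
The loading step might look like the sketch below. To stay runnable without the download, it reads the preview rows from an in-memory string; with the real file, `source` would be the path to the downloaded CSV (for example "adult.csv", a hypothetical file name). The `skipinitialspace` and `na_values="?"` options are assumptions based on how the original UCI files are formatted and may not be needed for this CSV.

```python
import io

import pandas as pd

# Preview rows inlined for illustration; replace `source` with the path
# to the downloaded file, e.g. "adult.csv" (hypothetical name).
PREVIEW = """age,workclass,fnlwgt,education,education_num,marital_status,occupation,relationship,race,sex,capital_gain,capital_loss,hours_per_week,native_country,income
39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K
"""
source = io.StringIO(PREVIEW)

# Assumption: missing values, if present, are marked with "?".
df = pd.read_csv(source, skipinitialspace=True, na_values="?")

# Quick exploratory checks: size, column types, label distribution.
print(df.shape)
print(df.dtypes)
print(df["income"].value_counts())

# Separate the features from a 0/1 target for classification modeling.
X = df.drop(columns="income")
y = (df["income"] == ">50K").astype(int)
```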

Start exploring income prediction data

A classic social science dataset with an open license, available for immediate download. Nearly 50,000 real census records make it an ideal choice for classification modeling and fairness research.