Use case on the spambase dataset

Dataset spambase

Machine Learning Task: Binary classification

This is a spam e-mail database. The spam e-mails came from postmasters and individuals who had filed spam. The non-spam e-mails came from filed work and personal e-mails, so the word 'george' and the area code '650' are indicators of non-spam. These indicators are useful when constructing a personalized spam filter. To build a general-purpose spam filter, one would either have to blind such non-spam indicators or collect a much more extensive set of non-spam e-mails.
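Blinding the personalized non-spam indicators amounts to dropping the corresponding feature columns before training. The sketch below is a minimal illustration, not part of the original benchmark: the helper `blind_features` is hypothetical, the column names `word_freq_george` and `word_freq_650` are assumed to follow the `word_freq_*` naming pattern of the feature list, and the two rows are made-up toy values.

```python
# Assumed column names for the personalized non-spam indicators.
BLINDED_COLUMNS = ("word_freq_george", "word_freq_650")


def blind_features(rows, blinded=BLINDED_COLUMNS):
    """Return copies of the row dicts without the blinded feature columns."""
    return [
        {name: value for name, value in row.items() if name not in blinded}
        for row in rows
    ]


# Toy rows for illustration only (not real spambase records).
rows = [
    {"word_freq_make": 0.0, "word_freq_george": 1.2, "word_freq_650": 0.4, "class": 0},
    {"word_freq_make": 0.21, "word_freq_george": 0.0, "word_freq_650": 0.0, "class": 1},
]

blinded_rows = blind_features(rows)
print(sorted(blinded_rows[0]))
```

The input rows are left unchanged, so the personalized and the blinded variants of the data can be compared side by side.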

Available at OpenML: https://openml.org/d/44

Category: Technology

Rows: 4,601

Columns: 57

Target: class

Features

Numeric: word_freq_make, word_freq_address, word_freq_all, word_freq_3d, word_freq_our, word_freq_over, word_freq_remove, word_freq_internet, word_freq_order, word_freq_mail, word_freq_receive, word_freq_will, word_freq_people, word_freq_report, word_freq_addresses, word_freq_free, word_freq_business, word_freq_email, word_freq_you, word_freq_credit, ...


Area Under ROC Curve (AUC)

[Figure: AUC on the spambase dataset]
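For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen spam e-mail receives a higher score than a randomly chosen non-spam e-mail. The sketch below is a plain-Python illustration of that definition, assuming the spam class is coded as 1; the function name `roc_auc` and the toy scores are hypothetical and are not taken from the benchmark.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and scores for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```

The pairwise loop is quadratic in the number of rows, which is fine for a sketch; a rank-based formulation is the usual choice for larger data.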

Accuracy (ACC)

[Figure: accuracy on the spambase dataset]
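Accuracy is the share of e-mails whose predicted class matches the true class. A minimal sketch, with a hypothetical function name and toy labels that are assumptions rather than benchmark data:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels for illustration only: 1 = spam, 0 = non-spam (assumed coding).
y_true = [1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 0]
acc = accuracy(y_true, y_pred)
print(acc)  # 5 of 6 predictions are correct
```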

Balanced Accuracy (BALACC)

[Figure: balanced accuracy on the spambase dataset]
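Balanced accuracy averages the recall obtained on each class, so a model cannot score well simply by favouring the more frequent class. The sketch below assumes that definition (mean of per-class recall); the function name and the toy labels are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    recalls = []
    for label in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(y_pred[i] == label for i in indices)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


# Toy labels for illustration only: 1 = spam, 0 = non-spam (assumed coding).
y_true = [1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 0]
balacc = balanced_accuracy(y_true, y_pred)
print(balacc)  # recall 1.0 on non-spam, 0.5 on spam: 0.75
```

On these toy labels plain accuracy would be 5/6, while balanced accuracy drops to 0.75 because half of the spam e-mails are missed.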

Cross-Entropy Loss (LOGLOSS)

[Figure: cross-entropy loss on the spambase dataset]
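Cross-entropy loss penalizes confident but wrong probability estimates, and lower values are better. The sketch below shows the binary form, assuming the spam class is coded as 1 and the model outputs a predicted probability of spam; the function name, the clipping constant `eps`, and the toy inputs are assumptions made for illustration.

```python
import math


def log_loss(y_true, p_spam, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    if len(y_true) != len(p_spam) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, p_spam):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Toy labels and probabilities for illustration only.
y_true = [1, 0]
p_spam = [0.9, 0.2]
loss = log_loss(y_true, p_spam)
print(round(loss, 4))
```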
