Machine Learning Task: Binary classification
This is a SPAM E-mail Database. The spam e-mails in this collection came from postmasters and individuals who had filed spam. The non-spam e-mails came from filed work and personal e-mails, and hence the word 'george' and the area code '650' are indicators of non-spam. These indicators are useful when constructing a personalized spam filter. To generate a general-purpose spam filter, one would either have to blind such non-spam indicators or obtain a very wide collection of non-spam.
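As a rough illustration of what "blinding" the non-spam indicators could look like, the sketch below drops the corresponding features from each record before any model is trained. The column names `word_freq_george` and `word_freq_650`, the helper `blind_features`, and the toy rows are assumptions made for this example; check them against the actual attribute names in the dataset before use.

```python
# Assumed column names for the personalized non-spam indicators;
# verify these against the dataset's real attribute list.
BLINDED_COLUMNS = ("word_freq_george", "word_freq_650")


def blind_features(rows, blinded=BLINDED_COLUMNS):
    """Return copies of the rows with the blinded columns removed."""
    blinded = set(blinded)
    return [{k: v for k, v in row.items() if k not in blinded} for row in rows]


# Toy rows with made-up values, only to show the shape of the data.
toy_rows = [
    {"word_freq_george": 1.2, "word_freq_650": 0.4, "word_freq_free": 0.0, "class": 0},
    {"word_freq_george": 0.0, "word_freq_650": 0.0, "word_freq_free": 2.1, "class": 1},
]

blinded_rows = blind_features(toy_rows)
print(sorted(blinded_rows[0]))
```

The original rows are left unchanged, so a personalized filter (with the indicators) and a more general one (without them) can be compared on the same data.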
Available at OpenML: https://openml.org/d/44