Read data
Load sample dataset
Read an example dataset into a pandas DataFrame. Datasets are loaded from the GitHub repository datasets-for-start, so you need an internet connection to load them.
Binary classification datasets:
- Adult dataset - predict whether an individual's income exceeds $50K/year based on census data.
- Breast Cancer dataset - predict the presence of breast cancer based on various medical attributes.
- Credit Scoring dataset - predict the likelihood of a customer defaulting on a loan.
- Employee Attrition dataset - predict whether an employee will leave the company based on various factors.
- Pima Indians Diabetes dataset - predict the onset of diabetes based on diagnostic measurements.
- SPECT dataset - predict heart disease based on SPECT imaging data.
- Titanic dataset - predict the survival of passengers based on various features such as age, gender, and class.
Multiclass classification datasets:
- Iris dataset - classify iris flowers into three different species based on their physical attributes.
- Wine dataset - classify wines into different categories based on their chemical properties.
Regression datasets:
- Housing dataset - predict housing prices based on various features of the houses.
- House prices dataset - predict the final price of homes based on various features and attributes.
sample, example, dataset, pandas
Required packages
You need the packages below to use the code generated by this recipe. All packages are automatically installed in MLJAR Studio.
pandas>=1.0.0
Interactive recipe
You can use the interactive recipe below to generate code. This recipe is available in MLJAR Studio.
Python code
# Python code will be here
Code explanation
- We use the pandas package and its read_csv() function, which reads CSV files from a URL. All datasets are available in the GitHub repository datasets-for-start.
- After the DataFrame is loaded, we display the shape of the data: the number of rows and the number of columns.
- We display the first rows of the DataFrame.
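The steps above can be sketched roughly as follows. This is a minimal illustration and not the code generated by the recipe: the helper name load_dataset, the inline sample data, and its column names are made up for the example. An in-memory CSV stands in for the dataset URL so that the snippet runs without an internet connection.

```python
import io

import pandas as pd


def load_dataset(source):
    """Read a CSV from a URL, file path, or file-like object into a DataFrame.

    Hypothetical helper for illustration; it is not part of the recipe.
    """
    df = pd.read_csv(source)
    n_rows, n_cols = df.shape
    # Display the shape of the data: number of rows and number of columns.
    print(f"Shape: {df.shape} ({n_rows} rows, {n_cols} columns)")
    return df


# Made-up sample data standing in for a CSV file hosted in the repository.
SAMPLE_CSV = """sepal_length,sepal_width,species
5.1,3.5,setosa
7.0,3.2,versicolor
6.3,3.3,virginica
"""

df = load_dataset(io.StringIO(SAMPLE_CSV))

# Display the first rows of the DataFrame.
print(df.head())
```

To load a real dataset, pass the raw CSV URL of a file from the datasets-for-start repository instead of the io.StringIO object; read_csv() accepts a URL string directly.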
Example Python notebooks
Find inspiration in the example notebooks.
Read data cookbook
Code recipes from the Read data cookbook.