Exploratory Data Analysis
HR Employee Attrition Analysis in Python
Explore the IBM HR Analytics dataset to uncover attrition patterns by department, age, salary, and job satisfaction.
What
This AI Data Analyst workflow loads the IBM HR Analytics attrition CSV from a URL, summarizes the dataset shape, and calculates the overall attrition rate. It generates visual comparisons of attrition rates by department and job role, and contrasts monthly income distributions for employees who left versus stayed. It also examines relationships between job satisfaction, work-life balance, and attrition using correlation analysis and a heatmap.
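The loading and rate-calculation steps described above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: it uses a tiny made-up DataFrame in place of the real CSV, and assumes the dataset's `Attrition` column holds "Yes"/"No" values and that a derived 0/1 column named `AttritionFlag` is used for the rates.

```python
import pandas as pd

# Tiny synthetic stand-in for the attrition CSV (values are made up for
# illustration). In the real workflow this frame would come from
# pd.read_csv(...) pointed at the dataset URL.
df = pd.DataFrame({
    "Attrition": ["Yes", "No", "No", "Yes", "No", "No", "No", "Yes"],
    "Department": [
        "Sales", "Sales", "Research & Development", "Sales",
        "Research & Development", "Research & Development",
        "Human Resources", "Human Resources",
    ],
})

# Binary flag: 1 for employees who left, 0 for those who stayed.
df["AttritionFlag"] = (df["Attrition"] == "Yes").astype(int)

# Report the shape explicitly, then the overall attrition rate.
print("shape:", df.shape)
overall_rate = df["AttritionFlag"].mean()
print(f"overall attrition rate: {overall_rate:.1%}")

# The mean of a 0/1 flag within each group is that group's attrition rate.
rate_by_dept = (
    df.groupby("Department")["AttritionFlag"]
    .mean()
    .sort_values(ascending=False)
)
print(rate_by_dept)
```

The same `groupby(...).mean()` pattern should apply to job role by swapping the grouping column, and the sorted Series can be passed to a bar plot.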
Who
This is for HR analysts and people analytics practitioners who need a reproducible way to explore attrition patterns in a standard benchmark dataset. It is also useful for data analysts learning exploratory analysis workflows that combine grouped summaries, distribution plots, and correlation checks.
Tools
- pandas
- numpy
- matplotlib
- seaborn
Outcomes
- Loaded dataset with shape (1470, 35) and computed overall attrition rate (16.1%)
- Bar chart of attrition rate by department and job role
- Box plot comparing monthly income for leavers vs stayers
- Correlation heatmap linking job satisfaction and work-life balance with attrition
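As a rough sketch of the numbers behind the last two outcomes, the snippet below compares monthly income for leavers and stayers and computes the correlation matrix that a heatmap would display. The data is synthetic and the column names (`MonthlyIncome`, `JobSatisfaction`, `WorkLifeBalance`, and the derived `AttritionFlag`) are assumed to match the dataset. The printed values say nothing about the real data.

```python
import pandas as pd

# Synthetic rows for illustration only; not drawn from the real dataset.
df = pd.DataFrame({
    "Attrition": ["Yes", "No", "No", "Yes", "No", "No"],
    "MonthlyIncome": [2500, 6000, 5200, 3100, 7400, 4800],
    "JobSatisfaction": [1, 4, 3, 2, 4, 3],
    "WorkLifeBalance": [2, 3, 3, 1, 4, 3],
})
df["AttritionFlag"] = (df["Attrition"] == "Yes").astype(int)

# Numeric summary that backs a box plot of income by attrition status.
income_summary = df.groupby("Attrition")["MonthlyIncome"].agg(
    ["count", "median", "mean"]
)
print(income_summary)

# Pearson correlations between the attrition flag and the two rating columns.
corr = df[["AttritionFlag", "JobSatisfaction", "WorkLifeBalance"]].corr()
print(corr.round(2))
```

A matrix like `corr` can be handed to `seaborn.heatmap(corr, annot=True)` for the visual version. Printing the summary table alongside the plot makes it easier to state the direction of a difference instead of only describing the chart.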
Quality Score
7/10
Last scored: Apr 7, 2026
Task Completion: 2/2
Excellent: All prompted steps were attempted: dataset loaded and attrition rate computed, attrition by department/job role visualized, income distribution compared for leavers vs stayers, and satisfaction/work-life balance vs attrition visualized.
Execution Correctness: 2/2
Excellent: The provided Python code is syntactically correct and uses appropriate pandas/seaborn operations (read_csv, groupby/mean/pivot, violinplot, heatmap) that are likely to run as-is on the given dataset.
Output Quality: 1/3
Needs work: The overall attrition rate matches the expected value (~16.1%) and the dataset preview indicates 35 columns, but the workflow does not report the full shape (1470, 35) explicitly. It also does not state the expected directional findings (e.g., Sales highest vs Research lowest; leavers earn less), offering only generic plot descriptions.
Reasoning Quality: 1/2
Needs work: The reasoning is cautious about not over-interpreting plots, but it fails to extract and state the key insights requested in the expected outcomes. It mostly describes what the charts are rather than what they show.
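One possible way to move from describing a chart to stating what it shows is to derive the sentence directly from the grouped rates. The sketch below uses a hypothetical helper, `describe_extremes`, with placeholder group names and made-up rates. It is not part of the scored workflow.

```python
import pandas as pd


def describe_extremes(rates: pd.Series, label: str) -> str:
    """Return one sentence naming the groups with the highest and lowest rate."""
    top, bottom = rates.idxmax(), rates.idxmin()
    return (
        f"{label} with the highest attrition rate: {top} ({rates[top]:.1%}); "
        f"lowest: {bottom} ({rates[bottom]:.1%})."
    )


# Placeholder rates for illustration; real values would come from a
# groupby on the attrition flag.
rates = pd.Series({"Group A": 0.25, "Group B": 0.10, "Group C": 0.18})
sentence = describe_extremes(rates, "Department")
print(sentence)
```

Because the sentence is computed from the same Series that feeds the bar chart, the stated finding cannot drift from the plotted values.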
Reliability: 1/1
Excellent: The approach is consistent and avoids fabricating numeric conclusions; transformations (AttritionFlag) are sensible and the visualizations are robust for the questions asked.