Missing values are common enough to affect model quality.
Missing Value Imputation for ML AI Prompt
This prompt compares several imputation strategies for machine learning use, not just data cleaning. It is helpful when missingness may itself be informative and the best imputation approach is not obvious. The masking evaluation hides a share of known values and measures how well each method recovers them, so the choice rests on evidence rather than intuition alone.
Implement missing value imputation for machine learning on this dataset.
1. Profile missing values: count, percentage, and missingness pattern (MCAR, MAR, or MNAR) for each column
2. Implement and compare three imputation strategies:
   a. Simple imputation: median for numeric, mode for categorical
   b. KNN imputation: k=5 nearest neighbors based on complete features
   c. Iterative imputation (MICE): model each feature as a function of others, iterate until convergence
3. Evaluate each strategy by artificially masking 10% of known values and measuring reconstruction error (RMSE)
4. Add missingness indicator columns (is_missing_[col]) for columns with more than 5% missing, since these can be predictive features
5. Always fit imputation on training data only, then apply to validation and test sets
Return: comparison table of imputation strategies, code for the best strategy, and list of missingness indicator columns created.
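The masking evaluation in step 3 can be sketched as follows. This is a minimal illustration, not the prompt's required output: it assumes scikit-learn is available, uses synthetic numeric data, and the helper name masked_rmse is hypothetical. In real use the matrix passed in should be the training split only, and features on different scales should be standardized before their errors are pooled into one RMSE.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer


def masked_rmse(X, imputer, mask_frac=0.10, seed=0):
    """Hide a fraction of the observed cells, impute, and score only the hidden cells."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    observed = np.argwhere(~np.isnan(X))  # (row, col) of every known value
    n_mask = max(1, int(mask_frac * len(observed)))
    chosen = observed[rng.choice(len(observed), size=n_mask, replace=False)]
    rows, cols = chosen[:, 0], chosen[:, 1]

    X_masked = X.copy()
    X_masked[rows, cols] = np.nan  # artificially remove known values
    X_imputed = imputer.fit_transform(X_masked)

    err = X_imputed[rows, cols] - X[rows, cols]
    return float(np.sqrt(np.mean(err ** 2)))


# Synthetic, strongly correlated features (an assumption made for illustration).
rng = np.random.default_rng(42)
base = rng.normal(size=(300, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(300, 1)) for _ in range(4)])
X[rng.random(X.shape) < 0.05] = np.nan  # pre-existing missing values

strategies = {
    "median": SimpleImputer(strategy="median"),
    "knn_k5": KNNImputer(n_neighbors=5),
    "iterative": IterativeImputer(max_iter=10, random_state=0),
}
# Same seed for every strategy, so each one reconstructs the same hidden cells.
results = {name: masked_rmse(X, imp) for name, imp in strategies.items()}
for name, rmse in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} RMSE = {rmse:.3f}")
```

Using the same seed for every strategy keeps the comparison fair, because all three methods are scored on identical masked cells.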
When to use this prompt
You want to compare simple and advanced imputation methods empirically.
Missingness indicators may carry signal.
You need train-only fitting to avoid leakage into validation or test data.
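A minimal sketch of train-only fitting combined with missingness indicators, assuming pandas and scikit-learn. The helper add_missing_indicators and the columns age and income are hypothetical names chosen for illustration; the 5% threshold and the is_missing_ prefix come from the prompt.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split


def add_missing_indicators(train, other, threshold=0.05):
    """Add is_missing_<col> flags for columns whose TRAINING missing rate exceeds threshold."""
    rates = train.isna().mean()  # decided on training data only
    flagged = rates[rates > threshold].index.tolist()
    train, other = train.copy(), other.copy()
    for col in flagged:
        train[f"is_missing_{col}"] = train[col].isna().astype(int)
        other[f"is_missing_{col}"] = other[col].isna().astype(int)
    return train, other, flagged


# Synthetic data: about 20% of age is missing, 1% of income is missing.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.normal(40, 10, 200),
    "income": rng.normal(50_000, 8_000, 200),
})
df.loc[rng.random(200) < 0.20, "age"] = np.nan
df.loc[df.index[:2], "income"] = np.nan

train_df, test_df = train_test_split(df, test_size=0.25, random_state=0)

# Indicators must be created before imputation erases the missingness pattern.
train_df, test_df, flagged = add_missing_indicators(train_df, test_df)

num_cols = ["age", "income"]
imputer = SimpleImputer(strategy="median")
train_df[num_cols] = imputer.fit_transform(train_df[num_cols])  # fit on train only
test_df[num_cols] = imputer.transform(test_df[num_cols])        # reuse train medians

print("Indicator columns:", [f"is_missing_{c}" for c in flagged])
```

The test set never influences the medians or the choice of flagged columns, which is what prevents leakage into validation or test data.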
What the AI should return
A missingness profile, comparison of imputation strategies using reconstruction error, code for the selected method, and a list of any missingness indicator features created.
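The count and percentage part of a missingness profile could look like the sketch below, assuming pandas; missing_profile is a hypothetical helper name and the small frame is made-up example data. Classifying each column as MCAR, MAR, or MNAR is left out on purpose, since that judgment usually needs domain knowledge in addition to the observed data.

```python
import numpy as np
import pandas as pd


def missing_profile(df):
    """Return the missing count and percentage per column, most-missing first."""
    profile = pd.DataFrame({
        "n_missing": df.isna().sum(),
        "pct_missing": (df.isna().mean() * 100).round(2),
    })
    return profile.sort_values("pct_missing", ascending=False)


df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": ["x", "y", None, "z"],
    "c": [1, 2, 3, 4],
})
profile = missing_profile(df)
print(profile)
```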
How to use this prompt
Open your data context
Load your dataset, notebook, or working environment so the AI can operate on the actual project context.
Copy the prompt text
Copy the prompt text above and paste it into the AI assistant or prompt input area.
Review the output critically
Check whether the result matches your data, assumptions, and desired format before moving on.
Chain into the next prompt
Once you have the first result, continue deeper with related prompts in Feature Engineering.
Frequently asked questions
What does the Missing Value Imputation for ML prompt do?
It gives you a structured feature engineering starting point for data science work and helps you move faster without starting from a blank page.
Who is this prompt for?
It is designed for data scientist workflows and marked as beginner level, so it works well as a guided starting point for that level of experience.
What type of prompt is this?
Missing Value Imputation for ML is a single prompt. You can copy it as-is, adapt it, or use it as one step inside a larger workflow.
Can I use this outside MLJAR Studio?
Yes. The prompt text works in other AI tools too, but MLJAR Studio is the best fit when you want local execution, visible Python code, and reusable notebooks.
What should I open next?
Natural next steps from here are Date Feature Extraction, Embedding Features from Text, and Feature Ideas Generator.