This chain prompt designs an ML audit trail spanning prediction logging, model lineage, deployment records, data lineage, access logs, and automated report generation. It is useful in regulated or high-accountability settings where every production prediction must be explainable and traceable.
Step 1: Define audit requirements – identify the regulatory and business requirements driving the need for an ML audit trail. What questions must the audit trail be able to answer? (e.g. 'Which model version made this prediction on this date?', 'What data was this model trained on?', 'Who approved this model for production?')
Step 2: Prediction-level traceability – ensure every production prediction is logged with: request_id, model_version, model_artifact_hash, feature_values, prediction, timestamp, serving_node. Verify the prediction log is immutable and tamper-proof.
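One way to make the prediction log tamper-evident is to chain entries with hashes, so that editing any stored record invalidates every later entry. A minimal in-memory sketch (class and field names are illustrative, not a prescribed schema):

```python
import hashlib
import json
import time
import uuid


class PredictionLog:
    """Append-only prediction log. Each entry embeds the hash of the
    previous entry, so any in-place modification breaks verification."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []
        self._prev_hash = self.GENESIS

    def log(self, model_version, model_artifact_hash, feature_values,
            prediction, serving_node):
        entry = {
            "request_id": str(uuid.uuid4()),
            "model_version": model_version,
            "model_artifact_hash": model_artifact_hash,
            "feature_values": feature_values,
            "prediction": prediction,
            "timestamp": time.time(),
            "serving_node": serving_node,
            "prev_hash": self._prev_hash,
        }
        # Hash over canonical JSON of the entry, prev_hash included.
        entry["entry_hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()).hexdigest()
        self._entries.append(entry)
        self._prev_hash = entry["entry_hash"]
        return entry["request_id"]

    def verify(self):
        """Recompute the hash chain; False if any entry was altered."""
        prev = self.GENESIS
        for e in self._entries:
            body = {k: v for k, v in e.items() if k != "entry_hash"}
            if e["prev_hash"] != prev:
                return False
            recomputed = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if recomputed != e["entry_hash"]:
                return False
            prev = e["entry_hash"]
        return True
```

In production the chain head would be periodically anchored in external write-once storage; the in-process list here only illustrates the chaining idea.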
Step 3: Model lineage – for every model version in the registry, record: training dataset version and hash, git commit of training code, hyperparameters, evaluation metrics, training job ID, and who triggered the training run.
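The lineage record for a model version can be a frozen structure that the registry refuses to overwrite. A sketch, assuming a simple in-memory registry (all names are hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelLineageRecord:
    """Immutable lineage metadata attached to one model version."""
    model_version: str
    training_dataset_version: str
    training_dataset_hash: str
    training_code_commit: str      # git commit of the training code
    hyperparameters: dict
    evaluation_metrics: dict
    training_job_id: str
    triggered_by: str              # user or service that started training


class ModelRegistry:
    def __init__(self):
        self._records = {}

    def register(self, record: ModelLineageRecord):
        # Lineage is write-once: re-registering a version is an error.
        if record.model_version in self._records:
            raise ValueError(
                f"{record.model_version} already registered; "
                "lineage records are immutable")
        self._records[record.model_version] = record

    def lineage(self, model_version: str) -> ModelLineageRecord:
        return self._records[model_version]
```

A real registry (MLflow, SageMaker Model Registry, etc.) would back this with durable storage, but the write-once contract is the important property.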
Step 4: Deployment audit log – record every stage transition in the model registry: from stage, to stage, performed by, timestamp, reason, and approval reference. This log must be immutable.
Step 5: Data lineage – trace the training data back to its source systems. Document: which source tables were used, which date ranges, what transformations were applied, and whether any data was excluded and why.
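The data lineage for a dataset version can be captured as a small immutable document listing sources, date ranges, transformations, and exclusions with reasons. A sketch with illustrative field names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceTable:
    system: str        # e.g. the source database or warehouse
    table: str
    date_start: str    # ISO dates delimiting the extracted range
    date_end: str


@dataclass(frozen=True)
class DataLineage:
    dataset_version: str
    sources: tuple         # tuple of SourceTable
    transformations: tuple # ordered transformation descriptions
    exclusions: tuple      # (what was excluded, why) pairs


def lineage_summary(lineage: DataLineage) -> str:
    """Render a human-readable lineage section for an audit report."""
    lines = [f"dataset {lineage.dataset_version}"]
    for s in lineage.sources:
        lines.append(
            f"  source: {s.system}.{s.table} "
            f"[{s.date_start}..{s.date_end}]")
    for t in lineage.transformations:
        lines.append(f"  transform: {t}")
    for what, why in lineage.exclusions:
        lines.append(f"  excluded: {what} ({why})")
    return "\n".join(lines)
```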
Step 6: Access audit – log every access to the model registry, prediction logs, and training data: who accessed what, when, and from where. Alert on unusual access patterns.
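"Unusual access patterns" can start as a simple rate rule, e.g. flagging any user whose accesses in one hour exceed a threshold; real deployments would layer on richer detection. A deliberately simple sketch (threshold and bucketing are assumptions):

```python
from collections import defaultdict


class AccessAuditLog:
    """Logs accesses to audit-relevant resources and flags users whose
    per-hour access count exceeds a configurable threshold."""

    def __init__(self, hourly_threshold=50):
        self._entries = []
        self._hourly_threshold = hourly_threshold

    def log_access(self, user, resource, action, hour_bucket, source_ip):
        # hour_bucket: e.g. "2024-05-01T14" - the hour the access fell in.
        self._entries.append({
            "user": user,
            "resource": resource,
            "action": action,
            "hour_bucket": hour_bucket,
            "source_ip": source_ip,
        })

    def unusual_patterns(self):
        """Return (user, hour_bucket) pairs exceeding the threshold."""
        counts = defaultdict(int)
        for e in self._entries:
            counts[(e["user"], e["hour_bucket"])] += 1
        return [key for key, n in counts.items()
                if n > self._hourly_threshold]
```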
Step 7: Audit report generation – implement an automated audit report generator that, given a request_id, produces a complete audit trail: source data → training data → model training → model approval → deployment → prediction. This report should be producible within 1 hour for regulatory or legal inquiries.
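The generator essentially walks the trail backwards from one prediction: request_id resolves the prediction record, the prediction names a model version, the model version names its training dataset, and so on. A sketch over plain-dict stores standing in for the systems above (all store shapes and key names are illustrative):

```python
def build_audit_report(request_id, prediction_store, model_registry,
                       deployment_log, data_lineage_store):
    """Assemble the end-to-end audit trail for one prediction.

    prediction_store:   request_id -> prediction record
    model_registry:     model_version -> lineage record
    deployment_log:     list of stage-transition entries
    data_lineage_store: dataset_version -> data lineage document
    """
    pred = prediction_store[request_id]
    model = model_registry[pred["model_version"]]
    deployments = [d for d in deployment_log
                   if d["model_version"] == pred["model_version"]]
    lineage = data_lineage_store[model["training_dataset_version"]]
    return {
        "request_id": request_id,
        "prediction": pred,
        "model_lineage": model,
        "deployment_history": deployments,
        "data_lineage": lineage,
    }
```

Because every link is a key lookup or filter, the one-hour target is about data availability and retention, not computation; the report is cheap once the underlying logs are queryable.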