---
name: ai-ethics
description: Responsible AI development and ethical considerations. Use when evaluating
AI bias, implementing fairness measures, conducting ethical assessments, or ensuring
AI systems align with human values.
author: Joseph OBrien
status: unpublished
updated: '2025-12-23'
version: 1.0.1
tag: skill
type: skill
---
# AI Ethics
Comprehensive AI ethics skill covering bias detection, fairness assessment, responsible AI development, and regulatory compliance.
## When to Use This Skill
- Evaluating AI models for bias
- Implementing fairness measures
- Conducting ethical impact assessments
- Ensuring regulatory compliance (EU AI Act, etc.)
- Designing human-in-the-loop systems
- Creating AI transparency documentation
- Developing AI governance frameworks
## Ethical Principles
### Core AI Ethics Principles
| Principle | Description |
|-----------|-------------|
| **Fairness** | AI should not discriminate against individuals or groups |
| **Transparency** | AI decisions should be explainable |
| **Privacy** | Personal data must be protected |
| **Accountability** | Clear responsibility for AI outcomes |
| **Safety** | AI should not cause harm |
| **Human Agency** | Humans should maintain control |
### Stakeholder Considerations
- **Users**: How does this affect people using the system?
- **Subjects**: How does this affect people the AI makes decisions about?
- **Society**: What are broader societal implications?
- **Environment**: What is the environmental impact?
## Bias Detection & Mitigation
### Types of AI Bias
| Bias Type | Source | Example |
|-----------|--------|---------|
| Historical | Training data reflects past discrimination | Hiring models favoring male candidates |
| Representation | Underrepresented groups in training data | Face recognition failing on darker skin |
| Measurement | Proxy variables for protected attributes | ZIP code correlating with race |
| Aggregation | One model for diverse populations | Single medical model applied to groups with different risk profiles |
| Evaluation | Biased evaluation metrics | Accuracy hiding disparate impact |
### Fairness Metrics
**Group Fairness:**
- Demographic Parity: Equal positive rates across groups
- Equalized Odds: Equal true positive rate (TPR) and false positive rate (FPR) across groups
- Predictive Parity: Equal precision across groups

**Individual Fairness:**
- Similar individuals should receive similar predictions
- Counterfactual fairness: Would outcome change if protected attribute differed?
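The group-fairness metrics above can be computed directly from predictions and a protected attribute. Below is a minimal pure-Python sketch; `y_true`, `y_pred`, and `group` are illustrative toy data, not part of any library.

```python
def positive_rate(y_pred, mask):
    """Fraction of positive predictions within a subgroup."""
    selected = [p for p, m in zip(y_pred, mask) if m]
    return sum(selected) / len(selected)

def true_positive_rate(y_true, y_pred, mask):
    """TPR (recall) within a subgroup."""
    pairs = [(t, p) for t, p, m in zip(y_true, y_pred, mask) if m]
    positives = [p for t, p in pairs if t == 1]
    return sum(positives) / len(positives)

def demographic_parity_diff(y_pred, group):
    """Gap in positive prediction rates between groups 'a' and 'b'."""
    a_mask = [g == "a" for g in group]
    b_mask = [g == "b" for g in group]
    return positive_rate(y_pred, a_mask) - positive_rate(y_pred, b_mask)

# Toy data: binary labels, predictions, and a protected attribute.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0]
group  = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(demographic_parity_diff(y_pred, group))  # 0.5

a_mask = [g == "a" for g in group]
b_mask = [g == "b" for g in group]
# TPR gap: one of the two conditions for equalized odds.
print(true_positive_rate(y_true, y_pred, a_mask)
      - true_positive_rate(y_true, y_pred, b_mask))
```

A demographic parity difference of 0 means equal positive rates; equalized odds additionally requires the TPR and FPR gaps to be 0.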
### Bias Mitigation Strategies
**Pre-processing:**
- Resampling/reweighting training data
- Removing biased features
- Data augmentation for underrepresented groups

**In-processing:**
- Fairness constraints in loss function
- Adversarial debiasing
- Fair representation learning

**Post-processing:**
- Threshold adjustment per group
- Calibration
- Reject option classification
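As a concrete post-processing example, per-group threshold adjustment binarizes model scores with a different cutoff per group. The group names and threshold values below are hypothetical, chosen only to illustrate the mechanics.

```python
def apply_group_thresholds(scores, groups, thresholds):
    """Binarize scores using a separate decision threshold per group."""
    return [1 if s >= thresholds[g] else 0 for s, g in zip(scores, groups)]

scores = [0.62, 0.45, 0.58, 0.71]
groups = ["a", "a", "b", "b"]
# Lower cutoff for group 'b' to move positive rates closer (illustrative).
thresholds = {"a": 0.6, "b": 0.5}

print(apply_group_thresholds(scores, groups, thresholds))  # [1, 0, 1, 1]
```

In practice the thresholds are chosen on a validation set to satisfy a target fairness criterion, and the trade-off against overall accuracy should be documented.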
## Explainability & Transparency
### Explanation Types
| Type | Audience | Purpose |
|------|----------|---------|
| Global | Developers | Understand overall model behavior |
| Local | End users | Explain specific decisions |
| Counterfactual | Affected parties | What would need to change for different outcome |
### Explainability Techniques
- **SHAP**: Feature importance values
- **LIME**: Local interpretable explanations
- **Attention maps**: For neural networks
- **Decision trees**: Inherently interpretable
- **Feature importance**: Global model understanding
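One way to get global feature importance without library support is permutation importance: shuffle one feature's values and measure how much a performance metric drops. The toy linear "model" and data below are illustrative stand-ins.

```python
import random

def model(x):
    # Toy classifier: depends on features 0 and 1, ignores feature 2.
    return 1 if 2.0 * x[0] + 0.5 * x[1] > 1.0 else 0

def accuracy(X, y):
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(X, y, feature, seed=0):
    """Accuracy drop after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(X, y)
    shuffled = [row[feature] for row in X]
    rng.shuffle(shuffled)
    X_perm = [row[:feature] + [v] + row[feature + 1:]
              for row, v in zip(X, shuffled)]
    return baseline - accuracy(X_perm, y)

X = [[1.0, 0.0, 5.0], [0.0, 1.0, 2.0], [0.9, 0.1, 7.0], [0.1, 0.2, 1.0]]
y = [model(x) for x in X]  # labels consistent with the toy model

print(permutation_importance(X, y, 0))  # influential feature: drop >= 0
print(permutation_importance(X, y, 2))  # ignored feature: 0.0
```

A feature the model ignores always yields zero importance; influential features yield a positive drop (averaged over several shuffles in real use).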
### Model Cards
Document for each model:
- Model purpose and intended use
- Training data description
- Performance metrics by subgroup
- Limitations and ethical considerations
- Version and update history
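The checklist above can be enforced in code by making the model card a structured object. The schema below is an illustrative sketch, not a standard API; field names and the rendering format are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    name: str
    version: str
    intended_use: str
    training_data: str
    subgroup_metrics: dict = field(default_factory=dict)
    limitations: list = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the card as a markdown document."""
        lines = [
            f"# Model Card: {self.name} (v{self.version})",
            f"**Intended use:** {self.intended_use}",
            f"**Training data:** {self.training_data}",
            "## Performance by subgroup",
        ]
        lines += [f"- {g}: accuracy {m:.2f}"
                  for g, m in self.subgroup_metrics.items()]
        lines.append("## Limitations")
        lines += [f"- {item}" for item in self.limitations]
        return "\n".join(lines)

card = ModelCard(
    name="loan-approval", version="1.0.1",
    intended_use="Pre-screening, with human review of all denials",
    training_data="Hypothetical 2018-2023 applications, region X only",
    subgroup_metrics={"group_a": 0.91, "group_b": 0.84},
    limitations=["Not validated outside region X"],
)
print(card.to_markdown())
```

Keeping the card as data (rather than free text) lets CI checks fail a release when subgroup metrics or limitations are missing.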
## AI Governance
### AI Risk Assessment
**Risk Categories (EU AI Act):**
| Risk Level | Examples | Requirements |
|------------|----------|--------------|
| Unacceptable | Social scoring, manipulation | Prohibited |
| High | Healthcare, employment, credit | Strict requirements |
| Limited | Chatbots | Transparency obligations |
| Minimal | Spam filters | No mandatory requirements |
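A triage helper can map a use-case description onto the risk tiers above. The keyword lists below are a deliberate simplification for illustration, not the legal text of the EU AI Act.

```python
# Tiers checked from most to least restrictive; keywords are illustrative.
RISK_TIERS = {
    "unacceptable": {"social scoring", "subliminal manipulation"},
    "high": {"healthcare", "employment", "credit scoring"},
    "limited": {"chatbot"},
}

def risk_tier(use_case: str) -> str:
    """Return the first matching tier, defaulting to minimal risk."""
    text = use_case.lower()
    for tier, keywords in RISK_TIERS.items():
        if any(k in text for k in keywords):
            return tier
    return "minimal"

print(risk_tier("Chatbot for customer support"))     # limited
print(risk_tier("Resume screening for employment"))  # high
print(risk_tier("Spam filter"))                      # minimal
```

A real assessment requires legal review; a helper like this is only useful as a first-pass flag that routes a proposal to the governance process.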
### Governance Framework
1. **Policy**: Define ethical principles and boundaries
2. **Process**: Review and approval workflows
3. **People**: Roles and responsibilities (ethics board)
4. **Technology**: Tools for monitoring and enforcement
### Documentation Requirements
- Data provenance and lineage
- Model training documentation
- Testing and validation results
- Deployment and monitoring plans
- Incident response procedures
## Human Oversight
### Human-in-the-Loop Patterns
| Pattern | Use Case | Example |
|---------|----------|---------|
| Human-in-the-Loop | High-stakes decisions | Medical diagnosis confirmation |
| Human-on-the-Loop | Monitoring with intervention | Content moderation escalation |
| Human-out-of-Loop | Low-risk, high-volume | Spam filtering |
### Designing for Human Control
- Clear escalation paths
- Override capabilities
- Confidence thresholds for automation
- Audit trails
- Feedback mechanisms
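Confidence-thresholded routing ties these controls together: confident predictions are auto-applied, the rest are escalated. The threshold value and return format below are illustrative assumptions.

```python
def route(prediction, confidence, threshold=0.9):
    """Auto-apply confident predictions; escalate the rest to a human."""
    if confidence >= threshold:
        return ("auto", prediction)
    return ("human_review", prediction)

decisions = [route("approve", 0.97), route("deny", 0.55)]
print(decisions)  # [('auto', 'approve'), ('human_review', 'deny')]
```

In production each routing decision should also be written to an audit trail, and the threshold tuned so the human review queue stays within reviewer capacity.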
## Privacy Considerations
### Data Minimization
- Collect only necessary data
- Anonymize when possible
- Aggregate rather than individual data
- Delete data when no longer needed
### Privacy-Preserving Techniques
- Differential privacy
- Federated learning
- Secure multi-party computation
- Homomorphic encryption
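Differential privacy is the most self-contained of these techniques to sketch: the Laplace mechanism adds noise scaled to sensitivity/epsilon before releasing an aggregate. The epsilon value and data below are illustrative, and a production system would use a vetted DP library rather than hand-rolled sampling.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) via the inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_count(values, epsilon, rng):
    """Count query (sensitivity 1) released under epsilon-DP."""
    true_count = len(values)
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(42)
# True count is 100; the released value is perturbed.
print(private_count(range(100), epsilon=0.5, rng=rng))
```

Smaller epsilon means stronger privacy but noisier answers; the epsilon budget must be tracked across all queries against the same data.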
## Environmental Impact
### Considerations
- Training compute requirements
- Inference energy consumption
- Hardware lifecycle
- Data center energy sources
### Mitigation
- Efficient architectures
- Model distillation
- Transfer learning
- Green hosting providers
## Reference Files
- **`references/bias_assessment.md`** - Detailed bias evaluation methodology
- **`references/regulatory_compliance.md`** - AI regulation requirements
## Integration with Other Skills
- **machine-learning** - For model development
- **testing** - For bias testing
- **documentation** - For model cards