ML Model Poisoning Simulation
Simulate data poisoning and evaluate its impact on accuracy and targeted misclassifications.
This project explores data poisoning attacks on supervised machine-learning and deep-learning pipelines. It demonstrates how injecting crafted samples into the training data can bias model behavior or embed targeted backdoors. Experiments measure accuracy degradation, misclassification rates, and the resilience of different defense mechanisms.
Tech Stack
- Python · PyTorch · Scikit-learn · Pandas
- Poisoning scenarios: untargeted label flips & targeted backdoor triggers
- Defense experiments: anomaly detection & data sanitization
Example (Label-flip attack)
import numpy as np

def label_flip(y, flip_ratio=0.1):
    """Flip a random fraction of binary labels (0 <-> 1) to poison the training set."""
    y_poisoned = y.copy()
    n_flip = int(len(y) * flip_ratio)                       # number of labels to corrupt
    idx = np.random.choice(len(y), n_flip, replace=False)   # pick victim indices without replacement
    y_poisoned[idx] = 1 - y_poisoned[idx]                   # flip 0 <-> 1
    return y_poisoned
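
A minimal sketch of how the flipped labels can be evaluated downstream, assuming a synthetic binary dataset and a LogisticRegression stand-in; the project's actual datasets and PyTorch pipeline are not shown here.

# Evaluation sketch (illustrative): train on clean vs. poisoned labels and compare metrics.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for name, labels in [("clean", y_train), ("poisoned", label_flip(y_train, flip_ratio=0.2))]:
    model = LogisticRegression(max_iter=1000).fit(X_train, labels)
    preds = model.predict(X_test)
    print(f"{name}: accuracy={accuracy_score(y_test, preds):.3f}, F1={f1_score(y_test, preds):.3f}")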
Project Highlights
- Attack Scenarios: Label-flip, backdoor trigger injection, gradient-based poisoning (a backdoor trigger sketch follows this list).
- Impact Evaluation: Track accuracy drop and F1-score variance across multiple runs.
- Defense Techniques: Data sanitization filters, robust aggregation, and anomaly-based detection (see the sanitization sketch after this list).
- Visualization: Interactive charts for poisoned vs. clean decision boundaries.
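
The backdoor scenario referenced above can be sketched as stamping a fixed trigger patch into a small fraction of training images and relabeling them to an attacker-chosen class, so the trained model misclassifies any input carrying the trigger. This is a minimal sketch assuming MNIST-style tensors of shape (N, 1, 28, 28); the function name add_trigger, the patch placement, and the default parameters are illustrative, not the project's exact implementation.

# Backdoor trigger sketch (illustrative, assumed names and defaults): stamp a small
# white patch into a fraction of training images and relabel them to a target class.
import torch

def add_trigger(images, labels, poison_ratio=0.05, target_class=7, patch_size=3):
    images, labels = images.clone(), labels.clone()
    n_poison = int(len(images) * poison_ratio)
    idx = torch.randperm(len(images))[:n_poison]            # choose samples to poison
    images[idx, :, -patch_size:, -patch_size:] = 1.0         # white square in the bottom-right corner
    labels[idx] = target_class                               # attacker-chosen label for triggered inputs
    return images, labels, idx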

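As a rough illustration of the sanitization idea above, the sketch below drops training points that lie far from their class centroid in feature space; the centroid-distance criterion and the z-score cutoff are simplified assumptions, not the project's actual filters.

# Data-sanitization sketch (illustrative): remove per-class outliers by centroid distance.
import numpy as np

def sanitize(X, y, z_thresh=3.0):
    keep = np.ones(len(y), dtype=bool)
    for cls in np.unique(y):
        mask = (y == cls)
        dists = np.linalg.norm(X[mask] - X[mask].mean(axis=0), axis=1)   # distance to class centroid
        z = (dists - dists.mean()) / (dists.std() + 1e-12)               # standardize within the class
        keep[np.where(mask)[0][z > z_thresh]] = False                    # flag anomalously distant points
    return X[keep], y[keep]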