Model Extraction & API Abuse
Reconstruct ML models via API queries and test countermeasures like rate-limiting, noise, and watermarking.
This project simulates model extraction attacks on ML-as-a-Service APIs, in which an adversary queries a deployed model and trains a surrogate that approximates its decision boundary. It measures how faithfully the extracted surrogate reproduces the original model's predictions and evaluates defensive strategies such as rate-limiting, randomized outputs, and watermark verification.
Tech Stack
- Python · FastAPI · PyTorch · Scikit-learn
- Black-box query-based extraction algorithms
- Logging, anomaly-based detection, and noise-injection defenses (see the sketch below)
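To illustrate the defensive side, the snippet below sketches a hypothetical FastAPI prediction endpoint that combines two of the listed countermeasures: a per-client query budget (rate-limiting) and Gaussian noise added to returned confidences. The endpoint schema, the placeholder model, and the defense parameters are illustrative assumptions, not the project's actual API.

import time
from collections import defaultdict

import numpy as np
from fastapi import Body, FastAPI, HTTPException, Request

app = FastAPI()

# Hypothetical defense parameters (illustrative, not the project's real settings)
MAX_QUERIES_PER_MINUTE = 60
NOISE_STD = 0.05

query_log = defaultdict(list)  # client IP -> timestamps of recent queries

def predict_proba(features):
    # Placeholder for the real PyTorch/scikit-learn model: returns class probabilities.
    rng = np.random.default_rng(abs(hash(tuple(features))) % (2**32))
    p = rng.random(2)
    return p / p.sum()

@app.post("/predict")
async def predict(request: Request, features: list[float] = Body(...)):
    client = request.client.host
    now = time.time()

    # Rate-limiting defense: reject clients that exceed the per-minute budget.
    query_log[client] = [t for t in query_log[client] if now - t < 60]
    if len(query_log[client]) >= MAX_QUERIES_PER_MINUTE:
        raise HTTPException(status_code=429, detail="Query budget exceeded")
    query_log[client].append(now)

    # Noise-injection defense: perturb the confidences so the decision boundary
    # is harder to reconstruct from returned scores.
    probs = predict_proba(features)
    noisy = np.clip(probs + np.random.normal(0, NOISE_STD, probs.shape), 0, 1)
    noisy /= noisy.sum()

    return {"label": int(np.argmax(noisy)), "confidence": float(np.max(noisy))}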
Example (Basic Extraction Loop)
import requests
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Generate random inputs and label each one by querying the black-box API
X_train = np.random.rand(1000, 20)
y_labels = [
    requests.post("https://target.api/predict", json=x.tolist()).json()["label"]
    for x in X_train
]

# Train a local surrogate on the API's predicted labels
surrogate = DecisionTreeClassifier().fit(X_train, y_labels)
print("Extracted surrogate model trained successfully!")
Project Highlights
- Attack Simulation: Perform black-box extraction with adaptive query selection (see the sketch after this list).
- Defense Testing: Implement rate-limiting, response noise, and watermark-based model ownership checks.
- Evaluation: Compare accuracy, confidence divergence, and fidelity metrics between original and surrogate models.
- Visualization: Decision boundary plots and similarity heatmaps via Matplotlib and Streamlit.
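The adaptive variant spends its query budget where the surrogate is least certain, which tends to concentrate queries near the target's decision boundary. A rough sketch of one such loop; the query_target helper, endpoint, and budgets are illustrative assumptions:

import requests
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def query_target(x):
    # Hypothetical endpoint, same schema as the basic loop above
    return requests.post("https://target.api/predict", json=x.tolist()).json()["label"]

# Seed the surrogate with a small random query budget
X = np.random.rand(100, 20)
y = [query_target(x) for x in X]
surrogate = DecisionTreeClassifier().fit(X, y)

for round_idx in range(5):
    # Propose candidates and keep those the surrogate is least confident about,
    # i.e. points closest to its current decision boundary.
    candidates = np.random.rand(1000, 20)
    confidence = surrogate.predict_proba(candidates).max(axis=1)
    chosen = candidates[np.argsort(confidence)[:50]]

    # Spend the query budget on the most informative points and retrain
    X = np.vstack([X, chosen])
    y += [query_target(x) for x in chosen]
    surrogate = DecisionTreeClassifier().fit(X, y)
    print(f"Round {round_idx}: {len(X)} queries used")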
