AI-based Malware Detection
Classify binaries and network traces with ML models, and evaluate their robustness to adversarial evasion.
This lab applies machine learning to malware classification across static and dynamic analysis pipelines. It implements binary feature extraction, trains gradient-boosted and deep models, and runs adversarial evasion tests to assess robustness against obfuscated and polymorphic malware.
Tech Stack
- Python · Scikit-learn · XGBoost · PyTorch
- Static & dynamic feature extraction (PE headers, API calls, opcodes)
- Adversarial evasion tests and feature-importance explainability
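As a rough illustration of the static and dynamic features named above, the sketch below computes Shannon entropy over a byte string, relative opcode frequencies, and API-call bigram counts using only the standard library. The function names (`shannon_entropy`, `opcode_frequencies`, `api_bigrams`) and the sample inputs are hypothetical; a real pipeline would presumably take opcodes from a disassembler and API calls from a sandbox trace.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # p * log2(1/p) summed over the observed byte values
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def opcode_frequencies(opcodes: list[str]) -> dict[str, float]:
    """Relative frequency of each opcode mnemonic in a sequence."""
    if not opcodes:
        return {}
    total = len(opcodes)
    return {op: n / total for op, n in Counter(opcodes).items()}


def api_bigrams(calls: list[str]) -> Counter:
    """Counts of consecutive API-call pairs in a runtime trace."""
    return Counter(zip(calls, calls[1:]))


# Illustrative inputs only; not taken from real samples.
sample_opcodes = ["mov", "push", "mov", "call"]
sample_trace = ["CreateFileW", "WriteFile", "CloseHandle", "CreateFileW", "WriteFile"]

features = {
    "entropy": shannon_entropy(b"aabb"),
    "opcode_freq": opcode_frequencies(sample_opcodes),
    "api_bigrams": api_bigrams(sample_trace),
}
```

High entropy in strings or sections is often treated as a hint of packing or encryption, which is why it tends to appear alongside header and opcode features in static pipelines.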
Example (SHAP feature attributions for an XGBoost model)
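import shap
from xgboost import XGBClassifier

# X_train, y_train, and X_test are assumed to be prepared feature matrices and labels.
model = XGBClassifier().fit(X_train, y_train)

# Compute per-sample, per-feature SHAP attributions for the fitted model.
explainer = shap.Explainer(model)
shap_values = explainer(X_test)

# Summary plot of feature contributions across the test set.
shap.summary_plot(shap_values, X_test)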
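A summary plot orders features by their overall contribution. One common way to get a comparable ranking numerically is the mean absolute attribution per feature. The sketch below does this for a plain list-of-lists matrix; the helper name `mean_abs_attribution`, the feature names, and the values are made up for illustration, and with the `shap` library the matrix would come from `shap_values.values`.

```python
def mean_abs_attribution(values, feature_names):
    """Rank features by mean |attribution| across samples, largest first."""
    n_samples = len(values)
    scores = [
        sum(abs(row[j]) for row in values) / n_samples
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, scores), key=lambda kv: kv[1], reverse=True)


# Hypothetical attribution matrix: rows are samples, columns are features.
names = ["section_entropy", "import_count", "opcode_mov_freq"]
attributions = [
    [0.40, -0.10, 0.05],
    [-0.30, 0.20, 0.00],
]

ranking = mean_abs_attribution(attributions, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value first matters: positive and negative attributions would otherwise cancel and hide features that push predictions strongly in both directions.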
Project Highlights
- Static Analysis: Extract PE header, string entropy, and opcode frequency features.
- Dynamic Analysis: Capture simulated runtime behaviors and API call sequences.
- Adversarial Evaluation: Generate evasion examples using feature mutation and adversarial embedding perturbations.
- Visualization: Explain model decisions with SHAP and summarize results in confusion-matrix dashboards.
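The adversarial evaluation and confusion-matrix items above can be sketched end to end with a toy model. The example below uses a hand-written linear scorer as a stand-in for a trained classifier, greedily mutates the features an attacker is assumed to control, and compares confusion counts before and after. Everything here (the weights, the `MUTABLE` index set, the step size, the step budget, and the samples) is a hypothetical placeholder, not the lab's actual procedure, and it covers only feature mutation, not embedding perturbations.

```python
# Toy stand-in for a trained detector: score > 0 means "malware" (label 1).
WEIGHTS = [1.5, -2.0, 1.0]  # hypothetical weights for three features
BIAS = -0.25
MUTABLE = [1]  # feature indices the attacker is assumed able to change


def predict(x):
    score = sum(w * v for w, v in zip(WEIGHTS, x)) + BIAS
    return 1 if score > 0 else 0


def mutate(x, step=0.25, max_steps=8):
    """Greedy feature mutation: nudge mutable features in the direction
    that lowers the score until the sample evades or the budget runs out."""
    x = list(x)
    for _ in range(max_steps):
        if predict(x) == 0:
            break
        for j in MUTABLE:
            # Moving against the sign of the weight lowers the score.
            x[j] += -step if WEIGHTS[j] > 0 else step
    return x


def confusion(y_true, y_pred):
    """Return confusion counts as (tn, fp, fn, tp)."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Illustrative samples only: three malware (1) and one benign (0).
X = [[1.0, 0.25, 0.5], [1.0, 0.0, 0.5], [0.25, 0.5, 0.25], [3.0, 0.0, 2.0]]
y = [1, 1, 0, 1]

clean_pred = [predict(x) for x in X]
# Only malware samples are mutated; benign samples are left unchanged.
X_adv = [mutate(x) if label == 1 else x for x, label in zip(X, y)]
adv_pred = [predict(x) for x in X_adv]

clean_cm = confusion(y, clean_pred)
adv_cm = confusion(y, adv_pred)

# Evasion rate: share of initially detected malware that is missed after mutation.
detected = [i for i, (t, p) in enumerate(zip(y, clean_pred)) if t == 1 and p == 1]
evasion_rate = sum(adv_pred[i] == 0 for i in detected) / len(detected)
```

Comparing `clean_cm` with `adv_cm` shows how many true positives turn into false negatives under the assumed mutation budget, which is the kind of shift a confusion-matrix dashboard would surface.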
