---
license: mit
tags:
- peft
- lora
- transformers
- roberta
- phishing-detection
language:
- en
---
# 🛡️ Phishing URL Detector — RoBERTa + LoRA
This repository stores a LoRA adapter for `roberta-base`, fine‑tuned for phishing URL detection.
⚠️ **Note:** The 🤗 Inference Widget is disabled because this repository contains only a PEFT LoRA adapter, not a full model. Use the code snippet below to run inference locally.

---
## 📦 Model Details
- **Base**: `roberta-base`
- **Adapter Type**: LoRA via PEFT
- **Task**: Phishing vs Legitimate URL classification
- **Model Size**: ~8.4 MB
- **Files Included** (9 total; the main files are listed here):
  - `adapter_model.safetensors`: LoRA weights (~3.6 MB)
  - `adapter_config.json`
  - `tokenizer_config.json`
  - `special_tokens_map.json`
  - `vocab.json`
  - `merges.txt`
---
## 📈 Performance
| Metric | Score |
|------------|-----------|
| Accuracy | 99.81% |
| Precision | 99.99% |
| Recall | 99.68% |
| F1 Score | 99.84% |
| AUC | 99.78% |
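
For readers who want to check how the threshold-based metrics in the table relate to one another, here is a minimal sketch that derives accuracy, precision, recall, and F1 from confusion-matrix counts. It assumes phishing is the positive class, which this card does not state explicitly. The function name `classification_metrics` and the counts are hypothetical and are not the model's evaluation data.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    Assumption: the positive class is "phishing".
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```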
---
## 🧪 Usage in Python
```python
import torch
from peft import PeftConfig, PeftModel
from transformers import RobertaForSequenceClassification, RobertaTokenizerFast

REPO_ID = "Irshadcse2k16/PhishUrlDetection-RoBERTa"

# Load the base model named in the adapter config, then attach the LoRA adapter
config = PeftConfig.from_pretrained(REPO_ID)
base_model = RobertaForSequenceClassification.from_pretrained(
    config.base_model_name_or_path, num_labels=2
)
model = PeftModel.from_pretrained(base_model, REPO_ID)
model.eval()  # make sure dropout is disabled for inference
tokenizer = RobertaTokenizerFast.from_pretrained(REPO_ID)

# Inference
url = "http://secure-login-update.com"
inputs = tokenizer(url, return_tensors="pt", truncation=True, padding=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=1)
label = torch.argmax(probs, dim=1).item()  # 1 = phishing, 0 = legitimate
print("🚨 Phishing" if label == 1 else "✅ Legitimate")
```
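
Taking the argmax over two classes amounts to flagging a URL when its phishing probability exceeds 0.5. If a deployment prefers fewer false alarms or fewer misses, a different cutoff can be applied to the same logits. The following is a minimal pure-Python sketch of that idea. It assumes index 1 is the phishing class, as in the snippet in this section. The helper names and the example logits are hypothetical.

```python
import math


def phishing_probability(logits):
    """Softmax over a pair of logits; returns the probability of index 1 (phishing)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)


def classify(logits, threshold=0.5):
    """Label a URL as phishing when its probability is above `threshold`."""
    return "phishing" if phishing_probability(logits) > threshold else "legitimate"


# Hypothetical logits; with the real model, use `logits[0].tolist()`
example_logits = [0.0, 1.0]
print(round(phishing_probability(example_logits), 3))
print(classify(example_logits))                 # default 0.5 cutoff
print(classify(example_logits, threshold=0.9))  # stricter cutoff
```

A higher threshold generally trades recall for precision, so any cutoff other than 0.5 should be validated on held-out data.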