---
license: mit
tags:
  - peft
  - lora
  - transformers
  - roberta
  - phishing-detection
language:
  - en
---

# 🛡️ Phishing URL Detector — RoBERTa + LoRA

This repository stores a LoRA adapter for `roberta-base`, fine-tuned for phishing URL detection.

⚠️ **Note:** The 🤗 Inference Widget is disabled because this repository contains only PEFT LoRA adapter weights, not a full model. Use the code snippet below to run inference locally.

---

## 📦 Model Details

- **Base**: `roberta-base`
- **Adapter Type**: LoRA via PEFT
- **Task**: Phishing vs Legitimate URL classification
- **Model Size**: ~8.4 MB
- **Files Included** (9 total; 6 listed here):
  - `adapter_model.safetensors` — LoRA weights (~3.6 MB)
  - `adapter_config.json`
  - `tokenizer_config.json`
  - `special_tokens_map.json`
  - `vocab.json`
  - `merges.txt`
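
To give a rough sense of why a LoRA adapter is so much smaller than the base model, the sketch below compares the parameter count of one full weight matrix with that of its low-rank LoRA factors. The hidden size of 768 is that of `roberta-base`; the rank of 8 is a hypothetical value for illustration only, since the actual rank is stored in `adapter_config.json` and is not stated here.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple:
    """Return (full, lora) parameter counts for a single weight matrix.

    A full update trains d_out * d_in values. LoRA instead trains two
    low-rank factors, B (d_out x rank) and A (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_in + d_out)
    return full, lora


# 768 is the roberta-base hidden size; rank=8 is an assumed example value.
full, lora = lora_param_counts(768, 768, rank=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

The adapter file on disk also depends on which layers are adapted and on any extra saved modules, so this is only an order-of-magnitude illustration.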

---

## 📈 Performance

| Metric     | Score     |
|------------|-----------|
| Accuracy   | 99.81%    |
| Precision  | 99.99%    |
| Recall     | 99.68%    |
| F1 Score   | 99.84%    |
| AUC        | 99.78%    |
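
For readers who want to reproduce these kinds of numbers on their own data, the sketch below shows the standard definitions of accuracy, precision, recall, and F1 in terms of confusion-matrix counts, with phishing as the positive class. The function name and the counts are hypothetical and are not the evaluation data behind the table above. AUC is omitted because it needs per-example scores rather than counts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion counts.

    tp: phishing URLs flagged as phishing
    fp: legitimate URLs flagged as phishing
    tn: legitimate URLs passed as legitimate
    fn: phishing URLs passed as legitimate
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```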




---

## 🧪 Usage in Python

```python
from transformers import RobertaTokenizerFast, RobertaForSequenceClassification
from peft import PeftModel, PeftConfig
import torch

# Load LoRA adapter with base model
config = PeftConfig.from_pretrained("Irshadcse2k16/PhishUrlDetection-RoBERTa")
base_model = RobertaForSequenceClassification.from_pretrained(
    config.base_model_name_or_path, num_labels=2
)
model = PeftModel.from_pretrained(
    base_model, "Irshadcse2k16/PhishUrlDetection-RoBERTa"
)
tokenizer = RobertaTokenizerFast.from_pretrained(
    "Irshadcse2k16/PhishUrlDetection-RoBERTa"
)

# Inference
model.eval()  # make sure dropout is disabled
url = "http://secure-login-update.com"
inputs = tokenizer(url, return_tensors="pt", truncation=True, padding=True, max_length=128)
with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=1)  # shape: (1, 2)
    label = torch.argmax(probs, dim=1).item()  # class index for the single input URL
print("🚨 Phishing" if label == 1 else "✅ Legitimate")
```