license: mit
tags:
  - peft
  - lora
  - transformers
  - roberta
  - phishing-detection
language:
  - en

🛡️ Phishing URL Detector — RoBERTa + LoRA

This repository stores a LoRA adapter for roberta-base, fine-tuned for phishing URL detection.

⚠️ Note: The 🤗 Inference Widget is disabled because this repository contains only a PEFT LoRA adapter, not a standalone model. Use the code snippet below to run inference locally.

📦 Model Details

Base: roberta-base
Adapter Type: LoRA via PEFT
Task: Phishing vs Legitimate URL classification
Model Size: ~8.4 MB
Files Included (9 total; 6 listed here):
- adapter_model.safetensors — LoRA weights (~3.6 MB)
- adapter_config.json
- tokenizer_config.json
- special_tokens_map.json
- vocab.json
- merges.txt
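
As a rough sanity check on the sizes above, the following sketch estimates how many parameters a LoRA adapter for roberta-base would store. The rank (r=8), the choice of query and value projections as target modules, and the assumption that the classification head is saved with the adapter in float32 are all illustrative guesses, not values read from this repository; adapter_config.json holds the real settings.

```python
# Back-of-the-envelope size estimate for a LoRA adapter on roberta-base.
# ASSUMPTIONS (not taken from this repository): rank r=8, LoRA applied to the
# query and value projections, classification head saved with the adapter,
# weights stored as float32.

HIDDEN = 768      # roberta-base hidden size
LAYERS = 12       # roberta-base encoder layers
NUM_LABELS = 2    # phishing vs. legitimate


def lora_params(rank: int, targets_per_layer: int) -> int:
    """Each adapted HIDDEN x HIDDEN matrix gets A (rank x HIDDEN) and B (HIDDEN x rank)."""
    per_matrix = rank * (HIDDEN + HIDDEN)
    return LAYERS * targets_per_layer * per_matrix


def classifier_head_params() -> int:
    """RoBERTa sequence-classification head: dense layer plus output projection."""
    dense = HIDDEN * HIDDEN + HIDDEN
    out_proj = HIDDEN * NUM_LABELS + NUM_LABELS
    return dense + out_proj


def estimate_megabytes(rank: int = 8, targets_per_layer: int = 2) -> float:
    """Total stored parameters times 4 bytes (float32), in megabytes."""
    total = lora_params(rank, targets_per_layer) + classifier_head_params()
    return total * 4 / 1e6


if __name__ == "__main__":
    print(f"LoRA parameters:       {lora_params(8, 2):,}")
    print(f"Classifier parameters: {classifier_head_params():,}")
    print(f"Estimated size:        {estimate_megabytes():.2f} MB")
```

Under these assumptions the estimate comes to roughly 3.5 MB, in the same range as the ~3.6 MB listed for adapter_model.safetensors. A different rank or set of target modules would change the figure.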

📈 Performance

Metric	Score
Accuracy	99.81%
Precision	99.99%
Recall	99.68%
F1 Score	99.84%
AUC	99.78%
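
For readers who want to compute comparable figures on their own data, the sketch below shows how accuracy, precision, recall, and F1 follow from confusion-matrix counts, treating phishing as the positive class. The function names and the counts in the example are hypothetical; they are not the evaluation data behind the table above.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute threshold-based metrics from confusion-matrix counts.

    tp: phishing URLs flagged as phishing
    fp: legitimate URLs flagged as phishing
    tn: legitimate URLs passed as legitimate
    fn: phishing URLs passed as legitimate
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in classification_metrics(tp=90, fp=10, tn=880, fn=20).items():
        print(f"{name}: {value:.4f}")
```

Plugging the reported precision (99.99%) and recall (99.68%) into the F1 formula gives about 99.83%, which agrees with the table's 99.84% up to rounding of the inputs.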

🧪 Usage in Python

from transformers import RobertaTokenizerFast, RobertaForSequenceClassification
from peft import PeftModel, PeftConfig
import torch

# Load LoRA adapter with base model
config = PeftConfig.from_pretrained("Irshadcse2k16/PhishUrlDetection-RoBERTa")
base_model = RobertaForSequenceClassification.from_pretrained(
    config.base_model_name_or_path, num_labels=2
)
model = PeftModel.from_pretrained(
    base_model, "Irshadcse2k16/PhishUrlDetection-RoBERTa"
)
tokenizer = RobertaTokenizerFast.from_pretrained(
    "Irshadcse2k16/PhishUrlDetection-RoBERTa"
)

# Inference
url = "http://secure-login-update.com"
inputs = tokenizer(url, return_tensors="pt", truncation=True, padding=True, max_length=128)
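model.eval()  # disable dropout so predictions are deterministic
with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=1)
    label = torch.argmax(probs, dim=1).item()  # index of the most likely class
print("🚨 Phishing" if label == 1 else "✅ Legitimate")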
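
Taking the argmax over two classes amounts to flagging a URL when its phishing probability exceeds 0.5. If false positives and false negatives carry different costs, a custom threshold may be more suitable. The sketch below uses plain Python so it can be read without PyTorch; `phishing_probability`, `classify`, and the 0.9 threshold are illustrative names and values, and the code assumes index 1 is the phishing class, as in the snippet above.

```python
import math


def phishing_probability(logits):
    """Softmax over [legitimate_logit, phishing_logit]; returns P(phishing)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)


def classify(logits, threshold=0.5):
    """Label a URL as phishing only when P(phishing) exceeds the threshold."""
    return "Phishing" if phishing_probability(logits) > threshold else "Legitimate"


if __name__ == "__main__":
    # Hypothetical logits; with the real model they could be obtained via
    # model(**inputs).logits[0].tolist()
    example = [0.0, 1.0]
    print(classify(example))                 # default 0.5 threshold
    print(classify(example, threshold=0.9))  # stricter threshold
```

Raising the threshold trades recall for precision: fewer legitimate URLs are flagged, at the cost of missing some phishing URLs.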